A small, dependency-free static site generator written from scratch in Python.
It reads Markdown files from content/, converts them into HTML using a hand-written Markdown parser, wraps them in an HTML template, and writes the finished site into docs/ (ready to be served by GitHub Pages or any static host).
This project was built as part of the Boot.dev "Build a Static Site Generator in Python" course. The live demo site (the "Tolkien Fan Club") is deployed to GitHub Pages.
- What is a "static site generator"?
- Features
- Project structure
- Quick start
- How it works (high-level data flow)
- Architecture in depth
- Supported Markdown syntax
- Adding your own content
- Building and deploying to GitHub Pages
- Testing
- Glossary for beginners
- Credits
A static site generator (SSG) is a program that takes:
- some easy-to-write source files (here, Markdown), and
- a layout/template (here, a single template.html),
and produces a folder full of plain HTML/CSS/image files.
Those files are "static" because no server-side code runs when a visitor loads the page — the browser just downloads the pre-built HTML. That makes the site fast, cheap to host, and easy to deploy on services like GitHub Pages, Netlify, or Cloudflare Pages.
Popular SSGs include Hugo, Jekyll, and Eleventy. This project builds a tiny one from scratch so you can see exactly how each step works.
- Pure Python 3 — no third-party libraries required (uses only the standard library).
- Hand-written Markdown parser that supports headings, paragraphs, bold, italic, inline code, code blocks, blockquotes, ordered/unordered lists, links, and images.
- Recursively walks content/ so the site's URL structure mirrors the folder structure.
- Copies a static/ folder (CSS, images) verbatim into the output.
- Configurable basepath so the same code can be served at / locally and at /boots-static/ on GitHub Pages.
- Full unit-test suite using Python's built-in unittest.
boots-static/
├── build.sh                 # Build for GitHub Pages (uses /boots-static/ basepath)
├── main.sh                  # Build for local dev + start a local server on :8888
├── test.sh                  # Run the unit-test suite
├── template.html            # The single HTML layout used for every page
│
├── content/                 # Markdown source files (your site's pages)
│   ├── index.md
│   ├── contact/index.md
│   └── blog/
│       ├── glorfindel/index.md
│       ├── majesty/index.md
│       └── tom/index.md
│
├── static/                  # Files copied verbatim into the output
│   ├── index.css
│   └── images/
│       ├── glorfindel.png
│       ├── rivendell.png
│       ├── tolkien.png
│       └── tom.png
│
├── docs/                    # Build output, deployed to GitHub Pages (auto-generated)
│
└── src/                     # The static site generator itself
    ├── main.py              # Entry point: orchestrates the build
    ├── copystatic.py        # Recursively copies static/ -> docs/
    ├── gencontent.py        # Recursively converts content/*.md -> docs/*.html
    ├── textnode.py          # TextNode: an inline piece of text (bold, link, etc.)
    ├── htmlnode.py          # HTMLNode / LeafNode / ParentNode: the HTML tree
    ├── inline_markdown.py   # Inline parser: bold, italic, code, links, images
    ├── block_markdown.py    # Block parser: headings, lists, quotes, paragraphs
    └── test_*.py            # Unit tests for each module
You need Python 3 installed (no pip install required — it's pure stdlib).
git clone https://github.com/searse/boots-static.git
cd boots-static
./main.sh
main.sh will:
- Build the site into docs/.
- Start a local web server in docs/ at http://localhost:8888.
Open the URL in your browser and you'll see the demo site.
When you run python3 src/main.py, four things happen in order:
                        template.html
                             │
                             ▼
content/    ──parse──▶  fill template  ──▶  docs/**/*.html
(Markdown)    (Markdown → HTML via tiny custom parser)

static/     ──copy──▶   docs/   (CSS + images)
Step by step:
- Wipe docs/: guarantees a clean build.
- Copy static/ → docs/: CSS and images are static assets and pass straight through (copystatic.py).
- Walk content/ recursively: for every *.md file found, generate a matching *.html file at the same relative path inside docs/ (gencontent.py).
- For each Markdown file:
  - Parse the Markdown into an in-memory tree of HTML nodes.
  - Render the tree to an HTML string.
  - Extract the first # Heading and use it as the page <title>.
  - Substitute {{ Title }} and {{ Content }} in template.html.
  - Rewrite root-relative URLs (/foo) to use the configured basepath (e.g. /boots-static/foo) so links work on GitHub Pages.
  - Write the result to disk.
The interesting part of this project is the Markdown-to-HTML pipeline. It's split into two layers — an HTML node model and a two-pass parser — that work together. Here is how each piece fits.
Rather than building an HTML string character by character, the generator first builds a tree of objects that represents the page. This is essentially a tiny DOM.
There are three classes:
- HTMLNode: the abstract base class. Stores a tag ("p", "h1", ...), an optional value (text content), optional children, and optional props (HTML attributes like href or src).
- LeafNode: an HTML node with no children, only a text value. Used for <b>, <i>, <a>, <img>, or plain text.
- ParentNode: an HTML node that contains other nodes as children. Used for <p>, <ul>, <h2>, <blockquote>, etc.
Every node knows how to render itself as HTML via to_html(). A ParentNode does this by recursively calling to_html() on each of its children and concatenating the results — a classic tree-walking pattern.
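The three classes described above could be sketched roughly as follows. This is an illustration under assumptions, not the project's actual htmlnode.py: the constructor argument order, the props_to_html() helper, and the error handling are guesses, and no HTML escaping is attempted.

```python
class HTMLNode:
    """Base class: holds a tag, an optional value, children, and props."""

    def __init__(self, tag=None, value=None, children=None, props=None):
        self.tag = tag
        self.value = value
        self.children = children
        self.props = props

    def props_to_html(self):
        # Hypothetical helper: renders {"href": "/x"} as ' href="/x"'.
        if not self.props:
            return ""
        return "".join(f' {key}="{val}"' for key, val in self.props.items())

    def to_html(self):
        raise NotImplementedError("subclasses render themselves")


class LeafNode(HTMLNode):
    """A node with a value and no children."""

    def __init__(self, tag, value, props=None):
        super().__init__(tag, value, None, props)

    def to_html(self):
        if self.value is None:
            raise ValueError("a leaf node needs a value")
        if self.tag is None:
            # No tag means plain text.
            return self.value
        return f"<{self.tag}{self.props_to_html()}>{self.value}</{self.tag}>"


class ParentNode(HTMLNode):
    """A node whose content is the rendered output of its children."""

    def __init__(self, tag, children, props=None):
        super().__init__(tag, None, children, props)

    def to_html(self):
        if self.tag is None:
            raise ValueError("a parent node needs a tag")
        # Recursive step: each child renders itself, results are joined.
        inner = "".join(child.to_html() for child in self.children)
        return f"<{self.tag}{self.props_to_html()}>{inner}</{self.tag}>"
```

With classes along these lines, a small tree renders as in the following example: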
ParentNode("p", [
LeafNode(None, "Hello, "),
LeafNode("b", "world"),
LeafNode(None, "!"),
]).to_html()
# -> "<p>Hello, <b>world</b>!</p>"
Inside a paragraph, text is a mix of plain words, bold spans, italics, code, links, and images. To make parsing tractable, the generator first represents these inline elements as TextNode objects, a flat list of typed text spans:
from enum import Enum

class TextType(Enum):
    # Six members; their values are elided here.
    TEXT, BOLD, ITALIC, CODE, LINK, IMAGE = ...

TextNode("hello", TextType.TEXT)
TextNode("world", TextType.BOLD)
TextNode("Boot.dev", TextType.LINK, url="https://boot.dev")
text_node_to_html_node() then converts each TextNode into the appropriate LeafNode (e.g. BOLD → <b>, LINK → <a href="…">). This separation keeps the parser simple: it only worries about what kind of span each piece of text is, not about HTML tags or attributes.
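The conversion from typed spans to leaf nodes might look something like the sketch below. The enum values, the dataclass-based TextNode, and the stand-in LeafNode are assumptions made so the example runs on its own; the real modules may be organized differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TextType(Enum):
    # Member values are placeholders chosen for this sketch.
    TEXT = "text"
    BOLD = "bold"
    ITALIC = "italic"
    CODE = "code"
    LINK = "link"
    IMAGE = "image"


@dataclass
class TextNode:
    text: str
    text_type: TextType
    url: Optional[str] = None


@dataclass
class LeafNode:
    """Stand-in for the real LeafNode: a tag, a value, and attributes."""
    tag: Optional[str]
    value: str
    props: Optional[dict] = None


def text_node_to_html_node(node: TextNode) -> LeafNode:
    # Span types that map directly onto a tag (None means plain text).
    simple_tags = {
        TextType.TEXT: None,
        TextType.BOLD: "b",
        TextType.ITALIC: "i",
        TextType.CODE: "code",
    }
    if node.text_type in simple_tags:
        return LeafNode(simple_tags[node.text_type], node.text)
    if node.text_type is TextType.LINK:
        return LeafNode("a", node.text, {"href": node.url})
    if node.text_type is TextType.IMAGE:
        # For images the span text becomes the alt attribute.
        return LeafNode("img", "", {"src": node.url, "alt": node.text})
    raise ValueError(f"unsupported text type: {node.text_type}")
```

Keeping the tag mapping in one function means the splitters never need to know which HTML tag a span ends up as.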
This module turns a raw line of Markdown into a list of TextNodes by applying a series of splitters:
- split_nodes_delimiter: splits by paired delimiters (**bold**, _italic_, `code`). It walks each existing text node, splits on the delimiter, and tags the odd-indexed chunks as the new text type.
- split_nodes_image: uses a regex to find ![alt](url) and replaces them with IMAGE nodes.
- split_nodes_link: uses a regex to find [text](url) (skipping ones preceded by !) and replaces them with LINK nodes.
text_to_textnodes() chains these together in the right order. Once a node has been tagged as anything other than TEXT, later splitters leave it alone — that's how nested delimiters are avoided.
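To make the odd-index rule and the "leave tagged nodes alone" rule concrete, here is a simplified sketch. It uses plain (text, kind) tuples instead of TextNode objects, and both regular expressions are assumptions about what such patterns could look like rather than the ones in inline_markdown.py.

```python
import re

# Assumed patterns: capture (alt, url) for images and (text, url) for links.
IMAGE_RE = re.compile(r"!\[([^\[\]]*)\]\(([^\(\)]*)\)")
# The negative lookbehind skips matches that are preceded by "!".
LINK_RE = re.compile(r"(?<!!)\[([^\[\]]*)\]\(([^\(\)]*)\)")


def split_nodes_delimiter(nodes, delimiter, text_type):
    """Split plain-text spans on a paired delimiter.

    nodes is a list of (text, kind) tuples; kind "text" means untagged.
    """
    result = []
    for text, kind in nodes:
        if kind != "text":
            # Already tagged by an earlier splitter: pass through untouched.
            result.append((text, kind))
            continue
        parts = text.split(delimiter)
        if len(parts) % 2 == 0:
            # An even number of parts means a delimiter was never closed.
            raise ValueError(f"unclosed {delimiter!r} in {text!r}")
        for index, part in enumerate(parts):
            if part == "":
                continue
            # Odd-indexed chunks sat between a pair of delimiters.
            result.append((part, "text" if index % 2 == 0 else text_type))
    return result
```

Chaining two calls, first with `**` and then with a backtick, tags the bold span and then the code span, while the bold span passes through the second call unchanged.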
Markdown is also organized into blocks separated by blank lines (a paragraph, a list, a code fence, etc.). This module:
- markdown_to_blocks(): splits the whole document on \n\n and trims whitespace.
- block_to_block_type(): inspects each block and classifies it as one of HEADING, CODE, QUOTE, UNORDERED_LIST, ORDERED_LIST, or PARAGRAPH.
- A dedicated *_to_html_node function for each block type builds the right ParentNode (<h2>, <ul>, <pre><code>, <blockquote>, <p>, …), using the inline parser to fill in its children.
- markdown_to_html_node() wraps every block in a top-level <div> and returns it.
The end result of calling markdown_to_html_node(my_markdown).to_html() is a complete HTML fragment.
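The first two steps of the block parser can be sketched as below. The function names come from the description above, but the enum values and the exact classification rules are assumptions; the real block_markdown.py may check things in a different order or more strictly.

```python
import re
from enum import Enum


class BlockType(Enum):
    # Values are placeholders chosen for this sketch.
    PARAGRAPH = "paragraph"
    HEADING = "heading"
    CODE = "code"
    QUOTE = "quote"
    UNORDERED_LIST = "unordered_list"
    ORDERED_LIST = "ordered_list"


def markdown_to_blocks(markdown):
    """Split on blank lines, trim each block, and drop empty ones."""
    return [block.strip() for block in markdown.split("\n\n") if block.strip()]


def block_to_block_type(block):
    lines = block.split("\n")
    # 1 to 6 hashes followed by a space.
    if re.match(r"#{1,6} ", block):
        return BlockType.HEADING
    # Fenced code: first and last lines are both fences.
    if len(lines) > 1 and lines[0].startswith("```") and lines[-1].startswith("```"):
        return BlockType.CODE
    if all(line.startswith(">") for line in lines):
        return BlockType.QUOTE
    if all(line.startswith("- ") for line in lines):
        return BlockType.UNORDERED_LIST
    # Ordered lists must start at 1 and increment by one on every line.
    if all(line.startswith(f"{number}. ") for number, line in enumerate(lines, start=1)):
        return BlockType.ORDERED_LIST
    return BlockType.PARAGRAPH
```

Anything that matches none of the specific rules falls through to PARAGRAPH, so malformed lists or headings degrade to plain text instead of failing the build.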
template.html is a minimal layout with two placeholders:
<title>{{ Title }}</title>
...
<article>{{ Content }}</article>
For each content/**/*.md file:
- extract_title() finds the first # H1 heading in the Markdown and uses it as the title (raising an error if missing, since every page is required to have one).
- markdown_to_html_node() produces the HTML body.
- The placeholders are replaced.
- All root-relative URLs (href="/...", src="/...") are rewritten to start with the configured basepath (e.g. /boots-static/...) so the site works under a GitHub Pages subpath. Locally, the basepath defaults to /.
generate_pages_recursive() walks the content/ tree, mirroring the directory layout into docs/, swapping .md for .html on the way out.
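The per-page steps can be illustrated with a short sketch. extract_title() is named in the text above; fill_template() is a hypothetical helper introduced only for this example, and the plain string replacement used for the basepath rewrite is an assumption about how such a rewrite could be done.

```python
def extract_title(markdown):
    """Return the text of the first "# " heading, or raise if none exists."""
    for line in markdown.split("\n"):
        if line.startswith("# "):
            return line[2:].strip()
    raise ValueError("no h1 heading found")


def fill_template(template, title, content_html, basepath="/"):
    """Hypothetical helper: substitute placeholders, then prefix root-relative URLs."""
    page = template.replace("{{ Title }}", title)
    page = page.replace("{{ Content }}", content_html)
    # Rewrite href="/..." and src="/..." so they start with the basepath.
    page = page.replace('href="/', f'href="{basepath}')
    page = page.replace('src="/', f'src="{basepath}')
    return page
```

With the default basepath of / the two URL replacements leave the page unchanged, which is why the same code serves both the local and the GitHub Pages build.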
Before content is generated, main.py deletes docs/ and copy_files_recursive() copies the entire static/ tree into it. Anything in static/ (CSS, images, fonts, downloads) lands at the same relative path in the output.
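A recursive copy of this kind can be written with only os and shutil. The sketch below shares the copy_files_recursive name but is an assumption about its shape, not the contents of copystatic.py; it only copies and leaves deleting the old output to the caller.

```python
import os
import shutil


def copy_files_recursive(source_dir, dest_dir):
    """Copy every file under source_dir to the same relative path in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    for name in os.listdir(source_dir):
        source_path = os.path.join(source_dir, name)
        dest_path = os.path.join(dest_dir, name)
        if os.path.isfile(source_path):
            shutil.copy(source_path, dest_path)
        else:
            # Recurse into subdirectories such as images/.
            copy_files_recursive(source_path, dest_path)
```

The standard library's shutil.copytree() does the same job in one call; writing the recursion by hand mainly serves the teaching goal.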
main.py orchestrates the whole build:
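import shutil
import sys

# Optional first CLI argument: the URL prefix the site is served under.
basepath = sys.argv[1] if len(sys.argv) > 1 else "/"
shutil.rmtree("./docs", ignore_errors=True)  # start from a clean output folder
copy_files_recursive("./static", "./docs")
generate_pages_recursive("./content", "./template.html", "./docs", basepath)
That's the entire program: one line to read the basepath and three lines of logic.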
The hand-written parser supports the following constructs:
| Markdown | HTML output |
|---|---|
| # Heading 1 … ###### Heading 6 | <h1>…</h1> … <h6>…</h6> |
| Paragraph text | <p>…</p> |
| **bold** | <b>bold</b> |
| _italic_ | <i>italic</i> |
| `code` | <code>code</code> |
| Triple-backtick block | <pre><code>…</code></pre> |
| > quoted line (one or more lines) | <blockquote>…</blockquote> |
| - item lines | <ul><li>…</li></ul> |
| 1. item, 2. item, … | <ol><li>…</li></ol> |
| [text](url) | <a href="url">text</a> |
| ![alt](url) | <img src="url" alt="alt"> |
Blocks are separated by blank lines. Headings use # (with a trailing space) and must be 1–6 hashes. Ordered lists must start at 1. and increment by one.
This is a teaching project, so the parser is intentionally small. It does not implement the full CommonMark spec — features like nested lists, reference-style links, HTML passthrough, tables, or footnotes are not supported.
- Create a new folder under content/, e.g. content/about/.
- Put an index.md inside it. The file must contain a top-level # Heading, which becomes the page title.
- Link to your new page from any other Markdown file using a root-relative path:
  [About me](/about)
- Drop any images you want into static/images/ and reference them with the image syntax ![alt](url), using a root-relative URL under /images/.
- Run ./main.sh and refresh your browser.
Folders become URL paths automatically. For example, content/blog/tom/index.md is rendered to docs/blog/tom/index.html, served at /blog/tom/.
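The folder-to-URL mapping can be expressed in a few lines of pathlib. Both helpers below are hypothetical names introduced for illustration, and url_for() assumes every page is an index.md, as in the demo content.

```python
from pathlib import Path


def output_path_for(md_path, content_dir="content", dest_dir="docs"):
    """Mirror a Markdown file's relative path into the output folder as .html."""
    relative = Path(md_path).relative_to(content_dir)
    return Path(dest_dir) / relative.with_suffix(".html")


def url_for(md_path, content_dir="content", basepath="/"):
    """URL of the folder that holds an index.md page (assumes index.md pages)."""
    parent = Path(md_path).relative_to(content_dir).parent.as_posix()
    # The top-level index.md has parent "." and is served at the basepath itself.
    return basepath if parent == "." else f"{basepath}{parent}/"
```

For instance, `output_path_for("content/blog/tom/index.md")` gives `docs/blog/tom/index.html`, and `url_for` on the same file gives `/blog/tom/`.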
There are two build scripts:
- main.sh: builds with the default basepath (/) and serves it locally on port 8888. Use this while developing.
- build.sh: builds with basepath /boots-static/, which matches the URL prefix used by GitHub Pages for this repo (https://<user>.github.io/<repo>/). Use this before pushing for deployment.
This repo is configured to publish the docs/ folder on the main branch to GitHub Pages. Deploy flow:
./build.sh # generates docs/ with the /boots-static/ basepath
git add docs
git commit -m "Deploy: rebuild site"
git push
Within a minute or so, GitHub Pages will pick up the new docs/ content and publish it.
If you fork this project under a different repo name, change the basepath in build.sh to match your own repo (/<your-repo-name>/), or just leave it / if you're deploying to a root domain.
All non-trivial modules have unit tests using Python's built-in unittest framework. To run them:
./test.sh
# equivalent to:
python3 -m unittest discover -s src
The test files (src/test_*.py) are great places to look if you want concrete examples of how each module is meant to be called.
- Static site: A website made of pre-built HTML/CSS/JS files. No database or server-side code runs per request.
- Markdown: A lightweight plain-text format that's easy to write and converts cleanly to HTML.
- AST (abstract syntax tree): An in-memory tree representation of parsed content. Here, the HTMLNode tree is the AST of a page.
- Leaf node: A tree node with no children (just a value).
- Parent node: A tree node that contains other nodes.
- Recursion: A function calling itself to process tree-shaped or nested data. Used here both for walking the content/ folder and for rendering nested HTML nodes.
- Template / templating: Filling placeholders in a layout file ({{ Title }}, {{ Content }}) with real values.
- Basepath: A URL prefix the whole site lives under. GitHub Pages serves project sites from /<repo>/, so links need to be rewritten to include that prefix.
- GitHub Pages: A free static-hosting service built into GitHub that serves files directly from a branch/folder of your repository.
- Built for the Boot.dev course "Build a Static Site Generator in Python."
- Demo content ("Tolkien Fan Club") is the course's sample site; feel free to replace it with your own.