A multi-threaded HTTP/1.1 server built using pure Python and zero external dependencies — no Flask, Django, FastAPI, or http.server. Every layer of the stack is implemented by hand: the TCP acceptor loop, the byte-level protocol parser, the response serialiser, the URL router, the thread pool, and the observability layer.
Built as a deliberate exercise in understanding what web frameworks abstract away.
- Features
- Architecture
- Project Structure
- Running Locally (Python)
- Running with Docker
- API Endpoints
- Observability
- Running Tests
- Design Decisions
- Intentional Omissions
- Concepts Demonstrated
- Standards and References
- Raw TCP socket server — `socket.AF_INET`/`socket.SOCK_STREAM`, `SO_REUSEADDR`
- Byte-level HTTP/1.1 request parser (method, path, version, headers, body) — RFC 7230 compliant
- HTTP response serialiser with correct `Content-Length`, `Content-Type`, and `Connection: close` headers
- Decorator-based URL router with structured 404 / 500 fallbacks
- Custom fixed-size thread pool — producer-consumer pattern, `queue.Queue`, poison-pill shutdown
- Static file serving from the `www/` directory
- Thread-safe metrics collector — per-path and per-status request counters
- `/health` liveness probe — uptime + UTC timestamp for orchestrators
- `/metrics` instrumentation endpoint — request breakdown + live thread pool stats
- Client socket timeout — defends against slow connections holding worker threads hostage
- Graceful shutdown on `SIGINT`/`SIGTERM` — no leaked sockets or zombie threads
- `logging` module throughout — zero `print()` statements
- Full type hints on every function signature and class variable
- 25 unit tests covering the parser, router, and response serialiser
- Docker support — non-root container user, built-in `HEALTHCHECK`, `.dockerignore`
┌─────────────────────────────────────┐
│ main.py │
│ (routes, logging config, startup) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ server.py │
│ TCPServer — bind / listen / accept │
│ SO_REUSEADDR · 1 s accept timeout │
└──────────────┬──────────────────────┘
│ raw socket
┌──────────────▼──────────────────────┐
│ thread_pool.py │
│ ThreadPool — N worker threads │
│ queue.Queue · poison-pill shutdown │
└──────────────┬──────────────────────┘
│ task (closure)
┌───────────────────┼───────────────────────┐
│ │ │
┌───────────▼──────┐ ┌──────────▼──────────┐ ┌────────▼──────────┐
│ request.py │ │ router.py │ │ response.py │
│ parse_request() │ │ Router.dispatch() │ │ HTTPResponse │
│ bytes → dataclass│ │ path → handler func │ │ .to_bytes() │
└──────────────────┘  └─────────────────────┘  └───────────────────┘
│
┌──────────────▼──────────────────────┐
│ metrics.py │
│ MetricsCollector — thread-safe │
│ request counters + uptime clock │
└─────────────────────────────────────┘
Data flow for a single request:
Browser / curl
│
│ TCP connect → send raw HTTP bytes
▼
TCPServer.accept() # main thread; returns immediately
│
▼
ThreadPool.submit(task) # non-blocking enqueue
│
▼ (worker thread picks up task)
client_socket.settimeout(5.0) # guard against slow / stalled clients
recv(4096) # read bytes off the socket
│
▼
parse_request(raw_bytes) # bytes → HTTPRequest dataclass
│
▼
Router.dispatch(request) # path → handler → HTTPResponse
│
▼
metrics.record_request(...) # thread-safe counter increment
│
▼
HTTPResponse.to_bytes() # HTTPResponse → wire-format bytes
│
▼
socket.sendall(response) # flush to client
socket.close() # Connection: close — one request per connection
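The serialisation step at the end of that flow can be sketched as follows — an illustration of the wire format only, not the repo's exact `to_bytes()` implementation:

```python
def serialise(status: int, reason: str, body: bytes,
              content_type: str = "text/plain") -> bytes:
    """Build a minimal HTTP/1.1 response in wire format."""
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # exact byte count of the body
        f"Connection: close\r\n"            # one request per connection
        f"\r\n"                             # blank line ends the headers
    )
    return head.encode("ascii") + body

wire = serialise(200, "OK", b"hello")
```

Because `Connection: close` is always sent, the client knows the response ends when the socket closes, and the server never has to track request boundaries across a persistent connection.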
http-server/
├── main.py # Entry point — wires all components, defines routes
├── server.py # TCPServer — raw socket, accept loop
├── thread_pool.py # ThreadPool — fixed worker threads, queue-based dispatch
├── request.py # HTTP/1.1 parser — bytes → HTTPRequest dataclass
├── response.py # HTTP response builder — HTTPResponse + helper factories
├── router.py # URL router — path registry, dispatch, static file handler
├── metrics.py # MetricsCollector — thread-safe request counters + uptime
├── Dockerfile # Container definition — non-root user, HEALTHCHECK
├── .dockerignore # Excludes __pycache__, tests/, .env files from build context
├── www/
│ └── index.html # Static HTML page served at GET /about
└── tests/
└── test_request.py # 25 unit tests (parser · router · response)
- Python 3.9 or higher
- No external packages — standard library only
```bash
python main.py
```

Expected startup output:
```
2026-03-30T10:00:00 [DEBUG] metrics: MetricsCollector initialised
2026-03-30T10:00:00 [DEBUG] router: Registered route: / -> handle_home
2026-03-30T10:00:00 [DEBUG] router: Registered route: /api/hello -> handle_api_hello
2026-03-30T10:00:00 [DEBUG] router: Registered route: /about -> static:index.html
2026-03-30T10:00:00 [DEBUG] thread_pool: worker-0 started
...
2026-03-30T10:00:00 [INFO] thread_pool: ThreadPool started with 10 workers
2026-03-30T10:00:00 [INFO] __main__: Starting HTTP server on http://0.0.0.0:8080
2026-03-30T10:00:00 [INFO] __main__: Observability: http://localhost:8080/health | http://localhost:8080/metrics
2026-03-30T10:00:00 [INFO] server: Server listening on 0.0.0.0:8080
```
Press Ctrl-C. The server handles SIGINT, drains in-flight requests, and exits:
```
2026-03-30T10:00:05 [INFO] __main__: Signal 2 received — initiating graceful shutdown
2026-03-30T10:00:05 [INFO] thread_pool: ThreadPool shutting down (wait=True)...
2026-03-30T10:00:05 [INFO] thread_pool: ThreadPool shutdown complete
2026-03-30T10:00:05 [INFO] __main__: Server exited cleanly
```
Edit the constants at the top of main.py:
| Constant | Default | Description |
|---|---|---|
| `HOST` | `"0.0.0.0"` | Interface to bind to |
| `PORT` | `8080` | TCP port |
| `NUM_WORKERS` | `10` | Number of worker threads in the pool |
| `CLIENT_TIMEOUT` | `5.0` | Seconds before an idle client connection is dropped |
| `RECV_BUFFER` | `4096` | Bytes read per `recv()` call |
- Docker Desktop installed and running
- No other dependencies — Python does not need to be installed on the host
```bash
git clone https://github.com/ayan-04/http-server.git
cd http-server
docker build -t http-server:latest .
```

What happens during the build (6 layers):
```
[1/6] FROM python:3.12-slim    → pull minimal Python base image (~50 MB)
[2/6] RUN addgroup / adduser   → create non-root system user (appuser)
[3/6] WORKDIR /app             → set working directory inside the container
[4/6] COPY *.py ./             → copy all Python source modules
[5/6] COPY www/ ./www/         → copy static files
[6/6] RUN chown -R appuser /app → transfer file ownership to non-root user
```
> Why `COPY *.py` instead of `COPY . .`? Explicit copying prevents accidentally baking `.env` files, credentials, or other sensitive files from the build context into the image — a common real-world security mistake.
```bash
docker run -d \
  --name http-server \
  -p 8080:8080 \
  http-server:latest
```

| Flag | Meaning |
|---|---|
| `-d` | Run in the background (detached mode) |
| `--name http-server` | Give the container a memorable name |
| `-p 8080:8080` | Map host port 8080 to container port 8080 |
```bash
docker ps
```

```
CONTAINER ID   IMAGE                STATUS                    PORTS
72aedcc30081   http-server:latest   Up 10 seconds (healthy)   0.0.0.0:8080->8080/tcp
```
The `(healthy)` status means Docker's built-in `HEALTHCHECK` has successfully called `GET /health` and received a `200 OK` response.
> Note: The status reads `(health: starting)` for the first 10 seconds (the `--start-period` in the `HEALTHCHECK` instruction). This is normal.
```bash
curl http://localhost:8080/            # plain-text home page
curl http://localhost:8080/api/hello   # JSON response
curl http://localhost:8080/about       # static HTML page
curl http://localhost:8080/health      # liveness probe
curl http://localhost:8080/metrics     # instrumentation
```

```bash
# Tail live logs (Ctrl-C to stop following)
docker logs -f http-server
```

Each request produces several log lines, including an INFO line from the router and DEBUG lines from the acceptor, parser, and response serialiser:
```
2026-03-30T10:00:06 [DEBUG] server: Accepted connection from 172.17.0.1:54320
2026-03-30T10:00:06 [DEBUG] request: Parsed request: GET /api/hello HTTP/1.1
2026-03-30T10:00:06 [INFO] router: GET /api/hello 200
2026-03-30T10:00:06 [DEBUG] response: Sending response: 200 OK (133 bytes body)
```
```bash
# Show health status and the last 5 probe results
docker inspect --format='{{json .State.Health}}' http-server | python -m json.tool
```

```json
{
    "Status": "healthy",
    "FailingStreak": 0,
    "Log": [
        {
            "Start": "2026-03-30T10:00:40Z",
            "End": "2026-03-30T10:00:40Z",
            "ExitCode": 0,
            "Output": ""
        }
    ]
}
```

```bash
docker stop http-server
```

`docker stop` sends SIGTERM to the process. Because our CMD uses exec form (`["python", "main.py"]` — not a shell wrapper), `main.py` receives the signal directly and the graceful shutdown handler runs:
```
Signal 15 received — initiating graceful shutdown
ThreadPool shutting down (wait=True)...
ThreadPool shutdown complete
Server exited cleanly
```
Docker waits up to 10 seconds for a clean exit before sending SIGKILL. Our
server typically exits in under 2 seconds.
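The signal wiring described above can be sketched in stdlib Python. This is a minimal illustration rather than the exact code in main.py; `shutdown` stands in for the real pool-and-socket teardown:

```python
import signal
import sys

def install_shutdown_handler(shutdown) -> None:
    """Run shutdown() on SIGINT or SIGTERM, then exit cleanly.

    Works only because the Python process itself receives the signal —
    the reason the Dockerfile uses exec-form CMD instead of a shell.
    """
    def _handler(signum, frame):
        shutdown()     # drain the thread pool, close the listening socket
        sys.exit(0)    # unwind the main thread

    signal.signal(signal.SIGINT, _handler)
    signal.signal(signal.SIGTERM, _handler)
```

Registering the same handler for both signals means Ctrl-C during local development and `docker stop` in production take the identical shutdown path.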
```bash
docker rm http-server            # remove stopped container
docker rmi http-server:latest    # remove the image
```

To rebuild and rerun from scratch:

```bash
docker stop http-server && docker rm http-server
docker build -t http-server:latest .
docker run -d --name http-server -p 8080:8080 http-server:latest
```

Port and worker count can be changed without rebuilding the image by passing environment variables. Add handling in main.py:
environment variables. Add handling in main.py:
```python
PORT = int(os.environ.get("PORT", 8080))
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", 10))
```

Then run with:

```bash
docker run -d --name http-server \
  -p 9000:9000 \
  -e PORT=9000 \
  -e NUM_WORKERS=20 \
  http-server:latest
```

If you want to hand the exact built image to someone else without pushing to a registry:
```bash
# On your machine — export the image to a tar archive
docker save http-server:latest -o http-server.tar

# On the recipient's machine — load and run it
docker load -i http-server.tar
docker run -d --name http-server -p 8080:8080 http-server:latest
```

| Decision | Reason |
|---|---|
| `python:3.12-slim` base | ~50 MB vs ~1 GB for the full image. No build tools needed since there are no C extensions. |
| Non-root `appuser` | Containers running as root give a compromised process full host access. A dedicated system user limits blast radius. |
| Explicit `COPY *.py` | Prevents secrets (`.env`, credentials) from being baked into the image inadvertently. |
| `HEALTHCHECK` via stdlib `urllib` | No curl in the slim image. Python stdlib covers the probe without adding any packages or image size. |
| `CMD ["python", "main.py"]` exec form | Exec form means the Python process is PID 1 and receives SIGTERM directly. Shell form (`CMD python main.py`) would make `sh` PID 1 and the signal would never reach the application. |
| `--start-period=10s` | Gives the server time to bind and start accepting before health probes begin — prevents false unhealthy marks on slow hosts. |
Plain-text home page listing all routes.
```bash
curl http://localhost:8080/
```

```
Welcome to the HTTP/1.1 server built from scratch!

Available routes:
  GET /           -> this page
  GET /api/hello  -> JSON greeting
  GET /about      -> HTML static page

Observability:
  GET /health     -> liveness check (status + uptime)
  GET /metrics    -> request counters + thread pool stats
```
JSON response demonstrating structured data serialisation.
```bash
curl http://localhost:8080/api/hello
```

```json
{
    "message": "Hello from the scratch-built HTTP server!",
    "method": "GET",
    "path": "/api/hello",
    "server": "python-raw/1.0"
}
```

Static HTML page read from `www/index.html` with `Content-Type: text/html`.
```bash
curl http://localhost:8080/about
# or open in a browser
```

| Scenario | Status |
|---|---|
| Path not registered | 404 Not Found |
| Malformed HTTP bytes | 400 Bad Request |
| Handler raises an exception | 500 Internal Server Error |
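The fallback behaviour in the table can be sketched with a minimal dispatch loop. `RouterSketch` is a hypothetical, simplified stand-in that returns `(status, body)` tuples instead of the repo's `HTTPResponse` objects:

```python
class RouterSketch:
    """Decorator-based path registry with 404 / 500 fallbacks."""

    def __init__(self) -> None:
        self._routes = {}

    def route(self, path: str):
        """Register a handler: @router.route("/api/hello")."""
        def register(handler):
            self._routes[path] = handler
            return handler
        return register

    def dispatch(self, path: str):
        handler = self._routes.get(path)
        if handler is None:
            return 404, f"Not Found: {path}"     # path not registered
        try:
            return 200, handler()
        except Exception:
            return 500, "Internal Server Error"  # handler raised

router = RouterSketch()

@router.route("/api/hello")
def handle_hello():
    return "hello"
```

Catching `Exception` at the dispatch boundary is what turns an arbitrary handler bug into a structured 500 response instead of a dropped connection.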
The server exposes two dedicated observability endpoints. Neither endpoint is counted in the metrics — health-check probes should not inflate application traffic statistics.
Purpose: Liveness probe. Used by Docker, Kubernetes, AWS ECS, and load balancers to decide whether this instance is alive and should receive traffic. A non-200 response triggers a restart or replacement.
```bash
curl http://localhost:8080/health
```

```json
{
    "status": "ok",
    "uptime_seconds": 142.3,
    "timestamp": "2026-03-30T10:02:22Z"
}
```

| Field | Description |
|---|---|
| `status` | Always `"ok"` while the process is running |
| `uptime_seconds` | Seconds since the server started — useful for detecting unexpected restarts |
| `timestamp` | UTC ISO-8601 — lets log aggregators correlate health events with application logs |
Purpose: Instrumentation endpoint. Shows request volume, error rates, and live thread pool state. A persistently non-zero `queue_depth` is a signal that `NUM_WORKERS` needs to be increased.
```bash
curl http://localhost:8080/metrics
```

```json
{
    "uptime_seconds": 142.8,
    "requests_total": 58,
    "requests_by_status": {
        "200": 55,
        "404": 3
    },
    "requests_by_path": {
        "/api/hello": 40,
        "/": 15,
        "/nope": 3
    },
    "thread_pool": {
        "total_workers": 10,
        "active_workers": 2,
        "idle_workers": 8,
        "queue_depth": 0
    }
}
```

| Field | Description |
|---|---|
| `requests_total` | Total requests handled since startup (excludes `/health` and `/metrics`) |
| `requests_by_status` | Breakdown by HTTP status code — fast way to spot error spikes |
| `requests_by_path` | Sorted highest-count first — shows which routes bear the most load |
| `thread_pool.active_workers` | Workers currently executing a task |
| `thread_pool.idle_workers` | Workers blocked on `queue.get()` waiting for work |
| `thread_pool.queue_depth` | Requests waiting for a free thread — non-zero = pool is undersized |
The test suite uses Python's built-in `unittest` — no external test runner needed.

```bash
python -m unittest tests/test_request.py -v
```

Expected output:
```
test_404_reason_phrase (TestHTTPResponse) ... ok
test_blank_line_before_body (TestHTTPResponse) ... ok
test_connection_close_header_present (TestHTTPResponse) ... ok
test_content_length_header_matches_body (TestHTTPResponse) ... ok
test_status_line_format (TestHTTPResponse) ... ok
test_no_exception_on_arbitrary_garbage (TestParseRequest) ... ok
test_parses_headers_into_lowercase_dict (TestParseRequest) ... ok
test_parses_http_version (TestParseRequest) ... ok
test_parses_method_from_valid_get_request (TestParseRequest) ... ok
test_parses_nested_path (TestParseRequest) ... ok
test_parses_path_from_valid_get_request (TestParseRequest) ... ok
test_parses_post_request_with_body (TestParseRequest) ... ok
test_parses_root_path (TestParseRequest) ... ok
test_returns_400_for_empty_bytes (TestParseRequest) ... ok
test_returns_400_for_garbage_bytes (TestParseRequest) ... ok
test_returns_400_for_lowercase_method (TestParseRequest) ... ok
test_returns_400_for_malformed_request_line (TestParseRequest) ... ok
test_returns_400_for_missing_slash_in_path (TestParseRequest) ... ok
test_returns_400_when_header_separator_missing (TestParseRequest) ... ok
test_404_body_contains_path (TestRouter) ... ok
test_add_route_registers_handler (TestRouter) ... ok
test_dispatches_to_registered_route (TestRouter) ... ok
test_handler_exception_returns_500 (TestRouter) ... ok
test_returns_404_for_root_when_not_registered (TestRouter) ... ok
test_returns_404_for_unknown_path (TestRouter) ... ok

Ran 25 tests in 0.018s

OK
```
| Test Suite | What is tested |
|---|---|
| `TestParseRequest` | Valid GET/POST, header normalisation, malformed input, garbage bytes |
| `TestRouter` | Dispatch, 404 fallback, 500 on handler exception, `add_route` |
| `TestHTTPResponse` | Status line format, Content-Length, blank-line separator, reason phrases |
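Tests in this style are plain `unittest` with no fixtures or mocks. The sketch below shows the shape using a hypothetical stand-in for `HTTPResponse.to_bytes()` — it is not the repo's actual test code:

```python
import unittest

def to_bytes(status: int, reason: str, body: bytes) -> bytes:
    # Hypothetical stand-in for HTTPResponse.to_bytes()
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Length: {len(body)}\r\n"
        f"\r\n"
    )
    return head.encode("ascii") + body

class TestToBytes(unittest.TestCase):
    def test_status_line_format(self):
        wire = to_bytes(200, "OK", b"")
        self.assertTrue(wire.startswith(b"HTTP/1.1 200 OK\r\n"))

    def test_content_length_header_matches_body(self):
        wire = to_bytes(200, "OK", b"hello")
        self.assertIn(b"Content-Length: 5\r\n", wire)
```

Because the serialiser and parser are pure functions of bytes, every assertion works on plain byte strings — no sockets, threads, or mocking frameworks needed.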
Every non-obvious choice in the codebase is documented. The most important ones:
```python
self._server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
```

Without this, after a crash or fast restart the OS keeps the port in `TIME_WAIT` for ~60 seconds and every restart fails with "Address already in use". Setting `SO_REUSEADDR` lets the server rebind immediately.
```python
self._server_socket.settimeout(1.0)
```

`accept()` blocks the thread indefinitely waiting for a connection. Without a timeout, Ctrl-C is silently swallowed. A 1-second timeout lets the accept loop check the `_running` flag regularly, making shutdown always responsive.
```python
client_socket.settimeout(CLIENT_TIMEOUT)  # default: 5.0 s
```

A client that opens a TCP connection but never sends data holds a worker thread hostage until it eventually disconnects — a basic form of resource exhaustion (slowloris style). The per-client timeout releases the thread after 5 seconds regardless of client behaviour.
```python
for _ in self._workers:
    self._task_queue.put(None)  # one sentinel per thread
```

Workers block on `queue.get()` at zero CPU cost when idle. Sending exactly one `None` per worker guarantees each thread wakes up and exits exactly once — even if some workers are currently processing tasks. This is simpler and more reliable than a shared `threading.Event` because it travels through the same queue as real work.
```python
with self._workers_lock:
    self._active_workers += 1
try:
    task()
finally:
    with self._workers_lock:
        self._active_workers -= 1
```

The `finally` clause guarantees the count is decremented even if the task raises. The lock window is microseconds — only the counter increment/decrement, not the task itself. This keeps lock contention negligible while giving the `/metrics` endpoint accurate live data.
```python
headers[name.strip().lower()] = value.strip()
```

RFC 7230 §3.2 states HTTP header names are case-insensitive. Normalising at parse time means every handler uses `request.headers["content-type"]` without worrying about `Content-Type` vs `CONTENT-TYPE` vs `content-type`.
```python
@dataclass
class HTTPRequest:
    error: Optional[int] = None
```

The parser never raises. A malformed request returns an `HTTPRequest` with `error=400`. This keeps error handling at the call site rather than scattered across try/except blocks in the call stack, and makes the parser trivial to unit test.
```python
server = TCPServer(host=HOST, port=PORT, connection_handler=enqueue_connection)
```

`TCPServer` knows nothing about threading or HTTP. The handler is injected at construction, so the same server can accept a synchronous handler in tests or swap the thread pool for a process pool without touching server.py.
```python
_INTERNAL_PATHS: frozenset[str] = frozenset({"/health", "/metrics"})
```

Docker and Kubernetes send a health probe every 30 seconds. Without exclusion, those probes would dwarf real application traffic in the `requests_by_path` breakdown and make the metrics meaningless. Internal paths are filtered silently at record time.
| Feature | Why omitted |
|---|---|
| Keep-Alive / persistent connections | Requires tracking `Content-Length` or `Transfer-Encoding: chunked` to find request boundaries. Significant complexity for minimal gain at this scale. |
| Chunked transfer encoding | Not needed when every response has a known size at serialisation time. |
| URL query-string parsing | Out of scope; none of the demo routes use query parameters. |
| CORS headers | Application-layer concern; belongs in handlers, not the server core. |
| HTTPS / TLS | Would require `ssl.SSLContext.wrap_socket()` and certificate management, obscuring the core HTTP concepts. |
| Async I/O (`asyncio`) | Threading is more readable for an educational implementation. Async is the right choice for 10,000+ concurrent connections — out of scope here. |
| Concept | Where |
|---|---|
| TCP socket programming | server.py |
| HTTP/1.1 wire protocol (RFC 7230 / 7231) | request.py, response.py |
| Producer-consumer concurrency | thread_pool.py |
| Thread-safe shared state with `threading.Lock` | `metrics.py`, `thread_pool.py` |
| Separation of concerns | Each module has a single responsibility |
| Dependency injection | TCPServer(connection_handler=...) |
| Graceful shutdown (OS signal handling) | main.py |
| Defensive parsing — never crash on bad input | request.py |
| Structured logging / observability | All modules via logging |
| Liveness and readiness probes | /health, /metrics endpoints |
| Type safety — full type hints | All function signatures and class variables |
| Unit testing without mocking frameworks | tests/test_request.py |
| Containerisation — non-root, `HEALTHCHECK`, `.dockerignore` | `Dockerfile` |
- RFC 7230 — HTTP/1.1 Message Syntax and Routing
- RFC 7231 — HTTP/1.1 Semantics and Content
- Python `socket` documentation
- Python `threading` documentation
- Docker best practices — non-root users
- Docker `HEALTHCHECK` reference