A multi-threaded HTTP/1.1 server built using pure Python and zero external dependencies — no Flask, Django, FastAPI, or http.server. Every layer of the stack is implemented by hand: the TCP acceptor loop, the byte-level protocol parser, the response serialiser, the URL router, the thread pool, and the observability layer.
Built as a deliberate exercise in understanding what web frameworks abstract away.
- Features
- Architecture
- Project Structure
- Running Locally (Python)
- Running with Docker
- API Endpoints
- Observability
- Running Tests
- Design Decisions
- Intentional Omissions
- Concepts Demonstrated
- Standards and References
- Raw TCP socket server — `socket.AF_INET`/`socket.SOCK_STREAM`, `SO_REUSEADDR`
- Byte-level HTTP/1.1 request parser (method, path, version, headers, body) — RFC 7230 compliant
- HTTP response serialiser with correct `Content-Length`, `Content-Type`, and `Connection: close` headers
- Decorator-based URL router with structured 404 / 500 fallbacks
- Custom fixed-size thread pool — producer-consumer pattern, `queue.Queue`, poison-pill shutdown
- Static file serving from the `www/` directory
- Thread-safe metrics collector — per-path and per-status request counters
- `/health` liveness probe — uptime + UTC timestamp for orchestrators
- `/metrics` instrumentation endpoint — request breakdown + live thread pool stats
- Client socket timeout — defends against slow connections holding worker threads hostage
- Graceful shutdown on `SIGINT`/`SIGTERM` — no leaked sockets or zombie threads
- `logging` module throughout — zero `print()` statements
- Full type hints on every function signature and class variable
- 25 unit tests covering the parser, router, and response serialiser
- Docker support — non-root container user, built-in `HEALTHCHECK`, `.dockerignore`
┌─────────────────────────────────────┐
│ main.py │
│ (routes, logging config, startup) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ server.py │
│ TCPServer — bind / listen / accept │
│ SO_REUSEADDR · 1 s accept timeout │
└──────────────┬──────────────────────┘
│ raw socket
┌──────────────▼──────────────────────┐
│ thread_pool.py │
│ ThreadPool — N worker threads │
│ queue.Queue · poison-pill shutdown │
└──────────────┬──────────────────────┘
│ task (closure)
┌───────────────────┼───────────────────────┐
│ │ │
┌───────────▼──────┐ ┌──────────▼──────────┐ ┌────────▼──────────┐
│ request.py │ │ router.py │ │ response.py │
│ parse_request() │ │ Router.dispatch() │ │ HTTPResponse │
│ bytes → dataclass│ │ path → handler func │ │ .to_bytes() │
└──────────────────┘  └─────────────────────┘  └───────────────────┘
│
┌──────────────▼──────────────────────┐
│ metrics.py │
│ MetricsCollector — thread-safe │
│ request counters + uptime clock │
└─────────────────────────────────────┘
Data flow for a single request:
Browser / curl
│
│ TCP connect → send raw HTTP bytes
▼
TCPServer.accept() # main thread; returns immediately
│
▼
ThreadPool.submit(task) # non-blocking enqueue
│
▼ (worker thread picks up task)
client_socket.settimeout(5.0) # guard against slow / stalled clients
recv(4096) # read bytes off the socket
│
▼
parse_request(raw_bytes) # bytes → HTTPRequest dataclass
│
▼
Router.dispatch(request) # path → handler → HTTPResponse
│
▼
metrics.record_request(...) # thread-safe counter increment
│
▼
HTTPResponse.to_bytes() # HTTPResponse → wire-format bytes
│
▼
socket.sendall(response) # flush to client
socket.close() # Connection: close — one request per connection
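The serialisation step at the end of that flow can be sketched as follows — an illustration of the wire format only, not the repo's exact `to_bytes()` implementation:

```python
def serialise(status: int, reason: str, body: bytes,
              content_type: str = "text/plain") -> bytes:
    """Build a minimal HTTP/1.1 response in wire format."""
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # exact byte count of the body
        f"Connection: close\r\n"            # one request per connection
        f"\r\n"                             # blank line ends the headers
    )
    return head.encode("ascii") + body

wire = serialise(200, "OK", b"hello")
```

Because `Connection: close` is always sent, the client knows the response ends when the socket closes, and the server never has to track request boundaries across a persistent connection.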
http-server/
├── main.py # Entry point — wires all components, defines routes
├── server.py # TCPServer — raw socket, accept loop
├── thread_pool.py # ThreadPool — fixed worker threads, queue-based dispatch
├── request.py # HTTP/1.1 parser — bytes → HTTPRequest dataclass
├── response.py # HTTP response builder — HTTPResponse + helper factories
├── router.py # URL router — path registry, dispatch, static file handler
├── metrics.py # MetricsCollector — thread-safe request counters + uptime
├── Dockerfile # Container definition — non-root user, HEALTHCHECK
├── .dockerignore # Excludes __pycache__, tests/, .env files from build context
├── www/
│ └── index.html # Static HTML page served at GET /about
└── tests/
└── test_request.py # 25 unit tests (parser · router · response)
- Python 3.9 or higher
- No external packages — standard library only
```bash
python main.py
```

Expected startup output:
```
2026-03-30T10:00:00 [DEBUG] metrics: MetricsCollector initialised
2026-03-30T10:00:00 [DEBUG] router: Registered route: / -> handle_home
2026-03-30T10:00:00 [DEBUG] router: Registered route: /api/hello -> handle_api_hello
2026-03-30T10:00:00 [DEBUG] router: Registered route: /about -> static:index.html
2026-03-30T10:00:00 [DEBUG] thread_pool: worker-0 started
...
2026-03-30T10:00:00 [INFO] thread_pool: ThreadPool started with 10 workers
2026-03-30T10:00:00 [INFO] __main__: Starting HTTP server on http://0.0.0.0:8080
2026-03-30T10:00:00 [INFO] __main__: Observability: http://localhost:8080/health | http://localhost:8080/metrics
2026-03-30T10:00:00 [INFO] server: Server listening on 0.0.0.0:8080
```
Press Ctrl-C. The server handles SIGINT, drains in-flight requests, and exits:
```
2026-03-30T10:00:05 [INFO] __main__: Signal 2 received — initiating graceful shutdown
2026-03-30T10:00:05 [INFO] thread_pool: ThreadPool shutting down (wait=True)...
2026-03-30T10:00:05 [INFO] thread_pool: ThreadPool shutdown complete
2026-03-30T10:00:05 [INFO] __main__: Server exited cleanly
```
Edit the constants at the top of main.py:
| Constant | Default | Description |
|---|---|---|
| `HOST` | `"0.0.0.0"` | Interface to bind to |
| `PORT` | `8080` | TCP port |
| `NUM_WORKERS` | `10` | Number of worker threads in the pool |
| `CLIENT_TIMEOUT` | `5.0` | Seconds before an idle client connection is dropped |
| `RECV_BUFFER` | `4096` | Bytes read per `recv()` call |
- Docker Desktop installed and running
- No other dependencies — Python does not need to be installed on the host
```bash
git clone https://github.com/ayan-04/http-server.git
cd http-server
docker build -t http-server:latest .
```

What happens during the build (6 layers):
```
[1/6] FROM python:3.12-slim    → pull minimal Python base image (~50 MB)
[2/6] RUN addgroup / adduser   → create non-root system user (appuser)
[3/6] WORKDIR /app             → set working directory inside the container
[4/6] COPY *.py ./             → copy all Python source modules
[5/6] COPY www/ ./www/         → copy static files
[6/6] RUN chown -R appuser /app → transfer file ownership to non-root user
```
> Why `COPY *.py` instead of `COPY . .`? Explicit copying prevents accidentally baking `.env` files, credentials, or other sensitive files from the build context into the image — a common real-world security mistake.
```bash
docker run -d \
  --name http-server \
  -p 8080:8080 \
  http-server:latest
```

| Flag | Meaning |
|---|---|
| `-d` | Run in the background (detached mode) |
| `--name http-server` | Give the container a memorable name |
| `-p 8080:8080` | Map host port 8080 to container port 8080 |
```bash
docker ps
```

```
CONTAINER ID   IMAGE                STATUS                    PORTS
72aedcc30081   http-server:latest   Up 10 seconds (healthy)   0.0.0.0:8080->8080/tcp
```
The `(healthy)` status means Docker's built-in `HEALTHCHECK` has successfully called `GET /health` and received a `200 OK` response.
> Note: The status reads `(health: starting)` for the first 10 seconds (the `--start-period` in the `HEALTHCHECK` instruction). This is normal.
```bash
curl http://localhost:8080/            # plain-text home page
curl http://localhost:8080/api/hello   # JSON response
curl http://localhost:8080/about       # static HTML page
curl http://localhost:8080/health      # liveness probe
curl http://localhost:8080/metrics     # instrumentation
```

```bash
# Tail live logs (Ctrl-C to stop following)
docker logs -f http-server
```

Each request produces several log lines, including an INFO line from the router and DEBUG lines from the acceptor, parser, and response serialiser:
```
2026-03-30T10:00:06 [DEBUG] server: Accepted connection from 172.17.0.1:54320
2026-03-30T10:00:06 [DEBUG] request: Parsed request: GET /api/hello HTTP/1.1
2026-03-30T10:00:06 [INFO] router: GET /api/hello 200
2026-03-30T10:00:06 [DEBUG] response: Sending response: 200 OK (133 bytes body)
```
```bash
# Show health status and the last 5 probe results
docker inspect --format='{{json .State.Health}}' http-server | python -m json.tool
```

```json
{
    "Status": "healthy",
    "FailingStreak": 0,
    "Log": [
        {
            "Start": "2026-03-30T10:00:40Z",
            "End": "2026-03-30T10:00:40Z",
            "ExitCode": 0,
            "Output": ""
        }
    ]
}
```

```bash
docker stop http-server
```

`docker stop` sends SIGTERM to the process. Because our CMD uses exec form (`["python", "main.py"]` — not a shell wrapper), `main.py` receives the signal directly and the graceful shutdown handler runs:
```
Signal 15 received — initiating graceful shutdown
ThreadPool shutting down (wait=True)...
ThreadPool shutdown complete
Server exited cleanly
```
Docker waits up to 10 seconds for a clean exit before sending SIGKILL. Our
server typically exits in under 2 seconds.
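The signal wiring described above can be sketched in stdlib Python. This is a minimal illustration rather than the exact code in main.py; `shutdown` stands in for the real pool-and-socket teardown:

```python
import signal
import sys

def install_shutdown_handler(shutdown) -> None:
    """Run shutdown() on SIGINT or SIGTERM, then exit cleanly.

    Works only because the Python process itself receives the signal —
    the reason the Dockerfile uses exec-form CMD instead of a shell.
    """
    def _handler(signum, frame):
        shutdown()     # drain the thread pool, close the listening socket
        sys.exit(0)    # unwind the main thread

    signal.signal(signal.SIGINT, _handler)
    signal.signal(signal.SIGTERM, _handler)
```

Registering the same handler for both signals means Ctrl-C during local development and `docker stop` in production take the identical shutdown path.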
```bash
docker rm http-server            # remove stopped container
docker rmi http-server:latest    # remove the image
```

To rebuild and rerun from scratch:

```bash
docker stop http-server && docker rm http-server
docker build -t http-server:latest .
docker run -d --name http-server -p 8080:8080 http-server:latest
```

Port and worker count can be changed without rebuilding the image by passing environment variables. Add handling in main.py:
environment variables. Add handling in main.py:
```python
PORT = int(os.environ.get("PORT", 8080))
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", 10))
```

Then run with:

```bash
docker run -d --name http-server \
  -p 9000:9000 \
  -e PORT=9000 \
  -e NUM_WORKERS=20 \
  http-server:latest
```

If you want to hand the exact built image to someone else without pushing to a registry:
```bash
# On your machine — export the image to a tar archive
docker save http-server:latest -o http-server.tar

# On the recipient's machine — load and run it
docker load -i http-server.tar
docker run -d --name http-server -p 8080:8080 http-server:latest
```

| Decision | Reason |
|---|---|
| `python:3.12-slim` base | ~50 MB vs ~1 GB for the full image. No build tools needed since there are no C extensions. |
| Non-root `appuser` | Containers running as root give a compromised process full host access. A dedicated system user limits blast radius. |
| Explicit `COPY *.py` | Prevents secrets (`.env`, credentials) from being baked into the image inadvertently. |
| `HEALTHCHECK` via stdlib `urllib` | No curl in the slim image. Python stdlib covers the probe without adding any packages or image size. |
| `CMD ["python", "main.py"]` exec form | Exec form means the Python process is PID 1 and receives SIGTERM directly. Shell form (`CMD python main.py`) would make `sh` PID 1 and the signal would never reach the application. |
| `--start-period=10s` | Gives the server time to bind and start accepting before health probes begin — prevents false unhealthy marks on slow hosts. |
Plain-text home page listing all routes.
```bash
curl http://localhost:8080/
```

```
Welcome to the HTTP/1.1 server built from scratch!

Available routes:
  GET /           -> this page
  GET /api/hello  -> JSON greeting
  GET /about      -> HTML static page

Observability:
  GET /health     -> liveness check (status + uptime)
  GET /metrics    -> request counters + thread pool stats
```
JSON response demonstrating structured data serialisation.
```bash
curl http://localhost:8080/api/hello
```

```json
{
    "message": "Hello from the scratch-built HTTP server!",
    "method": "GET",
    "path": "/api/hello",
    "server": "python-raw/1.0"
}
```

Static HTML page read from `www/index.html` with `Content-Type: text/html`.
```bash
curl http://localhost:8080/about
# or open in a browser
```

| Scenario | Status |
|---|---|
| Path not registered | 404 Not Found |
| Malformed HTTP bytes | 400 Bad Request |
| Handler raises an exception | 500 Internal Server Error |
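The fallback behaviour in the table can be sketched with a minimal dispatch loop. `RouterSketch` is a hypothetical, simplified stand-in that returns `(status, body)` tuples instead of the repo's `HTTPResponse` objects:

```python
class RouterSketch:
    """Decorator-based path registry with 404 / 500 fallbacks."""

    def __init__(self) -> None:
        self._routes = {}

    def route(self, path: str):
        """Register a handler: @router.route("/api/hello")."""
        def register(handler):
            self._routes[path] = handler
            return handler
        return register

    def dispatch(self, path: str):
        handler = self._routes.get(path)
        if handler is None:
            return 404, f"Not Found: {path}"     # path not registered
        try:
            return 200, handler()
        except Exception:
            return 500, "Internal Server Error"  # handler raised

router = RouterSketch()

@router.route("/api/hello")
def handle_hello():
    return "hello"
```

Catching `Exception` at the dispatch boundary is what turns an arbitrary handler bug into a structured 500 response instead of a dropped connection.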
The server exposes two dedicated observability endpoints. Neither endpoint is counted in the metrics — health-check probes should not inflate application traffic statistics.
Purpose: Liveness probe. Used by Docker, Kubernetes, AWS ECS, and load balancers to decide whether this instance is alive and should receive traffic. A non-200 response triggers a restart or replacement.
```bash
curl http://localhost:8080/health
```

```json
{
    "status": "ok",
    "uptime_seconds": 142.3,
    "timestamp": "2026-03-30T10:02:22Z"
}
```

| Field | Description |
|---|---|
| `status` | Always `"ok"` while the process is running |
| `uptime_seconds` | Seconds since the server started — useful for detecting unexpected restarts |
| `timestamp` | UTC ISO-8601 — lets log aggregators correlate health events with application logs |
Purpose: Instrumentation endpoint. Shows request volume, error rates, and live thread pool state. A persistently non-zero `queue_depth` is a signal that `NUM_WORKERS` needs to be increased.
```bash
curl http://localhost:8080/metrics
```

```json
{
    "uptime_seconds": 142.8,
    "requests_total": 58,
    "requests_by_status": {
        "200": 55,
        "404": 3
    },
    "requests_by_path": {
        "/api/hello": 40,
        "/": 15,
        "/nope": 3
    },
    "thread_pool": {
        "total_workers": 10,
        "active_workers": 2,
        "idle_workers": 8,
        "queue_depth": 0
    }
}
```

| Field | Description |
|---|---|
| `requests_total` | Total requests handled since startup (excludes `/health` and `/metrics`) |
| `requests_by_status` | Breakdown by HTTP status code — fast way to spot error spikes |
| `requests_by_path` | Sorted highest-count first — shows which routes bear the most load |
| `thread_pool.active_workers` | Workers currently executing a task |
| `thread_pool.idle_workers` | Workers blocked on `queue.get()` waiting for work |
| `thread_pool.queue_depth` | Requests waiting for a free thread — non-zero = pool is undersized |
The test suite uses Python's built-in `unittest` — no external test runner needed.

```bash
python -m unittest tests/test_request.py -v
```

Expected output:
```
test_404_reason_phrase (TestHTTPResponse) ... ok
test_blank_line_before_body (TestHTTPResponse) ... ok
test_connection_close_header_present (TestHTTPResponse) ... ok
test_content_length_header_matches_body (TestHTTPResponse) ... ok
test_status_line_format (TestHTTPResponse) ... ok
test_no_exception_on_arbitrary_garbage (TestParseRequest) ... ok
test_parses_headers_into_lowercase_dict (TestParseRequest) ... ok
test_parses_http_version (TestParseRequest) ... ok
test_parses_method_from_valid_get_request (TestParseRequest) ... ok
test_parses_nested_path (TestParseRequest) ... ok
test_parses_path_from_valid_get_request (TestParseRequest) ... ok
test_parses_post_request_with_body (TestParseRequest) ... ok
test_parses_root_path (TestParseRequest) ... ok
test_returns_400_for_empty_bytes (TestParseRequest) ... ok
test_returns_400_for_garbage_bytes (TestParseRequest) ... ok
test_returns_400_for_lowercase_method (TestParseRequest) ... ok
test_returns_400_for_malformed_request_line (TestParseRequest) ... ok
test_returns_400_for_missing_slash_in_path (TestParseRequest) ... ok
test_returns_400_when_header_separator_missing (TestParseRequest) ... ok
test_404_body_contains_path (TestRouter) ... ok
test_add_route_registers_handler (TestRouter) ... ok
test_dispatches_to_registered_route (TestRouter) ... ok
test_handler_exception_returns_500 (TestRouter) ... ok
test_returns_404_for_root_when_not_registered (TestRouter) ... ok
test_returns_404_for_unknown_path (TestRouter) ... ok

Ran 25 tests in 0.018s

OK
```
| Test Suite | What is tested |
|---|---|
| `TestParseRequest` | Valid GET/POST, header normalisation, malformed input, garbage bytes |
| `TestRouter` | Dispatch, 404 fallback, 500 on handler exception, `add_route` |
| `TestHTTPResponse` | Status line format, Content-Length, blank-line separator, reason phrases |
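Tests in this style are plain `unittest` with no fixtures or mocks. The sketch below shows the shape using a hypothetical stand-in for `HTTPResponse.to_bytes()` — it is not the repo's actual test code:

```python
import unittest

def to_bytes(status: int, reason: str, body: bytes) -> bytes:
    # Hypothetical stand-in for HTTPResponse.to_bytes()
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Length: {len(body)}\r\n"
        f"\r\n"
    )
    return head.encode("ascii") + body

class TestToBytes(unittest.TestCase):
    def test_status_line_format(self):
        wire = to_bytes(200, "OK", b"")
        self.assertTrue(wire.startswith(b"HTTP/1.1 200 OK\r\n"))

    def test_content_length_header_matches_body(self):
        wire = to_bytes(200, "OK", b"hello")
        self.assertIn(b"Content-Length: 5\r\n", wire)
```

Because the serialiser and parser are pure functions of bytes, every assertion works on plain byte strings — no sockets, threads, or mocking frameworks needed.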
Every non-obvious choice in the codebase is documented. The most important ones:
```python
self._server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
```

Without this, after a crash or fast restart the OS keeps the port in `TIME_WAIT` for ~60 seconds and every restart fails with "Address already in use". Setting `SO_REUSEADDR` lets the server rebind immediately.
```python
self._server_socket.settimeout(1.0)
```

`accept()` blocks the thread indefinitely waiting for a connection. Without a timeout, Ctrl-C is silently swallowed. A 1-second timeout lets the accept loop check the `_running` flag regularly, making shutdown always responsive.
```python
client_socket.settimeout(CLIENT_TIMEOUT)  # default: 5.0 s
```

A client that opens a TCP connection but never sends data holds a worker thread hostage until it eventually disconnects — a basic form of resource exhaustion (slowloris style). The per-client timeout releases the thread after 5 seconds regardless of client behaviour.
```python
for _ in self._workers:
    self._task_queue.put(None)  # one sentinel per thread
```

Workers block on `queue.get()` at zero CPU cost when idle. Sending exactly one `None` per worker guarantees each thread wakes up and exits exactly once — even if some workers are currently processing tasks. This is simpler and more reliable than a shared `threading.Event` because it travels through the same queue as real work.
```python
with self._workers_lock:
    self._active_workers += 1
try:
    task()
finally:
    with self._workers_lock:
        self._active_workers -= 1
```

The `finally` clause guarantees the count is decremented even if the task raises. The lock window is microseconds — only the counter increment/decrement, not the task itself. This keeps lock contention negligible while giving the `/metrics` endpoint accurate live data.
```python
headers[name.strip().lower()] = value.strip()
```

RFC 7230 §3.2 states HTTP header names are case-insensitive. Normalising at parse time means every handler uses `request.headers["content-type"]` without worrying about `Content-Type` vs `CONTENT-TYPE` vs `content-type`.
```python
@dataclass
class HTTPRequest:
    error: Optional[int] = None
```

The parser never raises. A malformed request returns an `HTTPRequest` with `error=400`. This keeps error handling at the call site rather than scattered across try/except blocks in the call stack, and makes the parser trivial to unit test.
```python
server = TCPServer(host=HOST, port=PORT, connection_handler=enqueue_connection)
```

`TCPServer` knows nothing about threading or HTTP. The handler is injected at construction, so the same server can accept a synchronous handler in tests or swap the thread pool for a process pool without touching server.py.
```python
_INTERNAL_PATHS: frozenset[str] = frozenset({"/health", "/metrics"})
```

Docker and Kubernetes send a health probe every 30 seconds. Without exclusion, those probes would dwarf real application traffic in the `requests_by_path` breakdown and make the metrics meaningless. Internal paths are filtered silently at record time.
| Feature | Why omitted |
|---|---|
| Keep-Alive / persistent connections | Requires tracking `Content-Length` or `Transfer-Encoding: chunked` to find request boundaries. Significant complexity for minimal gain at this scale. |
| Chunked transfer encoding | Not needed when every response has a known size at serialisation time. |
| URL query-string parsing | Out of scope; none of the demo routes use query parameters. |
| CORS headers | Application-layer concern; belongs in handlers, not the server core. |
| HTTPS / TLS | Would require `ssl.SSLContext.wrap_socket()` and certificate management, obscuring the core HTTP concepts. |
| Async I/O (`asyncio`) | Threading is more readable for an educational implementation. Async is the right choice for 10,000+ concurrent connections — out of scope here. |
| Concept | Where |
|---|---|
| TCP socket programming | server.py |
| HTTP/1.1 wire protocol (RFC 7230 / 7231) | request.py, response.py |
| Producer-consumer concurrency | thread_pool.py |
| Thread-safe shared state with `threading.Lock` | `metrics.py`, `thread_pool.py` |
| Separation of concerns | Each module has a single responsibility |
| Dependency injection | TCPServer(connection_handler=...) |
| Graceful shutdown (OS signal handling) | main.py |
| Defensive parsing — never crash on bad input | request.py |
| Structured logging / observability | All modules via logging |
| Liveness and readiness probes | /health, /metrics endpoints |
| Type safety — full type hints | All function signatures and class variables |
| Unit testing without mocking frameworks | tests/test_request.py |
| Containerisation — non-root, `HEALTHCHECK`, `.dockerignore` | `Dockerfile` |
- RFC 7230 — HTTP/1.1 Message Syntax and Routing
- RFC 7231 — HTTP/1.1 Semantics and Content
- Python `socket` documentation
- Python `threading` documentation
- Docker best practices — non-root users
- Docker `HEALTHCHECK` reference