A Rust implementation of a BitTorrent Mainline DHT node (BEP 5 + BEP 42),
intended for deployment on a Linux server with a public IPv4 address. The node
does not download or upload torrent content — it participates as a pure
DHT citizen: serving ping, find_node, get_peers, and announce_peer,
maintaining a routing table, handing out tokens, and helping the swarm.
- Sans-IO core. The DHT state machine (engine::Engine) has no networking dependencies. It receives (src, bytes, now) and tick events and returns bounded lists of Actions (SendPacket { dst, bytes }). This makes it trivially testable without touching real sockets.
- Tokio I/O driver. A thin net::run adapter wraps the engine in a single UdpSocket + tokio::select! loop over recv / 1s tick / shutdown signal.
- Codegen from JSON. protocol/messages.json declares the inner shape of each KRPC method (args + response). build.rs reads that schema and emits OUT_DIR/generated.rs containing Rust structs and Method / Query / TypedResponse enums via proc-macro2 + quote + prettyplease. The KRPC envelope itself is hand-written because responses don't carry the method name on the wire; the engine resolves them through outstanding-transaction context.
- In-memory routing table. Standard 160-bit XOR k-buckets, k=8, with a bounded replacement cache per bucket and lazy splitting along the path that contains our own node id. Per-node Good/Questionable/Bad classification per BEP 5. State is snapshotted to disk periodically (atomic temp-then-rename).
- BEP 42 node IDs. When a public IPv4 is supplied, the self-id is derived from crc32c of the masked address (ip & mask), with (rand & 0x07) << 5 ORed into its first octet. Strict mode (off by default) refuses to insert non-conforming node ids into the routing table.
- No real-network tests. All tests run offline: protocol round-trips, BEP 42 spec vectors, routing table behaviour, and engine scenarios driven with synthetic packets and a manual clock.
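
The BEP 42 derivation described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code in id.rs: the function names `crc32c`, `bep42_node_id`, and `is_bep42_conforming` are hypothetical, the CRC is a slow bitwise version, and the `filler` parameter stands in for the random bytes a real node would generate.

```rust
use std::net::Ipv4Addr;

/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Slow but dependency-free; a real node would likely use a table or a crate.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Hypothetical helper: derive a BEP 42 style id for an IPv4 address.
/// `rand` supplies the 3 seed bits and becomes the last id byte;
/// `filler` supplies every byte the spec leaves free (random in practice).
fn bep42_node_id(ip: Ipv4Addr, rand: u8, filler: [u8; 20]) -> [u8; 20] {
    const MASK: [u8; 4] = [0x03, 0x0f, 0x3f, 0xff];
    let mut masked = ip.octets();
    for (octet, mask) in masked.iter_mut().zip(MASK.iter()) {
        *octet &= *mask;
    }
    // The 3 seed bits go into the top of the first octet.
    masked[0] |= (rand & 0x07) << 5;
    let crc = crc32c(&masked);

    let mut id = filler;
    // Only the top 21 bits of the CRC are fixed by the spec.
    id[0] = (crc >> 24) as u8;
    id[1] = (crc >> 16) as u8;
    id[2] = (((crc >> 8) as u8) & 0xf8) | (filler[2] & 0x07);
    id[19] = rand;
    id
}

/// Hypothetical strict-mode check: do the first 21 bits match the derivation?
fn is_bep42_conforming(ip: Ipv4Addr, id: &[u8; 20]) -> bool {
    let expected = bep42_node_id(ip, id[19], *id);
    expected[0] == id[0] && expected[1] == id[1] && (expected[2] & 0xf8) == (id[2] & 0xf8)
}
```

Passing the random bytes in as arguments, rather than calling an RNG inside the function, keeps the derivation deterministic, which is what lets it be checked against the published BEP 42 vectors (for example 124.31.75.21 with rand = 1).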
cargo build --release
./target/release/dht-rs --bind 0.0.0.0:6881 --public-ip <YOUR_PUBLIC_IP>

CLI:
dht-rs --help
--bind <SOCKET>          UDP bind addr (default 0.0.0.0:6881)
--bootstrap <HOST:PORT>  Override bootstrap nodes (repeatable)
--state-file <PATH>      Persistent state path (default ./dht-state.json)
--public-ip <IPV4>       Public IPv4 hint for BEP 42 id derivation
--strict-bep42           Reject non-BEP-42 node ids
--log <FILTER>           tracing-subscriber filter (default dht_rs=info)
--no-bootstrap           Don't resolve/contact bootstrap nodes
Logs are emitted via the tracing crate. The default filter is dht_rs=info;
override with --log dht_rs=debug or RUST_LOG=dht_rs=debug.
dht-rs/
├── build.rs # JSON → Rust codegen
├── protocol/messages.json # KRPC inner-message schema (source of truth)
├── src/
│ ├── lib.rs # module re-exports
│ ├── main.rs # CLI + tokio runtime + persistence wiring
│ ├── cli.rs # clap definitions
│ ├── id.rs # NodeId, InfoHash, distance, BEP 42
│ ├── routing/mod.rs # k-buckets, replacement cache, refresh
│ ├── protocol/
│ │ ├── mod.rs # includes generated.rs + submodules
│ │ ├── envelope.rs # KRPC envelope (hand-written)
│ │ └── compact.rs # compact node/peer (de)serialization
│ ├── engine/
│ │ ├── mod.rs # sans-IO Engine state machine
│ │ ├── lookup.rs # iterative find_node lookups
│ │ ├── tokens.rs # rotating token store
│ │ ├── peers.rs # info-hash → peers store
│ │ └── transactions.rs # outstanding transactions
│ ├── net.rs # tokio UDP driver
│ └── state.rs # JSON snapshot persistence
└── tests/
    ├── protocol_roundtrip.rs # bencode round-trips
    ├── bep42.rs # BEP 42 spec vectors
    └── engine_scenarios.rs # synthetic-packet engine tests
cargo test

All tests are offline:
- protocol_roundtrip: every query/response/error variant decodes and re-encodes round-trip-cleanly, including envelope-level v, ip, and ro.
- bep42: node-id derivation against the BEP 42 spec test vectors.
- engine_scenarios: synthetic packets driven through one Engine with a manual Instant: ping reply, get_peers returns nodes when no peers, announce_peer rejects bad tokens, full get_peers → announce → get_peers loop returns the announced peer (with implied_port=true), malformed packets yield no actions, unknown methods reply with KRPC error 204, and bootstrap emits an outgoing find_node for our own id.
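
To illustrate why a sans-IO design makes such scenario tests cheap, here is a toy sketch of the pattern. It is not the real engine::Engine: `ToyEngine`, `handle_packet`, and `is_stale` are hypothetical names, and the "protocol" is just an echo. Only the shape follows the description above: packets and the current time go in, a list of actions comes out.

```rust
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Side effects the engine asks its driver to perform.
#[derive(Debug, PartialEq)]
enum Action {
    SendPacket { dst: SocketAddr, bytes: Vec<u8> },
}

/// Hypothetical stand-in for an engine: no sockets, no system clock.
struct ToyEngine {
    last_seen: Option<Instant>,
    stale_after: Duration,
}

impl ToyEngine {
    fn new(stale_after: Duration) -> Self {
        ToyEngine { last_seen: None, stale_after }
    }

    /// Echo non-empty packets back to the sender; treat empty ones as
    /// malformed and produce no actions.
    fn handle_packet(&mut self, src: SocketAddr, bytes: &[u8], now: Instant) -> Vec<Action> {
        if bytes.is_empty() {
            return Vec::new();
        }
        self.last_seen = Some(now);
        vec![Action::SendPacket { dst: src, bytes: bytes.to_vec() }]
    }

    /// Time is always passed in, so a test can move the clock by hand.
    fn is_stale(&self, now: Instant) -> bool {
        match self.last_seen {
            Some(seen) => now.duration_since(seen) > self.stale_after,
            None => true,
        }
    }
}
```

A test takes one `Instant::now()` as its epoch and adds `Duration`s to it, so a 15-minute timeout can be exercised without sleeping and without opening a socket.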
See the in-source comments and engine::Limits for tunables (bounded
in-flight transactions, peer caps, action budget, etc.).
Open simplifications worth knowing about:
- IPv4 only. IPv6 nodes are silently ignored.
- The token store keys tokens on source IP (not port) to tolerate NAT rebinding. Secret rotates every 5 minutes with a 5-minute grace window.
- Snapshots (every 5 minutes by default in net::run, plus once on graceful shutdown) only persist nodes currently classified as Good.
- No external-IP voting yet: if you start without --public-ip, the self-id is purely random (won't be BEP 42-conforming).
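
The token behaviour noted in the list above (keyed on source IP only, 5-minute rotation, 5-minute grace) can be sketched as follows. This is an assumption-laden illustration, not the contents of tokens.rs: `TokenStore`, `maybe_rotate`, and `mac` are hypothetical names, secrets are passed in by the caller, and `DefaultHasher` is a non-cryptographic stand-in for whatever keyed hash the real store uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;
use std::time::{Duration, Instant};

const ROTATE_EVERY: Duration = Duration::from_secs(5 * 60);

/// Hypothetical rotating token store with one level of grace.
struct TokenStore {
    current: u64,
    previous: u64,
    rotated_at: Instant,
}

impl TokenStore {
    fn new(secret: u64, now: Instant) -> Self {
        TokenStore { current: secret, previous: secret, rotated_at: now }
    }

    /// Swap in `fresh_secret` once the rotation interval has elapsed.
    /// The old secret is kept as `previous`, which gives the grace window.
    fn maybe_rotate(&mut self, fresh_secret: u64, now: Instant) {
        if now.duration_since(self.rotated_at) >= ROTATE_EVERY {
            self.previous = self.current;
            self.current = fresh_secret;
            self.rotated_at = now;
        }
    }

    /// The token depends on the IP only, so a changed source port still validates.
    fn issue(&self, ip: IpAddr) -> [u8; 8] {
        mac(self.current, ip)
    }

    fn validate(&self, ip: IpAddr, token: &[u8]) -> bool {
        token == &mac(self.current, ip)[..] || token == &mac(self.previous, ip)[..]
    }
}

/// Stand-in keyed hash; NOT cryptographically secure.
fn mac(secret: u64, ip: IpAddr) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    secret.hash(&mut hasher);
    ip.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}
```

Because nothing is stored per token, memory use stays constant no matter how many get_peers queries arrive; the trade-off is that a token's usable lifetime falls somewhere between one and two rotation intervals, depending on when it was issued.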