A Git implementation written in Rust, inspired by James Coglan’s book "Building Git". bit is intentionally educational, but it is built with production-style engineering discipline: explicit domain modeling, invariant-driven design, and a strong test suite.
bit models Git as interactions between four core areas:
- Object database (`.git/objects`): immutable content-addressed objects.
- Index (`.git/index`): staging-area metadata and next-tree blueprint.
- References (`.git/HEAD`, `refs/*`): symbolic and direct pointers into history.
- Workspace: mutable files checked out for editing.
Most command behavior can be reasoned about as transitions across these areas.
- ✅ `bit init`
- ✅ `bit hash-object`
- ✅ `bit ls-tree`
- ✅ `bit add`
- ✅ `bit commit`
- ✅ `bit status`
- ✅ `bit diff`
- ✅ `bit branch` (create/list/delete)
- ✅ `bit checkout`
- ✅ `bit log`
- ✅ `bit merge` (multi-scenario DAG merges, conflict-aware behavior)
- Supports Git object categories needed by current commands (blob/tree/commit flows).
- Serialization follows Git format: `<type> <size>\0<content>`.
- Object IDs are SHA-1 of serialized object bytes.
- Objects are immutable once written.
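
The framing rule above can be sketched in a few lines. This is a minimal illustration, not bit's actual code: `serialize_object` and `parse_object` are hypothetical names, and the SHA-1 step is left out because the Rust standard library has no SHA-1 implementation (a hashing crate would be needed for that part).

```rust
/// Frames `content` as `<type> <size>\0<content>`, the byte layout that gets hashed and stored.
/// (Hypothetical helper name, for illustration only.)
pub fn serialize_object(kind: &str, content: &[u8]) -> Vec<u8> {
    let mut bytes = format!("{} {}\0", kind, content.len()).into_bytes();
    bytes.extend_from_slice(content);
    bytes
}

/// Splits a serialized object back into its type and content.
/// Returns `None` if the header is malformed or the size field disagrees with the content.
pub fn parse_object(bytes: &[u8]) -> Option<(&str, &[u8])> {
    let nul = bytes.iter().position(|&b| b == 0)?;
    let header = std::str::from_utf8(&bytes[..nul]).ok()?;
    let (kind, size) = header.split_once(' ')?;
    let content = &bytes[nul + 1..];
    (size.parse::<usize>().ok()? == content.len()).then_some((kind, content))
}
```

Under these assumptions, `serialize_object("blob", b"hello\n")` yields the bytes `blob 6\0hello\n`, and the object ID would be the SHA-1 of exactly those bytes.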
- Entries are deterministically ordered.
- Header entry count reflects serialized entries.
- Index checksum verifies integrity.
- Parent/child path conflicts are normalized when replacing file/dir shapes.
- Index read/write uses locking semantics to maintain consistency under concurrent operations.
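
A rough sketch of three of these invariants (ordering, header count, and file/dir shape normalization) follows. The `Index` type here is a hypothetical in-memory stand-in that maps paths to object IDs; real entries carry stat metadata, and the trailing checksum and locking are omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory index: path -> object id, kept sorted by path bytes.
#[derive(Default)]
pub struct Index {
    entries: BTreeMap<String, String>,
}

impl Index {
    /// Stages `path`, first discarding entries that conflict with its file/dir shape.
    pub fn add(&mut self, path: &str, oid: &str) {
        // A file at `a/b/c` cannot coexist with files at `a` or `a/b`.
        for (i, _) in path.match_indices('/') {
            self.entries.remove(&path[..i]);
        }
        // A file at `a` cannot coexist with anything stored under `a/`.
        let prefix = format!("{path}/");
        self.entries.retain(|p, _| !p.starts_with(prefix.as_str()));
        self.entries.insert(path.to_string(), oid.to_string());
    }

    /// Paths in serialization order (bytewise sorted).
    pub fn paths(&self) -> Vec<&str> {
        self.entries.keys().map(String::as_str).collect()
    }

    /// The 12-byte index header: "DIRC", version 2, entry count (big-endian).
    pub fn header(&self) -> Vec<u8> {
        let mut out = b"DIRC".to_vec();
        out.extend_from_slice(&2u32.to_be_bytes());
        out.extend_from_slice(&(self.entries.len() as u32).to_be_bytes());
        out
    }
}
```

Using a `BTreeMap` makes the ordering invariant structural rather than something each writer has to remember, and deriving the header count from the map keeps it in step with the serialized entries.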
- Branch and revision parsing supports common forms (`ref`, `^`, `~n`, aliases).
- Branch name validation follows Git-like constraints and explicit parser rules.
- HEAD and refs are managed as first-class repository state.
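
As an illustration of the revision forms listed above, here is a small recursive parser. The `Revision` enum and `parse_revision` are hypothetical names, the `@` alias for `HEAD` is an assumption about which aliases are supported, and ranges and exclusions are not covered.

```rust
/// Hypothetical revision AST covering names, `<rev>^`, and `<rev>~<n>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Revision {
    Ref(String),
    Parent(Box<Revision>),
    Ancestor(Box<Revision>, u32),
}

/// Parses suffix operators right to left, so `HEAD~2^` becomes Parent(Ancestor(HEAD, 2)).
pub fn parse_revision(input: &str) -> Option<Revision> {
    if let Some(base) = input.strip_suffix('^') {
        return Some(Revision::Parent(Box::new(parse_revision(base)?)));
    }
    if let Some((base, count)) = input.rsplit_once('~') {
        // Only plain decimal counts are accepted in this sketch.
        if count.is_empty() || !count.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        let n: u32 = count.parse().ok()?;
        return Some(Revision::Ancestor(Box::new(parse_revision(base)?), n));
    }
    match input {
        "" => None,
        // Assumed alias: `@` stands for `HEAD`.
        "@" => Some(Revision::Ref("HEAD".to_string())),
        name => Some(Revision::Ref(name.to_string())),
    }
}
```

Keeping parsing separate from resolution means the grammar can be tested without a repository on disk; resolving a `Revision` to an object ID would be a second step against the refs and object database.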
- Merge behavior is validated on multiple non-trivial commit DAGs.
- Best common ancestor scenarios and branching edge-cases are covered by integration tests.
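
To make "best common ancestor" concrete, the sketch below computes it the naive way: collect the ancestors of both commits, intersect them, and drop any common ancestor that is itself an ancestor of another common ancestor. This is a deliberately simple set-based formulation for illustration, not necessarily the traversal bit uses; the graph type (commit ID mapped to parent IDs) and function names are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Every commit reachable from `start` via parent links, including `start` itself.
fn ancestors<'a>(graph: &HashMap<&'a str, Vec<&'a str>>, start: &'a str) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if seen.insert(id) {
            if let Some(parents) = graph.get(id) {
                stack.extend(parents.iter().copied());
            }
        }
    }
    seen
}

/// Common ancestors of `a` and `b` that are not ancestors of another common ancestor.
pub fn best_common_ancestors<'a>(
    graph: &HashMap<&'a str, Vec<&'a str>>,
    a: &'a str,
    b: &'a str,
) -> BTreeSet<&'a str> {
    let from_a = ancestors(graph, a);
    let from_b = ancestors(graph, b);
    let common: BTreeSet<&str> = from_a.intersection(&from_b).copied().collect();
    common
        .iter()
        .copied()
        .filter(|&c| {
            // `c` is redundant if some other common ancestor can reach it.
            !common
                .iter()
                .any(|&other| other != c && ancestors(graph, other).contains(c))
        })
        .collect()
}
```

In a criss-cross history, where two merge commits each have the same two branch tips as parents, this returns both tips, which is the multi-BCA case the integration tests are meant to cover.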
- Diff endpoints are explicit (`workspace↔index`, `index↔HEAD`, `rev↔rev`).
- File-level status (`A`, `D`, `M`, mode-only changes) is deterministic.
- Patch hunks should remain stable in ordering and context output semantics.
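
A minimal sketch of deterministic file-level status between two endpoints, assuming each endpoint has already been flattened to a sorted map of path to `(mode, object id)`. The `Tree` alias and `name_status` function are hypothetical; a mode-only change is reported as `M` here because the `(mode, oid)` pair differs.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened tree: path -> (mode, object id).
pub type Tree = BTreeMap<String, (u32, String)>;

/// Compares two flattened trees and reports `A`, `D`, or `M` per path, sorted by path.
pub fn name_status(old: &Tree, new: &Tree) -> Vec<(char, String)> {
    let mut paths: Vec<&String> = old.keys().chain(new.keys()).collect();
    paths.sort();
    paths.dedup();
    paths
        .into_iter()
        .filter_map(|path| {
            let status = match (old.get(path), new.get(path)) {
                (None, Some(_)) => 'A',
                (Some(_), None) => 'D',
                // Covers content changes and mode-only changes alike.
                (Some(a), Some(b)) if a != b => 'M',
                _ => return None,
            };
            Some((status, path.clone()))
        })
        .collect()
}
```

Because the output order comes from sorting paths rather than from hash-map iteration, repeated runs over the same inputs produce identical listings.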
- Traversal honors included and excluded revision expressions.
- Commit output ordering should be deterministic, including tie-breaking for identical timestamps.
- Decorations are a read-only projection of current refs over commits.
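
One way to get deterministic ordering with timestamp ties is to sort on a compound key. The sketch below orders newest first and breaks ties by object ID; the tie-breaking rule and the `CommitInfo` type are assumptions for illustration, and bit may use a different secondary key.

```rust
/// Hypothetical minimal commit record for ordering purposes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CommitInfo {
    pub oid: String,
    pub timestamp: i64,
}

/// Newest first; commits sharing a timestamp are ordered by object id (assumed rule).
pub fn sort_for_log(commits: &mut [CommitInfo]) {
    commits.sort_by(|a, b| {
        b.timestamp
            .cmp(&a.timestamp)
            .then_with(|| a.oid.cmp(&b.oid))
    });
}
```

Any total order works as the secondary key; what matters is that two commits never compare as equal unless they are the same commit.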
- Checkout is a controlled migration between revisions across refs, index, and workspace.
- Safety checks should prevent silent clobbering of local/staged work.
- Success means workspace/index are synchronized to target tree semantics.
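
The clobbering check can be pictured as a comparison of three snapshots. In this simplified sketch (hypothetical names, staged content only, no workspace or untracked-file checks), a path is a conflict when checkout would change it and the index holds something that matches neither the current nor the target revision.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical flattened snapshot: path -> object id.
pub type Snapshot = BTreeMap<String, String>;

/// Paths that checkout would touch while the index holds content
/// matching neither the current nor the target snapshot.
pub fn staged_conflicts(current: &Snapshot, target: &Snapshot, index: &Snapshot) -> Vec<String> {
    // Paths whose content differs between the two revisions.
    let touched: BTreeSet<&String> = current
        .keys()
        .chain(target.keys())
        .filter(|path| current.get(*path) != target.get(*path))
        .collect();
    touched
        .into_iter()
        .filter(|path| {
            let staged = index.get(*path);
            staged != current.get(*path) && staged != target.get(*path)
        })
        .cloned()
        .collect()
}
```

An empty result means the migration can proceed for staged content; a non-empty result lists the paths to report before anything on disk is modified.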
- Merkle DAG for commit/history integrity and reachability.
- Tree/index path hierarchy (trie-like) for parent/child path operations.
- Revision AST to represent references, parent (`^`), ancestor (`~n`), ranges, and exclusions.
- Myers-style diff baseline for minimal edit script patch construction (future enhancements may refine heuristics).
- Binary codec discipline for future wire protocol framing and pack parsing.
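
For the Myers-style baseline mentioned above, the core idea fits in a short function. This sketch runs only the forward pass of Myers' algorithm and returns the length of the shortest edit script (insertions plus deletions); it does not backtrack to recover the edits or build hunks. The function name is hypothetical.

```rust
/// Length of the shortest edit script (insertions + deletions) turning `a` into `b`,
/// using the forward pass of Myers' greedy algorithm.
pub fn shortest_edit<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let (n, m) = (a.len() as isize, b.len() as isize);
    let max = a.len() + b.len();
    let offset = max as isize;
    // v[k + offset] holds the furthest x reached on diagonal k (where k = x - y).
    let mut v = vec![0isize; 2 * max + 2];
    for d in 0..=(max as isize) {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Step down (insertion) or right (deletion), whichever reaches further.
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the diagonal while elements match.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    max
}
```

For example, `shortest_edit(&["a", "b", "c"], &["a", "c"])` is 1 (delete `b`). Producing actual patches would additionally require recording each round's `v` and walking it backwards.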
This project emphasizes:
- idiomatic ownership and borrowing,
- explicit `Result`-based error handling,
- deterministic filesystem interaction,
- lock-aware repository mutations,
- incremental changes with strong test coverage.
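
As an illustration of lock-aware mutation with `Result`-based errors, here is a sketch of the conventional `<file>.lock` pattern: create the lock file exclusively, write the new content into it, then rename it over the target. The `Lockfile` type and its method names are hypothetical and only use the standard library.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical lock guard for a single repository file.
pub struct Lockfile {
    target: PathBuf,
    lock_path: PathBuf,
    file: File,
}

impl Lockfile {
    /// Takes the lock by creating `<target>.lock`.
    /// Fails with `ErrorKind::AlreadyExists` if another writer holds it.
    pub fn acquire(target: &Path) -> io::Result<Self> {
        let mut name = target.as_os_str().to_owned();
        name.push(".lock");
        let lock_path = PathBuf::from(name);
        let file = OpenOptions::new()
            .write(true)
            .create_new(true)
            .open(&lock_path)?;
        Ok(Self { target: target.to_path_buf(), lock_path, file })
    }

    /// Appends bytes to the pending content.
    pub fn write(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.file.write_all(bytes)
    }

    /// Publishes the new content by renaming the lock file over the target.
    pub fn commit(self) -> io::Result<()> {
        let Lockfile { target, lock_path, file } = self;
        file.sync_all()?;
        drop(file); // close the handle before renaming
        fs::rename(&lock_path, &target)
    }

    /// Abandons the pending content and releases the lock.
    pub fn rollback(self) -> io::Result<()> {
        fs::remove_file(&self.lock_path)
    }
}
```

Readers never observe a half-written file under this scheme, because the target only changes through the final rename.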
src/
├── main.rs # CLI interface and command routing
├── commands/ # Command implementations
│ ├── plumbing/ # Low-level Git commands
│ └── porcelain/ # User-facing commands
├── areas/ # Repository areas: database/index/refs/workspace
└── artifacts/ # Domain objects and algorithms (objects, diff, merge, log, status)
- Rust 1.93 or later
- Cargo
cargo build --release

Binary path:
target/release/bit

# initialize repository
bit init [path]
# write or hash objects
bit hash-object [-w] <file>
bit ls-tree [-r] <tree-sha>
# staging + commits
bit add <path>...
bit commit -m "message"
# inspect state
bit status [--porcelain]
bit diff [--cached] [--name-status] [--diff-filter=ADMR] [old] [new]
bit log [targets...] [-- --paths] [--oneline] [--abbrev-commit] [--decorate=<none|short|full>] [--patch]
# branch / checkout / merge
bit branch create <name> [source]
bit branch list [-v]
bit branch delete <name>... [-f]
bit checkout <target-revision>
bit merge <target-revision> -m "merge message"

The test suite combines unit, property-based, and integration tests to validate Git semantics.

mkdir -p ../playground
cargo test
cargo test -- --nocapture

Targeted examples:
cargo test add
cargo test commit
cargo test diff
cargo test log
cargo test merge

- Write a failing test from behavior/spec.
- Implement minimum code to pass.
- Refactor while preserving green tests.
Use for pure logic and parsers:
- revision syntax,
- branch-name validation,
- object payload formatting/parsing,
- small deterministic transforms.
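
Branch-name validation is a good example of the kind of pure function that suits unit tests. The sketch below checks a subset of Git's ref-name rules; `is_valid_branch_name` is a hypothetical name, and the exact rule set bit enforces may differ.

```rust
/// Checks a branch name against a subset of Git's ref-name rules.
pub fn is_valid_branch_name(name: &str) -> bool {
    const FORBIDDEN: &[char] = &[' ', '~', '^', ':', '?', '*', '[', '\\'];
    !name.is_empty()
        && name != "@"
        && !name.starts_with('-')
        && !name.starts_with('/')
        && !name.ends_with('/')
        && !name.ends_with('.')
        && !name.contains("..")
        && !name.contains("//")
        && !name.contains("@{")
        // No path component may start with a dot or end with `.lock`.
        && !name
            .split('/')
            .any(|part| part.starts_with('.') || part.ends_with(".lock"))
        && !name
            .chars()
            .any(|c| c.is_ascii_control() || FORBIDDEN.contains(&c))
}
```

A unit test for it is then a list of accepted names (`feature/login`) and rejected ones (`feature..login`, `.hidden`, `topic.lock`, `bad name`), each asserted individually so a failure points at one rule.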
Use for invariant-heavy behavior:
- valid/invalid revision and branch grammar,
- parser determinism,
- boundary input spaces.
Store generated failing cases in proptest-regressions/ when applicable.
Use command-level scenarios under tests/:
- compare command output and state transitions,
- cover merge graph topologies,
- validate index/workspace/ref interactions,
- compare against `git` where practical.
- Merge: fast-forward eligibility, criss-cross BCAs, multi-parent lineage ordering, conflict marker persistence.
- Diff: endpoint correctness, mode-only changes, multi-file hunk stability, deterministic status ordering.
- Log: include/exclude revision expressions, path-filtered history, stable ordering with timestamp ties, decoration rendering variants.
- pkt-line codec round-trips and malformed-frame rejection.
- negotiation state-machine tests for `want/have/ack` flows.
- pack decoding tests for full objects and chained deltas.
- object-integrity checks after unpack + write.
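
Since none of the transport layer exists yet, the following is only a sketch of what a pkt-line codec under test might look like. It relies on the documented framing (four hex digits giving the total length including the prefix, `0000` for flush, `0001` for delim); the type and function names are hypothetical, and the protocol v2 response-end packet is not modeled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Packet {
    Flush,
    Delim,
    Data(Vec<u8>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum PktError {
    Truncated,
    BadLength,
}

/// Largest payload a single pkt-line may carry.
const MAX_DATA: usize = 65516;

/// Encodes one packet; empty or oversized payloads are rejected.
pub fn encode(packet: &Packet) -> Result<Vec<u8>, PktError> {
    match packet {
        Packet::Flush => Ok(b"0000".to_vec()),
        Packet::Delim => Ok(b"0001".to_vec()),
        Packet::Data(data) if data.is_empty() || data.len() > MAX_DATA => Err(PktError::BadLength),
        Packet::Data(data) => {
            let mut out = format!("{:04x}", data.len() + 4).into_bytes();
            out.extend_from_slice(data);
            Ok(out)
        }
    }
}

/// Decodes one packet from the front of `input`, returning it with the unread remainder.
pub fn decode(input: &[u8]) -> Result<(Packet, &[u8]), PktError> {
    if input.len() < 4 {
        return Err(PktError::Truncated);
    }
    let (prefix, rest) = input.split_at(4);
    // Reject anything but hex digits up front (this also rules out a leading sign).
    if !prefix.iter().all(u8::is_ascii_hexdigit) {
        return Err(PktError::BadLength);
    }
    let text = std::str::from_utf8(prefix).map_err(|_| PktError::BadLength)?;
    let len = usize::from_str_radix(text, 16).map_err(|_| PktError::BadLength)?;
    match len {
        0 => Ok((Packet::Flush, rest)),
        1 => Ok((Packet::Delim, rest)),
        2..=4 => Err(PktError::BadLength),
        _ if len > MAX_DATA + 4 => Err(PktError::BadLength),
        _ if len - 4 > rest.len() => Err(PktError::Truncated),
        _ => {
            let (data, rest) = rest.split_at(len - 4);
            Ok((Packet::Data(data.to_vec()), rest))
        }
    }
}
```

A round-trip test encodes a packet, decodes the result, and expects the original back with an empty remainder; malformed-frame tests feed prefixes such as `00zz` or a length that overruns the buffer and expect an error rather than a partial packet.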
The roadmap is organized by domain themes that mirror the book’s progression from low-level object mechanics to higher-level collaboration/storage concerns.
- Initialize repository structure
- Hash/write loose objects
- Read/tree-walk object structures
- Additional plumbing introspection and validation commands
- Stage files and directory trees
- Handle file/dir replacement conflicts
- Persist and validate index metadata/checksum
- Interactive staging (`add -p`)
- Write commits and parent relationships
- Traverse history for `log`
- Revision expressions and branch-based targeting
- Extended ancestry/query expressions parity
- `status` for staged/unstaged/untracked states
- `diff` for workspace/index/commit comparisons
- Patch-oriented log output
- More advanced diff heuristics and rename/copy tracking
- Branch create/list/delete
- Checkout with ref/symbolic-ref behavior
- Merge for complex DAG scenarios (including multi-BCA patterns)
- More complete conflict resolution UX
- Rebase/cherry-pick workflows
- Clone/fetch/push/pull protocols
- Packfiles and delta compression
- Reflog, hooks, and GC lifecycle tooling
- Transport handshake/capabilities: model protocol v2 style negotiation (capabilities, command selection, shallow/filter options where relevant).
- Object negotiation: exchange `want/have` sets and ACK states to minimize transfer.
- Pack stream framing: use pkt-line length-prefixed binary records with flush/delim packets.
- Pack data model: support object entry headers, base/delta representations (`OFS_DELTA`, `REF_DELTA`), and integrity verification at stream/repo boundaries.
- Apply/delta pipeline: decode + resolve base chains + reconstruct canonical objects before persistence.
- Safety invariants: never accept malformed framing/checksums silently; keep object graph/hash checks explicit.
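
As a concrete taste of the pack data model, the sketch below decodes a pack entry header: the first byte carries a continuation bit, a 3-bit type, and the low 4 bits of the inflated size, and each following byte contributes 7 more size bits. This is planned work, so the names are hypothetical; the delta base reference that follows an `OFS_DELTA` or `REF_DELTA` header is not parsed.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum EntryKind {
    Commit,
    Tree,
    Blob,
    Tag,
    OfsDelta,
    RefDelta,
}

/// Decodes the type and inflated size at the start of a pack entry.
/// Returns `(kind, size, bytes_consumed)`, or `None` on malformed or truncated input.
pub fn parse_entry_header(input: &[u8]) -> Option<(EntryKind, u64, usize)> {
    let first = *input.first()?;
    let kind = match (first >> 4) & 0x07 {
        1 => EntryKind::Commit,
        2 => EntryKind::Tree,
        3 => EntryKind::Blob,
        4 => EntryKind::Tag,
        6 => EntryKind::OfsDelta,
        7 => EntryKind::RefDelta,
        _ => return None, // 0 and 5 are not valid entry types
    };
    let mut size = u64::from(first & 0x0f);
    let mut shift = 4;
    let mut used = 1;
    let mut more = (first & 0x80) != 0;
    while more {
        let byte = *input.get(used)?;
        if shift > 57 {
            return None; // size would not fit in 64 bits
        }
        size |= u64::from(byte & 0x7f) << shift;
        shift += 7;
        used += 1;
        more = (byte & 0x80) != 0;
    }
    Some((kind, size, used))
}
```

Returning `None` for reserved types, truncated input, and oversized lengths follows the safety invariant above: malformed framing is rejected explicitly instead of being decoded into something plausible.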
- No remote/network protocol support yet.
- Packed object storage and delta compression are not implemented.
- History-rewriting workflows (rebase/cherry-pick) are pending.
- Merge UX can be improved for richer conflict presentation/resolution workflows.
- Start from a failing test.
- Implement minimally.
- Keep changes small and focused.
- Ensure formatting and tests pass.
- Update README/instructions if behavior changes.
- James Coglan for Building Git.
- The Git project for behavioral reference.
- The Rust ecosystem for excellent tooling.