DiskDBLite

Overview

DiskDBLite is a from-scratch, disk-backed database engine built to understand how real databases store, index, and retrieve data at the byte and page level. The project focuses on core systems concepts such as B-Tree indexing, slotted page layouts, and SQL parsing — without relying on existing database libraries.

Features

Disk-backed storage - Data persists to disk using fixed-size 4KB pages
B-Tree indexing - Efficient O(log n) key lookups with automatic page splitting
SQL interface - Supports CREATE TABLE, INSERT, and SELECT statements
Interactive shell - REPL for executing SQL commands
Persistence - Database survives restarts with full data integrity
Metrics harness - Non-interactive benchmark and restart-time scripts with JSON output

Quick Start

Build

mvn clean package

Run Interactive Shell

# Using Maven
mvn exec:java -Dexec.mainClass="com.diskdblite.Main" -Dexec.args="mydb.db"

# Or using the JAR directly
java -jar target/diskdblite-1.0-SNAPSHOT.jar mydb.db

Run Metrics Harness

# Non-interactive benchmark (writes artifacts/bench/<timestamp>.json)
./scripts/bench.sh

# Restart-time probe (writes artifacts/recovery/<timestamp>.json)
./scripts/recovery.sh

Detailed metric definitions and reproducible commands are documented in docs/metrics.md.

Example Session

╔═══════════════════════════════════════════════════════╗
║         DiskDBLite v1.0.0 - Disk-Backed Database      ║
╚═══════════════════════════════════════════════════════╝

Created new database: mydb.db

Type 'help' for commands, 'exit' to quit.

diskdb> CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR, email VARCHAR);
Table 'users' created.
Time: 5ms

diskdb> INSERT INTO users VALUES (1, 'Alice', 'alice@example.com');
1 row inserted.
Time: 2ms

diskdb> INSERT INTO users VALUES (2, 'Bob', 'bob@example.com');
1 row inserted.
Time: 1ms

diskdb> SELECT * FROM users;
id | name | email
------------------
1 | Alice | alice@example.com
2 | Bob | bob@example.com
(2 row(s))
Time: 1ms

diskdb> SELECT * FROM users WHERE id = 1;
id | name | email
------------------
1 | Alice | alice@example.com
(1 row(s))
Time: 0ms

diskdb> exit
Goodbye!

Removing a database

The database is a single file on disk (e.g. mydb.db). There is no in-app command to drop or clear it. To remove a database:

Exit the shell (exit or quit) so the file is not in use.
Delete the file (e.g. rm mydb.db in a terminal, or delete it in your file manager).

Supported SQL

CREATE TABLE

CREATE TABLE tablename (
    column1 INT PRIMARY KEY,
    column2 VARCHAR,
    column3 INTEGER
)

INSERT

INSERT INTO tablename VALUES (1, 'text', 42)

SELECT

-- Select all rows
SELECT * FROM tablename

-- Select with WHERE clause
SELECT * FROM tablename WHERE column = value

Shell Commands

Command	Description
`help`	Show help
`tables`	List all tables
`describe <table>`	Show table schema
`exit`	Exit the shell

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        SQL Shell                            │
│                      (Main.java)                            │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      SQL Engine                             │
│                    (sql package)                            │
│         Lexer → Parser → Statement Objects                  │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Query Executor                           │
│                   (query package)                           │
│              High-level CRUD operations                     │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                       Catalog                               │
│                  (catalog package)                          │
│              Table metadata management                      │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                        B-Tree                               │
│                   (btree package)                           │
│          Key-value storage with indexing                    │
│     BTree ← BTreeLeafPage / BTreeInternalPage               │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Storage Layer                          │
│                  (storage package)                          │
│   Page → SlottedPage → FileManager → Disk                   │
└─────────────────────────────────────────────────────────────┘

Project Structure

src/main/java/com/diskdblite/
├── Main.java                    # Interactive SQL shell
├── bench/
│   ├── BenchRunner.java         # Non-interactive benchmark runner
│   └── RestartProbe.java        # Single-query restart-time probe
├── storage/
│   ├── Constants.java           # PAGE_SIZE, etc.
│   ├── FileManager.java         # Low-level file I/O
│   ├── Page.java                # 4KB page abstraction
│   ├── PageType.java            # Page type enum
│   └── SlottedPage.java         # Variable-length record storage
├── btree/
│   ├── BTree.java               # B-Tree index structure
│   ├── BTreeLeafPage.java       # Leaf nodes (store data)
│   └── BTreeInternalPage.java   # Internal nodes (routing)
├── catalog/
│   ├── Catalog.java             # Table registry
│   ├── TableInfo.java           # Table metadata
│   ├── ColumnInfo.java          # Column metadata
│   ├── ColumnType.java          # INT, VARCHAR types
│   └── Row.java                 # Row representation
├── query/
│   ├── QueryExecutor.java       # Execute operations
│   └── QueryResult.java         # Query results
└── sql/
    ├── Lexer.java               # SQL tokenizer
    ├── Token.java               # Token types
    ├── Parser.java              # SQL parser
    ├── Statement.java           # AST nodes
    ├── ParseException.java      # Parse errors
    └── SQLEngine.java           # Parsing + execution

scripts/
├── bench.sh                     # Build + benchmark runner wrapper
└── recovery.sh                  # Build + restart-time probe wrapper

docs/
└── metrics.md                   # Metric definitions and reproducibility guide

Design Principles

Disk-backed storage - All data persists to disk in 4KB pages
Explicit page layout - Clear byte-level serialization
Separation of concerns - Each layer has a single responsibility
Correctness over performance - Readable code, not optimized
Minimal features - INSERT and SELECT only; UPDATE and DELETE are omitted because they would require altering the B-tree (key removal, rebalancing) and significantly complicate the implementation
No concurrency - Single-threaded, no locks needed
No transactions - No ACID guarantees beyond durability

Running Tests

# Run all tests
mvn test

# Run specific test class
mvn test -Dtest=IntegrationTest

# Run with output
mvn test -Dtest=IntegrationTest -q

Running Benchmarks

# Standard benchmark run
./scripts/bench.sh

# 5 benchmark trials
for i in 1 2 3 4 5; do ./scripts/bench.sh; done

# Restart-time measurement
./scripts/recovery.sh

Key Concepts Demonstrated

B-Tree indexing with automatic page splitting
Slotted page layout for variable-length records
Serialization of structured data to/from bytes
SQL parsing with lexer/parser architecture
Layered architecture for database systems

Limitations

Primary key must be INTEGER type
No UPDATE or DELETE (would require B-tree key removal and rebalancing)
No transactions or recovery
No concurrent access
No query optimizer
SELECT only supports * (no column selection)
WHERE only supports equality (=)
In the interactive shell, arrow keys may show as escape sequences (e.g. ^[[A); the shell still runs the correct command. For less visual clutter, avoid arrow keys or run in an environment that supports line editing.

License

Educational project - not for production use.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
docs		docs
scripts		scripts
src		src
.gitignore		.gitignore
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DiskDBLite

Overview

Features

Quick Start

Build

Run Interactive Shell

Run Metrics Harness

Example Session

Removing a database

Supported SQL

CREATE TABLE

INSERT

SELECT

Shell Commands

Architecture

Project Structure

Design Principles

Running Tests

Running Benchmarks

Key Concepts Demonstrated

Limitations

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DiskDBLite

Overview

Features

Quick Start

Build

Run Interactive Shell

Run Metrics Harness

Example Session

Removing a database

Supported SQL

CREATE TABLE

INSERT

SELECT

Shell Commands

Architecture

Project Structure

Design Principles

Running Tests

Running Benchmarks

Key Concepts Demonstrated

Limitations

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages