A minimal stack-based virtual machine interpreter written in C, capable of executing bytecode programs with three core instructions. Created for educational purposes.
miniVM is a simple stack-based VM designed to demonstrate fundamental virtual machine concepts. It reads and executes binary bytecode files (.byc extension) containing a sequence of instructions that manipulate an integer stack.
The VM currently supports three instructions:
| Opcode | Instruction | Description |
|---|---|---|
| 0x01 | PUSH | Push an 8-bit value onto the stack (requires 1 operand byte) |
| 0x02 | ADD | Pop two values, add them, and push the result |
| 0x03 | PRINT | Print the value at the top of the stack (non-destructive) |
- Fixed-size stack (100 elements)
- Integer values only
- LIFO (Last In, First Out) semantics
- Comprehensive overflow/underflow checking
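The stack properties listed above could be sketched roughly as follows. This is a minimal illustration, not the code in stack.c: the `Stack` type, `STACK_CAPACITY`, and the `stack_*` function names are assumptions, and these functions report errors by returning `false` instead of terminating the program.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 100          /* matches the 100-element limit */

/* Hypothetical stack type; the real layout in stack.c may differ. */
typedef struct {
    int    items[STACK_CAPACITY];
    size_t count;                   /* number of values currently stored */
} Stack;

void stack_init(Stack *s) { s->count = 0; }

/* Returns false on overflow instead of writing past the array. */
bool stack_push(Stack *s, int value) {
    if (s->count >= STACK_CAPACITY) return false;
    s->items[s->count++] = value;
    return true;
}

/* Returns false on underflow; otherwise stores the popped value in *out. */
bool stack_pop(Stack *s, int *out) {
    if (s->count == 0) return false;
    *out = s->items[--s->count];
    return true;
}

/* Reads the top value without removing it, as PRINT requires. */
bool stack_peek(const Stack *s, int *out) {
    if (s->count == 0) return false;
    *out = s->items[s->count - 1];
    return true;
}
```

Returning a status lets the caller decide how to report the error; a VM that exits on the first fault can simply check the result and stop.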
PUSH <value> → 0x01 0x?? (2 bytes)
ADD → 0x02 (1 byte)
PRINT → 0x03 (1 byte)
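Because PUSH occupies two bytes and the other instructions one, any tool that walks a bytecode buffer has to advance by the right amount per opcode. A small sketch of that idea follows; the `OP_*` constants and both function names are hypothetical and do not come from main.c.

```c
#include <stddef.h>
#include <stdint.h>

enum { OP_PUSH = 0x01, OP_ADD = 0x02, OP_PRINT = 0x03 };

/* Encoded size in bytes of the instruction that starts with `opcode`,
   or 0 if the opcode is not one of the three known instructions. */
size_t instruction_length(uint8_t opcode) {
    switch (opcode) {
    case OP_PUSH:  return 2;        /* opcode + 1 operand byte */
    case OP_ADD:
    case OP_PRINT: return 1;        /* opcode only */
    default:       return 0;
    }
}

/* Counts the instructions in code[0..len). Returns -1 if an unknown
   opcode is found or a PUSH is missing its operand byte. */
long count_instructions(const uint8_t *code, size_t len) {
    long   n  = 0;
    size_t ip = 0;
    while (ip < len) {
        size_t size = instruction_length(code[ip]);
        if (size == 0 || size > len - ip) return -1;
        ip += size;
        n++;
    }
    return n;
}
```

For the 6-byte program `01 05 01 04 02 03` (PUSH 5, PUSH 4, ADD, PRINT), `count_instructions` returns 4.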
miniVM/
├── main.c # VM interpreter and execution engine
├── stack.c # Stack data structure implementation
└── byc_gen.py # Bytecode generator for test programs
make
Or manually:
gcc -o miniVM main.c
Usage:
./miniVM <bytecode_file>
Example:
$ ./miniVM test1.byc
Program Name: test1.byc
6
0: 1 1: 5 2: 1 3: 4 4: 2 5: 3
PUSH
└─> 5
PUSH
└─> 4
ADD
└─> 5 + 4
PRINT
└─> 9
Since binary files cannot be easily created with standard text editors, the byc_gen.py script is provided to generate test programs programmatically.
Run the Python script to generate all test programs:
python3 byc_gen.py
Output:
Wrote test1.byc (6 bytes)
Wrote test2.byc (9 bytes)
Wrote test3.byc (9 bytes)
Wrote test4.byc (12 bytes)
Wrote test5.byc (12 bytes)
Edit byc_gen.py and add your own program:
programs = {
    "my_program.byc": make_program(
        PUSH, 10,
        PUSH, 20,
        ADD,
        PRINT
    ),
}
The VM uses several POSIX and standard C headers to load and execute bytecode:
| Library | Header | Purpose |
|---|---|---|
| fcntl | <fcntl.h> | File control operations (open()) |
| mman | <sys/mman.h> | Memory-mapped file I/O (mmap()) for efficient bytecode loading |
| unistd | <unistd.h> | POSIX API (lseek(), close()) |
| stdio | <stdio.h> | Standard I/O operations |
| stdlib | <stdlib.h> | Memory allocation and program utilities |
The VM uses mmap() to load bytecode files directly into virtual memory. This approach:
- Eliminates the need for manual buffer allocation
- Leverages the OS page cache for performance
- Provides efficient random access to bytecode
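The loading approach described above might look roughly like the sketch below. It uses only the calls named in the header table (`open()`, `lseek()`, `mmap()`, `close()`) plus `munmap()`; the helper names `map_bytecode` and `unmap_bytecode` are hypothetical, and the actual code in main.c may be organized differently.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps the file at `path` read-only. On success returns a pointer to
   the bytes and stores the file size in *len_out. Returns NULL if the
   file cannot be opened, is empty, or cannot be mapped. */
const uint8_t *map_bytecode(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    off_t size = lseek(fd, 0, SEEK_END);    /* file size in bytes */
    if (size <= 0) { close(fd); return NULL; }

    void *p = mmap(NULL, (size_t)size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close() */
    if (p == MAP_FAILED) return NULL;

    *len_out = (size_t)size;
    return p;
}

/* Releases a mapping returned by map_bytecode(). */
void unmap_bytecode(const uint8_t *code, size_t len) {
    munmap((void *)code, len);
}
```

Empty files are rejected up front because `mmap()` fails for a zero-length mapping.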
The VM follows a classic fetch-decode-execute cycle:
- Fetch: Read the opcode at the current instruction pointer (IP)
- Decode: Determine which instruction to execute
- Execute: Perform the operation (manipulate stack, advance IP)
- Repeat: Continue until end of bytecode
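The cycle above can be sketched as a single loop over the bytecode buffer. This is an illustrative version only, not the loop in main.c: the `run` function, the `OP_*` constants, and the inline array stack are assumptions. It reads operands as unsigned bytes and returns an error code instead of terminating the process.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { OP_PUSH = 0x01, OP_ADD = 0x02, OP_PRINT = 0x03 };
enum { STACK_CAPACITY = 100 };

/* Executes code[0..len), writing PRINT output to `out`.
   Returns 0 on success, -1 on stack overflow/underflow,
   a missing PUSH operand, or an unknown opcode. */
int run(const uint8_t *code, size_t len, FILE *out) {
    int    stack[STACK_CAPACITY];
    size_t sp = 0;                  /* number of values on the stack */
    size_t ip = 0;                  /* instruction pointer */

    while (ip < len) {              /* repeat until end of bytecode */
        uint8_t opcode = code[ip++];            /* fetch */
        switch (opcode) {                       /* decode */
        case OP_PUSH:                           /* execute */
            if (ip >= len || sp >= STACK_CAPACITY) return -1;
            stack[sp++] = code[ip++];           /* operand follows opcode */
            break;
        case OP_ADD:
            if (sp < 2) return -1;
            stack[sp - 2] += stack[sp - 1];     /* pop two, push the sum */
            sp--;
            break;
        case OP_PRINT:
            if (sp == 0) return -1;
            fprintf(out, "%d\n", stack[sp - 1]); /* non-destructive */
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

Calling `run` on the bytes `01 05 01 04 02 03` with `stdout` as the output stream prints `9`.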
Bytecode: PUSH 5, PUSH 4, ADD, PRINT
Initial: Stack = []
PUSH 5: Stack = [5]
PUSH 4: Stack = [5, 4]
ADD: Stack = [9] (pop 4, pop 5, push 9)
PRINT: Stack = [9] (output: 9)
The VM performs runtime validation and will terminate with an error message if:
- Stack overflow: Attempting to push when stack is full (100 elements)
- Stack underflow: Attempting to pop/peek from an empty stack
- File not found: Bytecode file doesn't exist
- Memory mapping failure: Unable to map file into memory
- Fixed stack size (100 elements)
- Only integer arithmetic (no floating point)
- No control flow instructions (jumps, branches, loops)
- No function calls or subroutines
- 8-bit immediate values only (0-255)
- Signed char interpretation (operand values 128-255 may be read as negative numbers)
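The signed char limitation can be illustrated with a short sketch. It assumes the operand byte is converted to a signed 8-bit type before being widened to `int`, which is one plausible reading of the limitation above; the two function names are hypothetical. The result of the narrowing conversion is implementation-defined in C, and the values mentioned in the comments are what GCC and Clang produce on common two's-complement platforms.

```c
#include <stdint.h>

/* Operand as seen through a signed 8-bit type: bytes 128-255 wrap to
   negative values (200 becomes -56 with GCC and Clang). */
int operand_as_signed(uint8_t byte) {
    return (int8_t)byte;
}

/* Operand as seen through an unsigned 8-bit type: 200 stays 200. */
int operand_as_unsigned(uint8_t byte) {
    return byte;
}
```

Under this assumption, PUSH 200 followed by PRINT would show -56 rather than 200, while operands in the range 0-127 are unaffected.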
Potential additions for learning purposes:
- SUB, MUL, DIV arithmetic operations
- HALT instruction for explicit program termination
- Comparison operations (EQ, LT, GT)
- Conditional and unconditional jumps (JMP, JZ, JNZ)
- Dynamic stack allocation
- Larger immediate values (16-bit, 32-bit)
- Function call/return mechanism
PUSH 5
PUSH 4
ADD
PRINT
# Output: 9
PUSH 10
PUSH 3
PUSH 2
ADD # 3 + 2 = 5
ADD # 10 + 5 = 15
PRINT
# Output: 15