K3rs

🚧 Experimental — under heavy development. This project is an experiment in AI-driven software development. The vast majority of the code, architecture, tests, and documentation were generated by AI (Claude Code & Google Antigravity). Humans direct architecture, priorities, and design decisions, but have not reviewed most of the code line-by-line. Treat this accordingly — there will be bugs, rough edges, and things that don't work yet. Use at your own risk.

A lightweight container orchestration platform written in Rust.

Inspired by K3s, powered by Cloudflare Pingora, Axum, and SlateDB. See the full specification for architecture details, API reference, and store schema.

┌─────────────────────────────────────────────────────────────┐
│  k3rs-server (Control Plane)                                │
│  ┌──────────┐ ┌───────────┐ ┌─────────────────────────────┐ │
│  │ API      │ │ Scheduler │ │ Controller Manager          │ │
│  │ (Axum)   │ │           │ │ (8 controllers + VPC)       │ │
│  └──────────┘ └───────────┘ └─────────────────────────────┘ │
│  ┌──────────┐ ┌───────────┐ ┌──────────┐                    │
│  │ SlateDB  │ │ Leader    │ │ PKI / CA │                    │
│  │ (S3/R2)  │ │ Election  │ │ (mTLS)   │                    │
│  └──────────┘ └───────────┘ └──────────┘                    │
└─────────────────────────────────────────────────────────────┘
        ▲                           ▲
        │  mTLS                     │  mTLS
        ▼                           ▼
┌───────────────────────────┐  ┌───────────────────────────┐
│  Node                     │  │  Node                     │
│  ┌──────────────────────┐ │  │  ┌──────────────────────┐ │
│  │ k3rs-vpc             │ │  │  │ k3rs-vpc             │ │
│  │ Ghost IPv6 allocator │ │  │  │ Ghost IPv6 allocator │ │
│  │ eBPF SIIT + VPC      │ │  │  │ eBPF SIIT + VPC      │ │
│  │ NAT64 (eBPF)         │ │  │  │ NAT64 (eBPF)         │ │
│  └──────────────────────┘ │  │  └──────────────────────┘ │
│  ┌──────────────────────┐ │  │  ┌──────────────────────┐ │
│  │ k3rs-agent           │ │  │  │ k3rs-agent           │ │
│  │ Runtime, Proxy, DNS  │ │  │  │ Runtime, Proxy, DNS  │ │
│  └──────────────────────┘ │  │  └──────────────────────┘ │
│                           │  │                           │
│  [OCI Pod]  [VM Pod]      │  │  [OCI Pod]  [VM Pod]      │
│   ↕ netkit   ↕ TAP        │  │   ↕ netkit   ↕ TAP        │
│       (IPv6 only)         │  │       (IPv6 only)         │
│  ┌──────────────────────┐ │  │  ┌──────────────────────┐ │
│  │ k3rs0 bridge (IPv6)  │ │  │  │ k3rs0 bridge (IPv6)  │ │
│  └──────────────────────┘ │  │  └──────────────────────┘ │
└───────────────────────────┘  └───────────────────────────┘

Features

  • Pure Rust — memory-safe, single binary per component
  • Control Plane / Data Plane — server never touches containers; agents execute
  • Fail-Static — restart server, agent, or VPC daemon without disrupting running pods
  • SlateDB — embedded state store on object storage (S3/R2/MinIO), no etcd
  • VPC Networking — first-class VPC resource with Ghost IPv6 addressing, overlapping IPv4 CIDRs, hard VPC isolation
  • eBPF Data Plane — per-pod SIIT (IPv4↔IPv6 translation), VPC enforcement via TC classifiers, NAT64 for external IPv4
  • Symmetric VM SIIT — OCI containers and Firecracker VMs use an identical SIIT architecture; VMs do in-guest translation via k3rs-init
  • Pingora — L4/L7 service proxy, tunnel proxy, ingress controller
  • Platform-Aware Runtime — Virtualization.framework microVMs (macOS), Firecracker (Linux), youki/crun OCI (Linux)
  • Management UI — Dioxus 0.7 web dashboard
  • Backup/Restore — online backup & restore via API, multi-server safe

Quick Start

Prerequisites

  • Rust 1.93.1+ (edition 2024)
  • macOS (primary dev platform) or Linux

Run the Server

# Clone
git clone https://github.com/AssetsArt/k3rs.git
cd k3rs

# Start server (port 6443, local SlateDB)
cargo run --bin k3rs-server -- \
  --port 6443 \
  --token demo-token-123 \
  --data-dir /var/lib/k3rs/server \
  --node-name master-1

Run an Agent

cargo run --bin k3rs-agent -- \
  --server http://127.0.0.1:6443 \
  --token demo-token-123 \
  --node-name node-1

CLI

# Cluster info
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 cluster info

# List nodes
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 node list

# Apply a manifest
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 apply -f manifest.yaml

# Get resources
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 get pods
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 get deployments
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 get services

# Logs & exec
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 logs <pod-name>
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 exec <pod-name> -- <command>

# Node operations
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 node drain <node-name>
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 node cordon <node-name>
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 node uncordon <node-name>

# Backup & restore
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 backup create --output ./backup.k3rs-backup.json.gz
cargo run --bin k3rsctl -- --server http://127.0.0.1:6443 restore --from ./backup.k3rs-backup.json.gz
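For `apply -f`, a manifest might look like the following. This is an illustrative sketch only — the field names here assume a Kubernetes-style schema; see spec.md for the authoritative resource definitions:

```yaml
# Hypothetical Deployment manifest (Kubernetes-style fields assumed; see spec.md)
apiVersion: v1
kind: Deployment
metadata:
  name: web
  namespace: default
spec:
  replicas: 2
  template:
    spec:
      containers:
        - name: web
          image: nginx:alpine
```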

Development

Dev Scripts (macOS)

# Full dev environment (tmux: server + UI)
./scripts/dev.sh

# Server only (cargo-watch auto-reload)
./scripts/dev-server.sh

# Agent only
./scripts/dev-agent.sh

# UI only (Dioxus)
./scripts/dev-ui.sh

Dev with Podman (Linux — OCI & Firecracker)

For testing OCI runtimes (youki/crun) and Firecracker on Linux:

# Interactive shell with Linux environment
./scripts/dev-podman.sh

# Run server inside container
./scripts/dev-podman.sh server

# Run agent inside container
./scripts/dev-podman.sh agent

# Run tests
./scripts/dev-podman.sh test

# With KVM passthrough (Firecracker)
./scripts/dev-podman.sh --kvm shell

# Run all components, including the UI
./scripts/dev-podman.sh --all --ui

# Environment variables
K3RS_RUNTIME=crun ./scripts/dev-podman.sh agent   # use crun instead of youki
K3RS_KVM=1 ./scripts/dev-podman.sh agent           # enable KVM

The Podman container includes: Rust toolchain, youki, crun, Firecracker, cargo-watch, and persistent cargo cache volumes.

Build k3rs-init (Guest VM PID 1)

# Cross-compile from macOS → Linux musl (static binary)
cargo zigbuild --release --target aarch64-unknown-linux-musl -p k3rs-init

# Or build kernel + initrd for microVMs
./scripts/build-kernel.sh

Configuration

Both server and agent support YAML config files:

Server (/etc/k3rs/config.yaml):

port: 6443
data-dir: /var/lib/k3rs/server
token: my-secret-token

Agent (/etc/k3rs/agent-config.yaml):

server: https://10.0.0.1:6443
token: my-secret-token
node-name: worker-1
dns-port: 5353

Config precedence: CLI flags > YAML config > defaults

Path Layout

All paths are derived from 3 base constants defined in pkg/constants/src/paths.rs:

Constant Value Purpose
CONFIG_DIR /etc/k3rs Configuration, TLS certs
DATA_DIR /var/lib/k3rs State, runtime, binaries, kernel
LOG_DIR /var/logs/k3rs Log files

Uninstall: rm -rf /etc/k3rs /var/lib/k3rs /var/logs/k3rs

Architecture

Control Plane (k3rs-server)

The server is a pure control plane — it never runs containers:

Component Purpose
API Server (Axum) REST API for all cluster operations
Scheduler Resource-aware pod placement with affinity/taint support
Controller Manager 8 controllers: Node, Deployment, ReplicaSet, DaemonSet, Job, CronJob, HPA, Eviction
SlateDB Embedded KV store on object storage (S3/R2/local)
Leader Election Lease-based HA — only leader runs controllers
PKI / CA Issues mTLS certificates to agents
Metrics Prometheus-compatible /metrics endpoint

VPC Daemon (k3rs-vpc)

Standalone per-node daemon — sole authority for networking. Independent from agent and server:

Component Purpose
Ghost IPv6 Allocator Allocates (GuestIPv4, GhostIPv6) pairs per pod from VPC pools
eBPF Enforcer Per-pod SIIT + VPC classifiers on netkit (OCI) and tap_guard on TAP (VM)
NAT64 eBPF IPv6→IPv4 translation for external connectivity via 64:ff9b::/96
VPC Sync Loop Pulls VPC definitions and peerings from server every 10s
VpcStore Own SlateDB instance for allocations, independent crash recovery
Unix Socket API NDJSON protocol at /run/k3rs-vpc.sock — agent delegates all network ops
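Because the socket API is NDJSON, each request and reply is a single JSON object terminated by a newline. The client below is a minimal sketch of that framing; the message contents (`{"op": ...}`) are hypothetical placeholders — the actual protocol is documented in spec.md:

```rust
// Illustrative client for the VPC daemon's NDJSON Unix-socket API.
// The framing (one JSON object per line) follows the text above; the
// message shape is a hypothetical placeholder -- see spec.md.
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;

/// Send one JSON object as a single line and read the one-line reply.
fn vpc_request(socket_path: &str, json_line: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(json_line.as_bytes())?;
    stream.write_all(b"\n")?; // NDJSON: newline-delimited JSON
    let mut reply = String::new();
    BufReader::new(stream).read_line(&mut reply)?;
    Ok(reply.trim_end().to_string())
}
```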

Data Plane (k3rs-agent)

The agent runs on worker nodes and manages the container lifecycle:

Component Purpose
Pod Sync Loop Watches scheduled pods → pull image → create → start → monitor
Container Runtime Platform-aware: Virtualization.framework (macOS), Firecracker/youki/crun (Linux)
Service Proxy (Pingora) L4/L7 load balancing, replaces kube-proxy
Tunnel Proxy (Pingora) Persistent reverse tunnel to server
DNS Server (DNS64) Resolves <svc>.<ns>.svc.cluster.local, synthesizes AAAA with Ghost IPv6

Container Runtime

Platform Backend Technology
macOS VirtualizationBackend Apple Virtualization.framework microVMs
Linux (KVM) FirecrackerBackend Firecracker via rust-vmm crates
Linux (no KVM) OciBackend youki / crun OCI runtimes
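The table above implies a simple selection rule. A minimal sketch of that dispatch (the enum variants mirror the backend names in the table, but the function is illustrative — the real logic lives in pkg/container):

```rust
// Illustrative runtime selection based on the platform table above.
#[derive(Debug, PartialEq)]
enum Backend {
    Virtualization, // Apple Virtualization.framework microVMs (macOS)
    Firecracker,    // Firecracker via rust-vmm crates (Linux with KVM)
    Oci,            // youki / crun OCI runtimes (Linux without KVM)
}

fn select_backend(os: &str, has_kvm: bool) -> Backend {
    match (os, has_kvm) {
        ("macos", _) => Backend::Virtualization,
        ("linux", true) => Backend::Firecracker,
        _ => Backend::Oci, // Linux without KVM, and a conservative fallback
    }
}
```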

VPC Networking & Ghost IPv6

K3rs uses IPv6-native pod networking with Ghost IPv6 — a deterministic 128-bit address encoding (ClusterID, VpcID, GuestIPv4). Apps see IPv4 inside pods; SIIT translates to Ghost IPv6 at the interface boundary. The bridge is pure IPv6.

Ghost IPv6 layout (128 bits):
┌──────────────┬────────┬────────────┬──────────┬──────────────┐
│ Platform     │ Ver    │ ClusterID  │ VpcID    │ GuestIPv4    │
│ Prefix (32b) │ (4b)   │ (32b)      │ (16b)    │ (32b)        │
└──────────────┴────────┴────────────┴──────────┴──────────────┘

OCI containers: SIIT runs on host-side netkit — k3rs-vpc loads per-pod eBPF. VMs (Firecracker): SIIT runs inside the guest — k3rs-init loads the same eBPF on eth0. Host TAP gets tap_guard anti-spoofing only.

OCI:  Pod [app →IPv4→ eBPF(eth0) →IPv6] ==netkit== Host(IPv6) ==bridge== ...
VM:   VM  [app →IPv4→ eBPF(eth0) →IPv6] ==TAP==    Host(IPv6) ==bridge== ...

Key properties:

  • No route tables — 1 route per node covers all pods in all VPCs
  • Overlapping IPv4 CIDRs — different VPCs can use the same IPv4 range
  • Hard VPC isolation — VPC ID extracted from IPv6 bytes 10-11, no BPF map lookup
  • IPv4 via NAT64/DNS64 — transparent external IPv4 access
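Under the stated layout, address construction and the lookup-free VPC check can be sketched in Rust. The byte positions for the platform prefix (bytes 0-3), VpcID (bytes 10-11, per the isolation bullet above), and embedded GuestIPv4 (last 32 bits) follow the text; the exact packing of the version nibble and ClusterID — and the placement of the 12 bits the listed fields leave over — is an assumption:

```rust
// Sketch of Ghost IPv6 encoding. Byte positions for prefix, VpcID, and
// GuestIPv4 follow the layout above; version/ClusterID packing and the
// 12 reserved bits are assumptions.
use std::net::Ipv6Addr;

fn ghost_ipv6(prefix: u32, version: u8, cluster_id: u32, vpc_id: u16, guest_v4: [u8; 4]) -> Ipv6Addr {
    let mut b = [0u8; 16];
    b[0..4].copy_from_slice(&prefix.to_be_bytes()); // platform prefix (32b)
    b[4] = (version & 0x0f) << 4;                   // version nibble (assumed position)
    b[5..9].copy_from_slice(&cluster_id.to_be_bytes()); // ClusterID (assumed byte-aligned)
    b[10..12].copy_from_slice(&vpc_id.to_be_bytes());   // VpcID at bytes 10-11
    b[12..16].copy_from_slice(&guest_v4);               // embedded GuestIPv4
    Ipv6Addr::from(b)
}

/// VPC isolation check: read the VpcID straight out of bytes 10-11 --
/// no BPF map lookup needed.
fn vpc_id_of(addr: &Ipv6Addr) -> u16 {
    let o = addr.octets();
    u16::from_be_bytes([o[10], o[11]])
}
```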

State Store

All cluster state lives in SlateDB under structured key prefixes:

/registry/nodes/<name>                   → Node metadata & status
/registry/namespaces/<ns>                → Namespace definition
/registry/pods/<ns>/<name>               → Pod spec & status
/registry/deployments/<ns>/<name>        → Deployment spec & status
/registry/replicasets/<ns>/<name>        → ReplicaSet spec & status
/registry/daemonsets/<ns>/<name>         → DaemonSet spec & status
/registry/jobs/<ns>/<name>               → Job spec & status
/registry/cronjobs/<ns>/<name>           → CronJob spec & status
/registry/services/<ns>/<name>           → Service definition
/registry/endpoints/<ns>/<name>          → Endpoint slice
/registry/ingresses/<ns>/<name>          → Ingress rules
/registry/configmaps/<ns>/<name>         → ConfigMap data
/registry/secrets/<ns>/<name>            → Secret data (encrypted)
/registry/hpa/<ns>/<name>                → Horizontal Pod Autoscaler
/registry/resourcequotas/<ns>/<name>     → Resource quota
/registry/networkpolicies/<ns>/<name>    → Network policy
/registry/pvcs/<ns>/<name>               → Persistent volume claim
/registry/vpcs/<name>                    → VPC definition (vpc_id, cidr)
/registry/vpc-peerings/<name>            → VPC peering rules
/registry/images/<node-name>             → Per-node image list
/registry/leases/controller-leader       → Leader election lease
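The key scheme above is uniform: namespaced resources nest under `/registry/<kind>/<ns>/<name>`, cluster-scoped ones under `/registry/<kind>/<name>`. A hypothetical helper to compose them (the real code lives in pkg/state and may differ):

```rust
// Illustrative key composition for the registry scheme above.
fn registry_key(kind: &str, ns: Option<&str>, name: &str) -> String {
    match ns {
        Some(ns) => format!("/registry/{kind}/{ns}/{name}"), // namespaced resource
        None => format!("/registry/{kind}/{name}"),          // cluster-scoped resource
    }
}
```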

Fail-Static Guarantees

Restart any component without disrupting running workloads.

Scenario Running Containers Service Proxy DNS VPC Enforcement
Server restart ✅ Unaffected ✅ Continues ✅ Continues ✅ eBPF persists
Agent restart ✅ Independent processes ❌→✅ Restarts ❌→✅ Restarts ✅ eBPF persists
VPC daemon restart ✅ Unaffected ✅ Continues ✅ Continues ✅ eBPF pinned in kernel
All restart ✅ Unaffected ❌→✅ Restarts ❌→✅ Restarts ✅ eBPF pinned in kernel

Key invariant: kill -9 <agent-pid> must never cause any container to stop.

Workload Resources

Resource Description
Pod Smallest deployable unit
Deployment Manages ReplicaSets, rolling updates, blue/green, canary
ReplicaSet Maintains N pod replicas
DaemonSet One pod per node (or selected nodes)
Job Run-to-completion workload
CronJob Scheduled jobs
Service ClusterIP / NodePort / LoadBalancer
Ingress Host/path-based external routing
ConfigMap Configuration data
Secret Sensitive data (encrypted at rest)
HPA Horizontal Pod Autoscaler
NetworkPolicy Pod-level network isolation
ResourceQuota Per-namespace limits
VPC Virtual Private Cloud — isolated IPv4 network with Ghost IPv6
VPC Peering Cross-VPC connectivity (bidirectional or initiator-only)
PVC Persistent volume claims

Security

  • mTLS everywhere — Server ↔ Agent, auto-rotated certificates
  • Join token — Agents register with pre-shared token, receive TLS cert
  • RBAC — cluster-admin, namespace-admin, viewer roles
  • Service accounts — Scoped tokens for workload API access
  • Secrets encrypted at rest in SlateDB

Project Structure

k3rs/
├── cmd/
│   ├── k3rs-server/       # Control plane binary
│   ├── k3rs-agent/        # Data plane binary
│   ├── k3rs-vpc/          # VPC daemon (Ghost IPv6 allocator + eBPF enforcer)
│   ├── k3rs-vpc-ebpf/     # eBPF programs (SIIT, VPC classifiers, NAT64, tap_guard)
│   ├── k3rs-vpc-common/   # Shared types between eBPF and userspace (BPF map keys/values)
│   ├── k3rs-init/         # Guest PID 1 for microVMs (static musl, in-guest SIIT)
│   ├── k3rs-vmm/          # Virtualization.framework helper (macOS, Rust + objc2-virtualization)
│   ├── k3rs-ui/           # Management UI (Dioxus 0.7)
│   └── k3rsctl/           # CLI tool
├── pkg/
│   ├── api/               # Axum HTTP API & handlers
│   ├── constants/         # Centralized constants (paths, network, runtime, auth, state, vm)
│   ├── container/         # Container runtime (Virtualization/Firecracker/OCI)
│   ├── controllers/       # 8 control loops + VPC controller
│   ├── metrics/           # Prometheus metrics registry
│   ├── network/           # Bridge, netkit, DNS64
│   ├── pki/               # CA & mTLS certificates
│   ├── proxy/             # Pingora proxies (Service/Ingress/Tunnel)
│   ├── scheduler/         # Pod placement logic
│   ├── state/             # SlateDB integration
│   ├── types/             # Cluster object models
│   └── vpc/               # Ghost IPv6 construct/parse/validate library
├── scripts/
│   ├── dev.sh             # Full dev environment (tmux)
│   ├── dev-agent.sh       # Agent dev loop
│   ├── dev-podman.sh      # Podman Linux dev environment
│   ├── build-kernel.sh    # Build Linux kernel + initrd for microVMs
│   └── ...
├── Containerfile.dev      # Podman dev image (Rust + youki + crun + Firecracker)
├── Cargo.toml             # Workspace root
└── spec.md                # Full specification

Tech Stack

Category Technology
Language Rust (edition 2024)
HTTP API Axum 0.8
Proxy / Networking Cloudflare Pingora 0.7
State Store SlateDB 0.10 on object storage
eBPF Aya 0.13 (pure Rust eBPF toolchain)
Management UI Dioxus 0.7 (WASM SPA)
Container Runtime Virtualization.framework / Firecracker / youki / crun
Image Pull oci-client (OCI Distribution spec)
DNS Hickory DNS
Serialization serde + serde_json
Async Runtime Tokio
CLI Clap
TLS rustls + rcgen
Telemetry OpenTelemetry + OTLP

License

MIT
