Merged
171 changes: 63 additions & 108 deletions README-en.md
@@ -38,97 +38,15 @@ Python DSL (Apple) ──compile──> JSON Config
- **Implicit graph construction** — Operators declare input/output fields; engine infers DAG dependencies with transitive reduction
- **Lock-free parallelism** — Independent operators in the DAG execute in parallel automatically
- **Compile-time validation** — Dead code, missing fields, write-after-write detected before deployment
- **Embedded Lua** — Built-in Lua operators for lightweight custom computation. End-to-end overhead ~1.2-2x; isolated operator-level overhead varies by runtime and compute complexity (C++/LuaJIT ~3-5x, Java ~2-9x, Go ~6-17x) — write native operators for compute-heavy hot paths
- **Embedded Lua** — Built-in Lua operators for lightweight custom computation. pine-go defaults to [wangshu](https://github.com/Liam0205/wangshu) (pure-Go Lua 5.1 VM, NaN-boxing + arena GC); switch back to gopher-lua via `-tags=lua_gopher`. pine-java uses LuaJC (bytecode compilation), pine-cpp uses LuaJIT. End-to-end overhead ~1.2-2x; isolated operator-level overhead varies by runtime and compute complexity (C++/LuaJIT ~3-5x, Java ~2-9x, Go ~6-17x) — write native operators for compute-heavy hot paths
- **Hot config reload** — Service automatically reloads engine config without downtime
- **Dynamic resources** — Background-refreshed in-memory resource manager with lock-free reads
- **White-box observability** — Operator-level traces, `/stats` endpoint, pluggable Prometheus interface
- **Dynamic resources** — Two-channel resource manager: **data-typed** (e.g. static dict / real-time feature store, snapshot-exported lock-free reads) + **handle-typed** (e.g. `redis_connection`, borrow lease + RAII teardown); background-refreshed
- **Redis cascade-safety** — The `redis_connection` resource exposes 5 cascade params (`{dial,read,write,pool}_timeout_ms` + `pool_size`); per-command metrics `pine_redis_command_*` with 4-state status (ok / timeout / pool_timeout / error), fail-on-error silent-degradation contract
- **White-box observability** — Operator-level traces; the `/stats` composite response includes `/stats.http` (request-level 4-state metrics) + `/stats.resources` (resource pool / probe / per-command 4-state categories); pluggable Prometheus interface
- **Row/Column storage** — DataFrame supports both storage modes
- **Tri-engine consistency** — Go/Java/C++ engines verified via CI cross-validation for schema, DAG, execution, error, server, and metrics parity
- **Tri-engine consistency** — Go/Java/C++ engines verified byte-exactly via CI cross-validation (19 sections + tri-engine differential fuzz + daily ASan/TSan sanitized fuzz)
- **Pine-C++ benchmark runtime** — Complete third runtime with operator parity, HTTP server (hot reload / graceful shutdown), ColumnFrame/RowFrame dual physical layouts, lazy OperatorInput projection, LuaJIT integration, metrics/resource parity
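
The "implicit graph construction" feature above can be illustrated with a small sketch. It is not the engine's implementation: the operator names, field names, and the `infer_dag` / `transitive_reduction` helpers are hypothetical, and it only models read-after-write dependencies (the real engine also validates dead code, missing fields, and write-after-write).

```python
def transitive_reduction(edges):
    """Drop every edge (u, v) for which another path from u to v exists.

    Assumes the edge set forms a DAG, in which case the reduction is unique.
    """
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)

    def reachable_without(start, target, skipped_edge):
        # Depth-first search from start to target that never uses skipped_edge.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if (node, nxt) == skipped_edge:
                    continue
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return {(u, v) for (u, v) in edges if not reachable_without(u, v, (u, v))}


def infer_dag(operators):
    """operators: list of (name, input_fields, output_fields) in declaration order.

    An operator depends on the most recent earlier operator that wrote each
    field it reads. Returns the reduced set of (producer, consumer) edges.
    """
    last_writer = {}
    edges = set()
    for name, inputs, outputs in operators:
        for field in inputs:
            if field in last_writer:
                edges.add((last_writer[field], name))
        for field in outputs:
            last_writer[field] = name
    return transitive_reduction(edges)


# Hypothetical three-operator pipeline: "rank" reads both "items" and "score".
pipeline = [
    ("recall", [], ["items"]),
    ("score", ["items"], ["score"]),
    ("rank", ["items", "score"], ["rank"]),
]
dag = infer_dag(pipeline)
```

In this example the direct `recall -> rank` edge is implied by `recall -> score -> rank`, so the reduction removes it and only two edges remain.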

## Migrating from Older Versions (Breaking Change)

> Starting from v0.7, the Go engine has moved from the repository root into the `pine-go/` subdirectory. The Go module path has changed accordingly.

### What Changed

| Item | Before | After |
|------|--------|-------|
| Module path | `github.com/Liam0205/pineapple` | `github.com/Liam0205/pineapple/pine-go` |
| Import | `github.com/Liam0205/pineapple/internal/...` | `github.com/Liam0205/pineapple/pine-go/internal/...` |
| Import | `github.com/Liam0205/pineapple/pkg/...` | `github.com/Liam0205/pineapple/pine-go/pkg/...` |
| Import | `github.com/Liam0205/pineapple/operators` | `github.com/Liam0205/pineapple/pine-go/operators` |
| Binary | `go build ./cmd/pineapple-server` | `go build ./pine-go/cmd/pineapple-server` |

### Migration Steps

```bash
# 1. Bulk-replace import paths
find . -name '*.go' -exec sed -i \
's|github.com/Liam0205/pineapple/|github.com/Liam0205/pineapple/pine-go/|g' {} +

# 2. Fix double-nesting if you referenced the module itself
find . -name '*.go' -exec sed -i \
's|github.com/Liam0205/pineapple/pine-go/pine-go/|github.com/Liam0205/pineapple/pine-go/|g' {} +

# 3. Update go.mod
go get github.com/Liam0205/pineapple/pine-go@latest
go mod tidy
```

If your project uses Pineapple through public APIs (`pine.NewEngine`, `pine.BuildOperator`, etc.), the above steps complete the migration.
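
Steps 1 and 2 above can also be collapsed into one idempotent rewrite. The following is a minimal sketch, assuming the import paths appear as quoted strings in `.go` files; `rewrite_imports` is a hypothetical helper, not a tool shipped with the repository, and `go.mod` is still updated through `go get` / `go mod tidy`.

```python
import re

OLD_MODULE = "github.com/Liam0205/pineapple"
NEW_MODULE = OLD_MODULE + "/pine-go"

# Match the old module path only when
#   - it is not already followed by "/pine-go" (so re-running is a no-op), and
#   - it ends at a path boundary: a slash, a closing quote, or end of input.
_OLD_PATH = re.compile(
    re.escape(OLD_MODULE) + r'(?!/pine-go(?:[/"]|$))(?=[/"]|$)'
)


def rewrite_imports(source: str) -> str:
    """Return source with old-style import paths moved under pine-go/."""
    return _OLD_PATH.sub(NEW_MODULE, source)


before = 'import "github.com/Liam0205/pineapple/operators"'
after = rewrite_imports(before)
```

Because already-migrated paths are skipped by the negative lookahead, applying the function twice yields the same text as applying it once, which removes the need for a separate double-nesting fix.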

### Configuration & Runtime Semantic Changes

The following changes affect JSON configuration and operator runtime behavior:

#### 1. `row_dependency` Renamed to `consumes_row_set`

The `"row_dependency": true` field in operator JSON config has been removed. Use `"consumes_row_set": true` instead (same semantics: marks the operator as needing a stable row set before execution).

```diff
 {
   "type_name": "transform_size",
-  "row_dependency": true,
+  "consumes_row_set": true,
   "$metadata": { ... }
 }
```

Apple DSL side: `OpCall(..., row_dependency=True)` → `OpCall(..., consumes_row_set=True)`.

#### 2. DAG Scheduling Model: Barriers → Row-Set Marker Interfaces

Previously, Filter/Merge/Reorder operators acted as "barriers" — all predecessors had to complete before them, and all successors had to wait.

The new model uses three marker interfaces for precise row-set dependency declaration:

| Marker | Meaning | Typical Operators |
|--------|---------|-------------------|
| `ConsumesRowSet` | Iterates all items; needs row set stable | filter_*, merge_*, reorder_*, transform_size |
| `MutatesRowSet` | Removes or reorders items | filter_*, merge_*, reorder_* |
| `AdditiveWritesRowSet` | Appends items (parallel with other appenders) | recall_* |

**Impact**: Transform operators that only touch common fields are no longer blocked by barriers and can execute in parallel with Filter/Merge/Reorder. This improves parallelism without changing final results — correctness is guaranteed by field-level data hazard analysis.

**Custom operator migration**: If you implemented a custom Recall-type operator, embed `types.AdditiveWritesRowSetMarker`.
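
As a rough illustration of why markers allow more parallelism than barriers, the sketch below encodes one plausible reading of the table as a pairwise ordering rule. The rules are assumptions inferred from the table, not taken from the engine source, and `RowSetTraits` / `needs_row_set_ordering` are hypothetical names; field-level hazards are checked separately and are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowSetTraits:
    consumes: bool = False  # ConsumesRowSet: iterates all items
    mutates: bool = False   # MutatesRowSet: removes or reorders items
    appends: bool = False   # AdditiveWritesRowSet: appends items


def needs_row_set_ordering(a: RowSetTraits, b: RowSetTraits) -> bool:
    """True if a and b must be ordered because of the row set alone."""
    touches_a = a.consumes or a.mutates or a.appends
    touches_b = b.consumes or b.mutates or b.appends
    if not (touches_a and touches_b):
        # An operator with no marker (e.g. a common-field transform) is never
        # constrained by the row set.
        return False
    if a.mutates or b.mutates:
        # A mutator conflicts with anything else that touches the row set.
        return True
    # Neither mutates: only a full-row-set reader paired with an appender
    # conflicts. Two appenders, or two readers, can run in parallel.
    return (a.consumes and b.appends) or (a.appends and b.consumes)


filter_op = RowSetTraits(consumes=True, mutates=True)
recall_op = RowSetTraits(appends=True)
size_op = RowSetTraits(consumes=True)
common_transform = RowSetTraits()
```

Under these assumed rules, `common_transform` is never ordered against `filter_op`, which matches the impact described above, while two `recall_op` appenders remain free to run in parallel.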

#### 3. Field Accessor Strict Mode

`BuildInput` now distinguishes Strict vs. Defaulted fields:

- **Strict** (fields without a `common_defaults` / `item_defaults` entry): errors immediately at runtime if the value is nil, instead of passing nil to the operator
- **Defaulted** (fields with a default): substitutes the default when the value is nil or missing

**Impact**: If your pipeline relies on "nil passthrough to operator for self-handling", add a `common_defaults` or `item_defaults` entry for that field (value can be `null`) to preserve the old behavior:

```json
{
  "$metadata": { "common_input": ["optional_field"], ... },
  "common_defaults": { "optional_field": null }
}
```
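
The Strict vs. Defaulted distinction can be summarized in a few lines. This is a sketch of the rule as described above, not the actual `BuildInput` code; `resolve_field` and `StrictFieldError` are hypothetical names.

```python
class StrictFieldError(ValueError):
    """Raised when a field without a default entry resolves to nil."""


def resolve_field(name, values, defaults):
    """Resolve one input field.

    values:   field name -> runtime value (a missing key is treated like None)
    defaults: field name -> default; the presence of the key makes the field
              Defaulted, even if the default itself is None
    """
    value = values.get(name)
    if value is not None:
        return value
    if name in defaults:
        # Defaulted: substitute the default, which may be None (nil passthrough).
        return defaults[name]
    # Strict: fail instead of handing nil to the operator.
    raise StrictFieldError(f"strict field {name!r} is nil or missing")


# A null default keeps the old "operator handles nil itself" behavior.
passthrough = resolve_field("optional_field", {}, {"optional_field": None})
```

The key point is that the presence of the defaults entry, not its value, decides whether a nil reaches the operator.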

## Quick Start

### Prerequisites
@@ -239,8 +157,28 @@ pineapple/

## Development

### Top-level Make Targets

Cross-language fmt / lint / test / bench / codegen / version management is unified behind the top-level `Makefile` (with `pine-go/Makefile` for Go-specific work). CI and local dev share the same command sequence.

| Make target | Purpose |
|---|---|
| `make fmt` | Format all four languages (gofmt / google-java-format / clang-format / ruff) |
| `make lint` | Lint all four languages (incl. checkstyle `failOnViolation=true`, `-Werror`) |
| `make test` | Full test suite across runtimes |
| `make bench` | Default `pine_bench` tag |
| `make bench-cross-runtime` | Cross-engine fixture-driven benchmark (cgroup-isolated) |
| `make bench-lua-backends` | wangshu vs gopher-lua, same-host serial + benchstat |
| `make differential-fuzz` | Tri-engine differential fuzz |
| `make cross-validate` | Tri-engine consistency verification |
| `make codegen` | Generate `apple_generated/` + `doc/operators/` from pine-go Registry |
| `make codegen-check` | CI: codegen + `git diff --exit-code` to enforce artifact freshness |
| `make check-pr-ci` | Watch CI status of the current branch's PR (pre-push hook calls this) |

### Scripts

`scripts/` holds the actual implementations behind the Make targets and can be invoked standalone:

| Script | Purpose |
|--------|---------|
| `scripts/go-test.sh` | Run all Go tests |
@@ -250,6 +188,7 @@ pineapple/
| `scripts/go-bench.sh` | Go benchmarks |
| `scripts/java-bench.sh` | Java benchmarks |
| `scripts/bench-cross-runtime.sh` | Cross-engine HTTP server benchmark (fixture-driven, cgroup-isolated) |
| `scripts/bench-lua-backends.sh` | wangshu vs gopher-lua backend comparison (benchstat delta) |
| `scripts/go-fuzz.sh` | Go fuzz testing |
| `scripts/java-fuzz.sh` | Java fuzz testing |
| `scripts/differential-fuzz.sh` | Tri-engine differential fuzzing (random pipelines, output diff) |
@@ -260,16 +199,25 @@ pineapple/
| `scripts/render-dag.sh` | DAG visualization (`--backend go\|java`) |
| `scripts/apple-compile.sh` | Compile Apple DSL to JSON |
| `scripts/run-pipeline.sh` | One-shot pipeline execution |
| `scripts/bump-version.sh` | Synchronize version across all components |
| `scripts/bump-version.sh` | Synchronize version across all components (incl. pine-cpp `kVersion`) |
| `scripts/check-pr-ci.sh` | Watch CI status of the current branch's PR (pre-push hook invokes this) |

### Local Git Hooks

`.githooks/` ships with the repository; activate via `git config core.hooksPath .githooks` once after clone:

- **`pre-commit`** — staged-only format gate (gofmt / clang-format / ruff); does not touch unstaged work
- **`pre-push`** — project-level lint (four-language fail-on-violation) + self-wrapped post-push CI watcher (auto-runs `check-pr-ci.sh` after the actual push) + auto `--set-upstream` relay (first-push of a new branch does not need a manual `-u`)

### CI Pipeline

CI runs automatically on every push/PR:

- **Lint** — Go (golangci-lint), Java (checkstyle, failOnViolation=true), Python (ruff), C++ (-Werror)
- **Lint** — Go (golangci-lint), Java (checkstyle, failOnViolation=true), Python (ruff), C++ (clang-format -Werror)
- **Test** — Full Go/Java/Apple/C++ test suites with coverage
- **Sanitizer** — C++ ASan/UBSan smoke + ThreadSanitizer stress
- **Fuzz** — Go/Java fuzz + tri-engine differential fuzzing
- **Daily sanitized fuzz** — Daily (12:00 UTC+8) ASan/TSan differential fuzz, 3000+2000 rounds, dedicated to race / memory-bug deep diagnostics (independent of the per-push fast lane)
- **Benchmark** — Go/Java performance benchmarks
- **Cross-validation** — Tri-engine schema/DAG/execution/error/server/metrics parity
- **Codegen check** — Ensures generated code is in sync with source
@@ -385,39 +333,45 @@ See `scripts/cross-validate.sh` for a complete production implementation.

## Benchmark

Cross-engine performance comparison (HTTP server mode, `scripts/bench-cross-runtime.sh`, 10000 requests × 16 concurrency, server cgroup-isolated to 2C/4G). `realistic_calibrated` is a production proxy fixture calibrated against real traffic; the rest are synthetic stress tests.
Cross-engine performance comparison (HTTP server mode, `scripts/bench-cross-runtime.sh`, 10000 requests × 16 concurrency, server cgroup-isolated to 2C/4G, re-measured 2026-06-25 / v0.10.9). `realistic_*_calibrated*` fixtures are production-proxy benchmarks calibrated against real traffic; the rest are synthetic stress tests.

### Throughput (QPS)

| Fixture | Go | Java | C++ |
|---|---|---|---|
| small_010 (10 items) | 37078 | 5825 | 20794 |
| small_050 (50 items) | 26976 | 5201 | 17244 |
| small_100 (100 items) | 19585 | 4748 | 13904 |
| medium_0100 (100 items) | 12025 | 3681 | 8578 |
| medium_0500 (500 items) | 2921 | 2034 | 2938 |
| medium_1000 (1000 items) | 1446 | 1360 | 1647 |
| large_0100 (100 items) | 6395 | 2855 | 4855 |
| large_0500 (500 items) | 1439 | 1439 | 1671 |
| large_1000 (1000 items) | 728 | 917 | 902 |
| large_5000 (5000 items) | 142 | 212 | 174 |
| **realistic_calibrated (production proxy)** | **120** | **124** | **221** |
| small_010 (10 items) | 36298 | 6318 | 20756 |
| small_050 (50 items) | 27270 | 5336 | 17227 |
| small_100 (100 items) | 19658 | 4607 | 13812 |
| medium_0100 (100 items) | 12514 | 3589 | 8542 |
| medium_0500 (500 items) | 3026 | 1965 | 2941 |
| medium_1000 (1000 items) | 1513 | 1295 | 1656 |
| large_0100 (100 items) | 7243 | 3064 | 5120 |
| large_0500 (500 items) | 1684 | 1508 | 1773 |
| large_1000 (1000 items) | 825 | 966 | 951 |
| large_5000 (5000 items) | 155 | 213 | 175 |
| realistic_for_you | 483 | 303 | 349 |
| realistic_for_you_latency | 250 | 141 | 212 |
| **realistic_for_you_calibrated (production proxy)** | **121** | **127** | **237** |
| **realistic_for_you_calibrated_2c4g** | **121** | **124** | **224** |
| **realistic_for_you_calibrated_itemlua** | **127** | **126** | **233** |

### P50 Latency (ms)

| Fixture | Go | Java | C++ |
|---|---|---|---|
| small_010 | 0.3 | 2.0 | 0.6 |
| medium_0500 | 5.0 | 6.3 | 5.2 |
| large_1000 | 20.5 | 14.8 | 16.1 |
| large_5000 | 102.2 | 67.9 | 83.9 |
| **realistic_calibrated** | **123.6** | **121.9** | **65.0** |
| small_010 | 0.4 | 1.5 | 0.6 |
| medium_0500 | 4.9 | 6.8 | 5.3 |
| large_1000 | 18.2 | 14.3 | 15.3 |
| large_5000 | 94.3 | 68.6 | 83.4 |
| **realistic_for_you_calibrated** | **122.3** | **117.7** | **60.8** |
| **realistic_for_you_calibrated_itemlua** | **117.1** | **119.5** | **61.5** |

Highlights:

- **C++ leads by ~1.8x on the production-calibrated scenario** (QPS 221 vs 120/124; P50 65ms vs ~122ms) — this is what the "benchmark runtime" positioning means
- **C++ leads by ~1.9x on production-calibrated workloads** (`realistic_for_you_calibrated`: QPS 237 vs 121/127 for Go/Java; P50 60.8 ms vs 122.3/117.7 ms for Go/Java); this is what the "benchmark runtime" positioning means
- Go has the highest throughput on synthetic small/medium fixtures (lowest lightweight-request overhead); Java's JIT hot-loop optimization wins at large row counts (large_1000+)
- Numbers evolve with versions. Reproduce with `scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`; reports land in `bench-results/`
- itemlua (3000 Lua calls/request, boundary-dominated shape) is statistically flat against calibrated across all three engines — confirms the "per-item boundary dominates + end-to-end dilution" calibration fact (see `llmdoc/memory/decisions/perf-evolution-roadmap.md`)
- Numbers evolve with versions. Reproduce with `make bench-cross-runtime` or `scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`; reports land in `bench-results/`
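
The "~1.9x" headline can be checked directly against the `realistic_for_you_calibrated` rows in the tables above. A short worked calculation, where only the `ratio_vs_cpp` helper name is made up for illustration:

```python
# Values copied from the realistic_for_you_calibrated rows above.
qps = {"Go": 121, "Java": 127, "C++": 237}
p50_ms = {"Go": 122.3, "Java": 117.7, "C++": 60.8}


def ratio_vs_cpp(metric, higher_is_better):
    """How many times better C++ is than each other engine on one metric."""
    cpp = metric["C++"]
    return {
        engine: round(cpp / value if higher_is_better else value / cpp, 2)
        for engine, value in metric.items()
        if engine != "C++"
    }


qps_ratio = ratio_vs_cpp(qps, higher_is_better=True)       # throughput
p50_ratio = ratio_vs_cpp(p50_ms, higher_is_better=False)   # latency
```

The throughput ratios come out at about 1.96x (vs Go) and 1.87x (vs Java), and the P50 ratios at about 2.01x and 1.94x, so "~1.9x" is a fair rounding across both metrics.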

## Documentation

@@ -429,6 +383,7 @@ Highlights:
| Operator development | [`doc/guide_operator-en.md`](doc/guide_operator-en.md) — Go operator development guide |
| Third-party extensions | [`design_doc/12_distribution-en.md`](design_doc/12_distribution-en.md) — Add custom operators without modifying source |
| API reference | [`doc/api-en.md`](doc/api-en.md) — HTTP endpoint documentation |
| LLM retrieval docs | [`llmdoc/`](llmdoc/) — Stable knowledge map for AI collaboration (architecture / decisions / reflections / index) |

## License
