From b4bc0c56b6ea7748aa59ca9ee98991a151b49499 Mon Sep 17 00:00:00 2001 From: Liam Date: Thu, 25 Jun 2026 09:09:35 +0800 Subject: [PATCH 1/4] =?UTF-8?q?docs(readme):=20drop=20v0.7=20migration=20s?= =?UTF-8?q?ection=20(=E5=8E=86=E5=8F=B2=E5=8C=85=E8=A2=B1=E5=B7=B2?= =?UTF-8?q?=E6=88=90=20stale)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit v0.7 moved the Go engine into the pine-go/ subdirectory, alongside row_dependency→consumes_row_set, barrier→marker interface, and the Field Accessor three-state flip. By v0.10, 3 minor and 14 patch releases have shipped since those breaking changes, and production users finished migrating long ago. Keeping the section in the README main flow only lets 80 lines of historical changes block the first screen, so new readers cannot find the core features. The migration guide itself is still valid, but its value as an in-README archive is exhausted: consult git log / CHANGELOG.md or design_doc instead. design_doc still holds the full semantic description (the consumes_row_set DSL field in 05_operator_types.md, the registration forms in 04_operator_registration.md, the DAG scheduling model in 03_xxx), so no knowledge is lost. --- README-en.md | 83 --------------------------------------------------- README.md | 84 ---------------------------------------------------- 2 files changed, 167 deletions(-) diff --git a/README-en.md b/README-en.md index 8d182bcf..2aa004ac 100644 --- a/README-en.md +++ b/README-en.md @@ -46,89 +46,6 @@ Python DSL (Apple) ──compile──> JSON Config - **Tri-engine consistency** — Go/Java/C++ engines verified via CI cross-validation for schema, DAG, execution, error, server, and metrics parity - **Pine-C++ benchmark runtime** — Complete third runtime with operator parity, HTTP server (hot reload / graceful shutdown), ColumnFrame/RowFrame dual physical layouts, lazy OperatorInput projection, LuaJIT integration, metrics/resource parity -## Migrating from Older Versions (Breaking Change) - -> Starting from v0.7, the Go engine has moved from the repository root into the `pine-go/` subdirectory. The Go module path has changed accordingly. 
- -### What Changed - -| Item | Before | After | -|------|--------|-------| -| Module path | `github.com/Liam0205/pineapple` | `github.com/Liam0205/pineapple/pine-go` | -| Import | `github.com/Liam0205/pineapple/internal/...` | `github.com/Liam0205/pineapple/pine-go/internal/...` | -| Import | `github.com/Liam0205/pineapple/pkg/...` | `github.com/Liam0205/pineapple/pine-go/pkg/...` | -| Import | `github.com/Liam0205/pineapple/operators` | `github.com/Liam0205/pineapple/pine-go/operators` | -| Binary | `go build ./cmd/pineapple-server` | `go build ./pine-go/cmd/pineapple-server` | - -### Migration Steps - -```bash -# 1. Bulk-replace import paths -find . -name '*.go' -exec sed -i \ - 's|github.com/Liam0205/pineapple/|github.com/Liam0205/pineapple/pine-go/|g' {} + - -# 2. Fix double-nesting if you referenced the module itself -find . -name '*.go' -exec sed -i \ - 's|github.com/Liam0205/pineapple/pine-go/pine-go/|github.com/Liam0205/pineapple/pine-go/|g' {} + - -# 3. Update go.mod -go get github.com/Liam0205/pineapple/pine-go@latest -go mod tidy -``` - -If your project uses Pineapple through public APIs (`pine.NewEngine`, `pine.BuildOperator`, etc.), the above steps complete the migration. - -### Configuration & Runtime Semantic Changes - -The following changes affect JSON configuration and operator runtime behavior: - -#### 1. `row_dependency` Renamed to `consumes_row_set` - -The `"row_dependency": true` field in operator JSON config has been removed. Use `"consumes_row_set": true` instead (same semantics: marks the operator as needing a stable row set before execution). - -```diff - { - "type_name": "transform_size", -- "row_dependency": true, -+ "consumes_row_set": true, - "$metadata": { ... } - } -``` - -Apple DSL side: `OpCall(..., row_dependency=True)` → `OpCall(..., consumes_row_set=True)`. - -#### 2. 
DAG Scheduling Model: Barriers → Row-Set Marker Interfaces - -Previously, Filter/Merge/Reorder operators acted as "barriers" — all predecessors had to complete before them, and all successors had to wait. - -The new model uses three marker interfaces for precise row-set dependency declaration: - -| Marker | Meaning | Typical Operators | -|--------|---------|-------------------| -| `ConsumesRowSet` | Iterates all items; needs row set stable | filter_*, merge_*, reorder_*, transform_size | -| `MutatesRowSet` | Removes or reorders items | filter_*, merge_*, reorder_* | -| `AdditiveWritesRowSet` | Appends items (parallel with other appenders) | recall_* | - -**Impact**: Transform operators that only touch common fields are no longer blocked by barriers and can execute in parallel with Filter/Merge/Reorder. This improves parallelism without changing final results — correctness is guaranteed by field-level data hazard analysis. - -**Custom operator migration**: If you implemented a custom Recall-type operator, embed `types.AdditiveWritesRowSetMarker`. - -#### 3. Field Accessor Strict Mode - -`BuildInput` now distinguishes Strict vs. Defaulted fields: - -- **Strict** (fields without a `common_defaults` / `item_defaults` entry): errors immediately at runtime if the value is nil, instead of passing nil to the operator -- **Defaulted** (fields with a default): substitutes the default when the value is nil or missing - -**Impact**: If your pipeline relies on "nil passthrough to operator for self-handling", add a `common_defaults` or `item_defaults` entry for that field (value can be `null`) to preserve the old behavior: - -```json -{ - "$metadata": { "common_input": ["optional_field"], ... 
}, - "common_defaults": { "optional_field": null } -} -``` - ## Quick Start ### Prerequisites diff --git a/README.md b/README.md index 91a4b7cf..7c89788a 100644 --- a/README.md +++ b/README.md @@ -46,90 +46,6 @@ Python DSL (Apple) ──compile──> JSON Config - **三引擎一致性** — Go/Java/C++ 引擎通过 CI 交叉验证保证 schema、DAG、执行结果、错误消息一致 - **Pine-C++ 标杆运行时** — 完整第三运行时,内置算子与 Go/Java 完全对等、HTTP server(热加载/graceful shutdown)、ColumnFrame/RowFrame 双物理实现、OperatorInput lazy 投影、LuaJIT 集成、metrics/resource 对等 -## 从旧版迁移(Breaking Change) - -> 自 v0.7 起,Go 引擎从仓库根目录迁移至 `pine-go/` 子目录,Go module path 随之变更。 - -### 变更内容 - -| 项目 | 迁移前 | 迁移后 | -|------|--------|--------| -| Module path | `github.com/Liam0205/pineapple` | `github.com/Liam0205/pineapple/pine-go` | -| Import | `github.com/Liam0205/pineapple/internal/...` | `github.com/Liam0205/pineapple/pine-go/internal/...` | -| Import | `github.com/Liam0205/pineapple/pkg/...` | `github.com/Liam0205/pineapple/pine-go/pkg/...` | -| Import | `github.com/Liam0205/pineapple/operators` | `github.com/Liam0205/pineapple/pine-go/operators` | -| Binary | `go build ./cmd/pineapple-server` | `go build ./pine-go/cmd/pineapple-server` | - -### 下游迁移步骤 - -```bash -# 1. 批量替换 import path -find . -name '*.go' -exec sed -i \ - 's|github.com/Liam0205/pineapple/|github.com/Liam0205/pineapple/pine-go/|g' {} + - -# 2. 修正 module 自身的引用(避免多余的 pine-go/pine-go) -find . -name '*.go' -exec sed -i \ - 's|github.com/Liam0205/pineapple/pine-go/pine-go/|github.com/Liam0205/pineapple/pine-go/|g' {} + - -# 3. 更新 go.mod -go get github.com/Liam0205/pineapple/pine-go@latest -go mod tidy -``` - -如果你的项目通过 `pine.NewEngine` / `pine.BuildOperator` 等公共 API 使用 Pineapple,上述步骤即可完成迁移。 - -### 配置与运行时语义变更 - -以下变更影响 JSON 配置和算子运行时行为: - -#### 1. `row_dependency` 重命名为 `consumes_row_set` - -JSON 配置中算子的 `"row_dependency": true` 字段已移除,改用 `"consumes_row_set": true`(语义不变:标记算子需要等待行集稳定后才执行)。 - -```diff - { - "type_name": "transform_size", -- "row_dependency": true, -+ "consumes_row_set": true, - "$metadata": { ... 
} - } -``` - -Apple DSL 侧同步变更:`OpCall(..., row_dependency=True)` → `OpCall(..., consumes_row_set=True)`。 - -#### 2. DAG 调度模型变更:barrier → row-set marker interfaces - -旧模型中 Filter/Merge/Reorder 算子被视为"barrier"——在它们执行前所有前驱必须完成,所有后继必须等它完成。 - -新模型通过三个 marker interface 精确声明 row-set 依赖: - -| Marker | 含义 | 典型算子 | -|--------|------|----------| -| `ConsumesRowSet` | 迭代所有 item,需要行集稳定 | filter_*, merge_*, reorder_*, transform_size | -| `MutatesRowSet` | 删除或重排 item | filter_*, merge_*, reorder_* | -| `AdditiveWritesRowSet` | 追加 item(与其他追加者并行) | recall_* | - -**影响**:仅操作 common 字段的 Transform 算子不再被 barrier 阻塞,可与 Filter/Merge/Reorder 并行执行。这提升了并行度但不改变最终结果——正确性由字段级数据冒险分析保证。 - -**自定义算子迁移**:如果你实现了自定义的 Recall 类型算子,需要嵌入 `types.AdditiveWritesRowSetMarker`。 - -#### 3. Field Accessor 三态模型 - -`BuildInput` 现在支持三种字段模式: - -- **Nullable**(默认):字段缺失时报错,值为 nil 时透传给算子 -- **Strict**(通过 `strict_common` / `strict_item` 声明):值为 nil 时立即报错 -- **Defaulted**(通过 `common_defaults` / `item_defaults` 声明):值为 nil 或缺失时替换为默认值 - -**影响**:v0.9.0 起默认模式从 Strict 变为 Nullable。如果你的流水线依赖"nil 值必须报错"的行为,需要在配置中声明 `strict_common` / `strict_item`: - -```json -{ - "$metadata": { "common_input": ["required_field"], ... }, - "strict_common": ["required_field"] -} -``` - ## Quick Start ### 环境要求 From a0bc690c9a6735639601ce4517f75fb95eddf7a3 Mon Sep 17 00:00:00 2001 From: Liam Date: Thu, 25 Jun 2026 09:10:32 +0800 Subject: [PATCH 2/4] =?UTF-8?q?docs(readme):=20refresh=20=E6=A0=B8?= =?UTF-8?q?=E5=BF=83=E7=89=B9=E6=80=A7=20=E2=80=94=20wangshu=20/=20Redis?= =?UTF-8?q?=20cascade-safety=20/=20/stats=20fan-out?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Core features list was last refreshed pre-v0.10; this batch syncs to v0.10.9 reality: - Lua: explicit pine-go default = wangshu (with build-tag escape to gopher-lua), pine-java = LuaJC, pine-cpp = LuaJIT. The default flip landed in v0.10 series. 
- Resources: split into data-typed (snapshot) vs handle-typed (borrow + RAII teardown) — `redis_connection` is the canonical handle-typed resource. The old one-liner "background-refreshed in-memory resource manager" hides the architecture. - Redis: add dedicated bullet for the 5 cascade-safety params and the 4-state per-command metrics + fail-on-error contract. These shipped in #137 and are production-load-bearing. - Observability: /stats is no longer just a single endpoint — call out /stats.http and /stats.resources sub-trees so readers can find the resource fan-out and HTTP middleware metrics. - Cross-validation bullet: name the actual verification surface (19 cross-validate sections + differential fuzz + daily sanitized fuzz) instead of the vague "verified for schema/DAG/exec/error parity". EN side kept structurally aligned with the CN edits. --- README-en.md | 9 +++++---- README.md | 9 +++++---- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/README-en.md b/README-en.md index 2aa004ac..1fce22c2 100644 --- a/README-en.md +++ b/README-en.md @@ -38,12 +38,13 @@ Python DSL (Apple) ──compile──> JSON Config - **Implicit graph construction** — Operators declare input/output fields; engine infers DAG dependencies with transitive reduction - **Lock-free parallelism** — Independent operators in the DAG execute in parallel automatically - **Compile-time validation** — Dead code, missing fields, write-after-write detected before deployment -- **Embedded Lua** — Built-in Lua operators for lightweight custom computation. End-to-end overhead ~1.2-2x; isolated operator-level overhead varies by runtime and compute complexity (C++/LuaJIT ~3-5x, Java ~2-9x, Go ~6-17x) — write native operators for compute-heavy hot paths +- **Embedded Lua** — Built-in Lua operators for lightweight custom computation. pine-go defaults to [wangshu](https://github.com/Liam0205/wangshu) (pure-Go Lua 5.1 VM, NaN-boxing + arena GC); switch back to gopher-lua via `-tags=lua_gopher`. 
pine-java uses LuaJC (bytecode compilation), pine-cpp uses LuaJIT. End-to-end overhead ~1.2-2x; isolated operator-level overhead varies by runtime and compute complexity (C++/LuaJIT ~3-5x, Java ~2-9x, Go ~6-17x) — write native operators for compute-heavy hot paths - **Hot config reload** — Service automatically reloads engine config without downtime -- **Dynamic resources** — Background-refreshed in-memory resource manager with lock-free reads -- **White-box observability** — Operator-level traces, `/stats` endpoint, pluggable Prometheus interface +- **Dynamic resources** — Two-channel resource manager: **data-typed** (e.g. static dict / real-time feature store, snapshot-exported lock-free reads) + **handle-typed** (e.g. `redis_connection`, borrow lease + RAII teardown); background-refreshed +- **Redis cascade-safety** — The `redis_connection` resource exposes 5 cascade params (`{dial,read,write,pool}_timeout_ms` + `pool_size`); per-command metrics `pine_redis_command_*` with 4-state status (ok / timeout / pool_timeout / error), fail-on-error silent-degradation contract +- **White-box observability** — Operator-level traces; the `/stats` composite response includes `/stats.http` (request-level 4-state metrics) + `/stats.resources` (resource pool / probe / per-command 4-state categories); pluggable Prometheus interface - **Row/Column storage** — DataFrame supports both storage modes -- **Tri-engine consistency** — Go/Java/C++ engines verified via CI cross-validation for schema, DAG, execution, error, server, and metrics parity +- **Tri-engine consistency** — Go/Java/C++ engines verified byte-exactly via CI cross-validation (19 sections + tri-engine differential fuzz + daily ASan/TSan sanitized fuzz) - **Pine-C++ benchmark runtime** — Complete third runtime with operator parity, HTTP server (hot reload / graceful shutdown), ColumnFrame/RowFrame dual physical layouts, lazy OperatorInput projection, LuaJIT integration, metrics/resource parity ## Quick Start diff --git 
a/README.md b/README.md index 7c89788a..1245bcd0 100644 --- a/README.md +++ b/README.md @@ -38,12 +38,13 @@ Python DSL (Apple) ──compile──> JSON Config - **隐式构图** — 算子声明输入/输出字段,引擎自动推导 DAG 依赖并执行传递性归约 - **无锁并行** — DAG 中无依赖的算子自动并行执行 - **编译期校验** — 死代码、字段缺失、写后未读等问题在部署前拦截 -- **Lua 嵌入** — 内置 Lua 算子支持轻量自定义计算。端到端开销约 1.2-2x;隔离算子级开销随运行时与计算复杂度变化(C++/LuaJIT 约 3-5x、Java 约 2-9x、Go 约 6-17x),计算密集型热路径建议写原生算子 +- **Lua 嵌入** — 内置 Lua 算子支持轻量自定义计算。pine-go 默认 [wangshu](https://github.com/Liam0205/wangshu)(纯 Go Lua 5.1 VM,NaN-boxing + arena GC),可通过 `-tags=lua_gopher` 切回 gopher-lua;pine-java 用 LuaJC(字节码编译),pine-cpp 用 LuaJIT。端到端开销约 1.2-2x;隔离算子级开销随运行时与计算复杂度变化(C++/LuaJIT 约 3-5x、Java 约 2-9x、Go 约 6-17x),计算密集型热路径建议写原生算子 - **配置热加载** — 服务运行时自动无停机重载引擎配置 -- **动态资源** — 后台定时刷新的内存资源管理器,无锁读 -- **白盒可观测** — 算子级 trace、`/stats` 端点、可插拔 Prometheus 接口 +- **动态资源** — 双通道资源管理:**数据型**(如静态 dict / 实时 feature store,snapshot 导出后无锁读)+ **句柄型**(如 `redis_connection`,borrow 借用 + RAII 拆除);后台定时刷新 +- **Redis cascade-safety** — `redis_connection` 资源暴露 `{dial,read,write,pool}_timeout_ms` + `pool_size` 五参数,per-command 指标 `pine_redis_command_*`(4-state status:ok / timeout / pool_timeout / error),fail-on-error 静默降级契约 +- **白盒可观测** — 算子级 trace;`/stats` 组合响应含 `/stats.http`(请求级 4-state 指标)+ `/stats.resources`(资源池连接池/探针/per-command 4 状态分类);可插拔 Prometheus 接口 - **行存/列存可切换** — DataFrame 支持两种存储模式 -- **三引擎一致性** — Go/Java/C++ 引擎通过 CI 交叉验证保证 schema、DAG、执行结果、错误消息一致 +- **三引擎一致性** — Go/Java/C++ 引擎通过 CI 交叉验证保证 schema、DAG、执行结果、错误消息字节级一致(19 section cross-validate + 三引擎差分 fuzz + 每日 ASan/TSan sanitized fuzz) - **Pine-C++ 标杆运行时** — 完整第三运行时,内置算子与 Go/Java 完全对等、HTTP server(热加载/graceful shutdown)、ColumnFrame/RowFrame 双物理实现、OperatorInput lazy 投影、LuaJIT 集成、metrics/resource 对等 ## Quick Start From d22b1de9c81aa56d1d46decd65a677e4796f454c Mon Sep 17 00:00:00 2001 From: Liam Date: Thu, 25 Jun 2026 09:12:38 +0800 Subject: [PATCH 3/4] docs(readme): add Makefile / githooks / llmdoc + daily sanitized fuzz CI row MIME-Version: 1.0 Content-Type: text/plain; 
charset=UTF-8 Content-Transfer-Encoding: 8bit The README has grown out of sync with how dev work actually happens now. Three additions: 1. **Makefile section** at the top of the dev block. The top-level Makefile + pine-go/Makefile are the actual unified entry — CI and local share the exact verb sequence. Up to now scripts/ was the only surface readers saw, which understates the project's task plumbing. Cross-checked the listed verbs against `make` output: all exist. 2. **Local Git Hooks section**. .githooks/{pre-commit,pre-push} ships in-tree and is the source of three concrete dev ergonomics: staged-only format gate (no surprise overwrites), four-language lint on push, and the auto `--set-upstream` relay landed in pine #139 (absorbed from wangshu#24 / ctex-kit#888). Without docs, first-time contributors miss `core.hooksPath` and lose the lint gate. 3. **Daily sanitized fuzz** added to the CI list — promoted from weekly to daily in ef24382c (#109) and load-bearing for race / memory-bug surveillance separate from the per-push fuzz fast lane. Also added `llmdoc/` to the Documentation table since it is now the canonical AI-collaboration knowledge map and constantly referenced from issue comments. clang-format -Werror clarified as the actual C++ lint form (the bare "-Werror" was ambiguous). EN side kept structurally aligned. --- README-en.md | 35 +++++++++++++++++++++++++++++++++-- README.md | 35 +++++++++++++++++++++++++++++++++-- 2 files changed, 66 insertions(+), 4 deletions(-) diff --git a/README-en.md b/README-en.md index 1fce22c2..98749a09 100644 --- a/README-en.md +++ b/README-en.md @@ -157,8 +157,28 @@ pineapple/ ## Development +### Top-level Make Targets + +Cross-language fmt / lint / test / bench / codegen / version management is unified behind the top-level `Makefile` (with `pine-go/Makefile` for Go-specific work). CI and local dev share the same command sequence. 
+ +| Make target | Purpose | +|---|---| +| `make fmt` | Format all four languages (gofmt / google-java-format / clang-format / ruff) | +| `make lint` | Lint all four languages (incl. checkstyle `failOnViolation=true`, `-Werror`) | +| `make test` | Full test suite across runtimes | +| `make bench` | Default `pine_bench` tag | +| `make bench-cross-runtime` | Cross-engine fixture-driven benchmark (cgroup-isolated) | +| `make bench-lua-backends` | wangshu vs gopher-lua, same-host serial + benchstat | +| `make differential-fuzz` | Tri-engine differential fuzz | +| `make cross-validate` | Tri-engine consistency verification | +| `make codegen` | Generate `apple_generated/` + `doc/operators/` from pine-go Registry | +| `make codegen-check` | CI: codegen + `git diff --exit-code` to enforce artifact freshness | +| `make check-pr-ci` | Watch CI status of the current branch's PR (pre-push hook calls this) | + ### Scripts +`scripts/` holds the actual implementations behind the Make targets and can be invoked standalone: + | Script | Purpose | |--------|---------| | `scripts/go-test.sh` | Run all Go tests | @@ -168,6 +188,7 @@ pineapple/ | `scripts/go-bench.sh` | Go benchmarks | | `scripts/java-bench.sh` | Java benchmarks | | `scripts/bench-cross-runtime.sh` | Cross-engine HTTP server benchmark (fixture-driven, cgroup-isolated) | +| `scripts/bench-lua-backends.sh` | wangshu vs gopher-lua backend comparison (benchstat delta) | | `scripts/go-fuzz.sh` | Go fuzz testing | | `scripts/java-fuzz.sh` | Java fuzz testing | | `scripts/differential-fuzz.sh` | Tri-engine differential fuzzing (random pipelines, output diff) | @@ -178,16 +199,25 @@ pineapple/ | `scripts/render-dag.sh` | DAG visualization (`--backend go\|java`) | | `scripts/apple-compile.sh` | Compile Apple DSL to JSON | | `scripts/run-pipeline.sh` | One-shot pipeline execution | -| `scripts/bump-version.sh` | Synchronize version across all components | +| `scripts/bump-version.sh` | Synchronize version across all components 
(incl. pine-cpp `kVersion`) | +| `scripts/check-pr-ci.sh` | Watch CI status of the current branch's PR (pre-push hook invokes this) | + +### Local Git Hooks + +`.githooks/` ships with the repository; activate via `git config core.hooksPath .githooks` once after clone: + +- **`pre-commit`** — staged-only format gate (gofmt / clang-format / ruff); does not touch unstaged work +- **`pre-push`** — project-level lint (four-language fail-on-violation) + self-wrapped post-push CI watcher (auto-runs `check-pr-ci.sh` after the actual push) + auto `--set-upstream` relay (first-push of a new branch does not need a manual `-u`) ### CI Pipeline CI runs automatically on every push/PR: -- **Lint** — Go (golangci-lint), Java (checkstyle, failOnViolation=true), Python (ruff), C++ (-Werror) +- **Lint** — Go (golangci-lint), Java (checkstyle, failOnViolation=true), Python (ruff), C++ (clang-format -Werror) - **Test** — Full Go/Java/Apple/C++ test suites with coverage - **Sanitizer** — C++ ASan/UBSan smoke + ThreadSanitizer stress - **Fuzz** — Go/Java fuzz + tri-engine differential fuzzing +- **Daily sanitized fuzz** — Daily (12:00 UTC+8) ASan/TSan differential fuzz, 3000+2000 rounds, dedicated to race / memory-bug deep diagnostics (independent of the per-push fast lane) - **Benchmark** — Go/Java performance benchmarks - **Cross-validation** — Tri-engine schema/DAG/execution/error/server/metrics parity - **Codegen check** — Ensures generated code is in sync with source @@ -347,6 +377,7 @@ Highlights: | Operator development | [`doc/guide_operator-en.md`](doc/guide_operator-en.md) — Go operator development guide | | Third-party extensions | [`design_doc/12_distribution-en.md`](design_doc/12_distribution-en.md) — Add custom operators without modifying source | | API reference | [`doc/api-en.md`](doc/api-en.md) — HTTP endpoint documentation | +| LLM retrieval docs | [`llmdoc/`](llmdoc/) — Stable knowledge map for AI collaboration (architecture / decisions / reflections / index) | ## 
License diff --git a/README.md b/README.md index 1245bcd0..932275bf 100644 --- a/README.md +++ b/README.md @@ -172,8 +172,28 @@ pineapple/ - **Cross-validate**:全 section 接入,三引擎一致性验证 +### 开发任务入口(Makefile) + +跨四语言的 fmt / lint / test / bench / codegen / 版本管理统一通过顶层 `Makefile` + `pine-go/Makefile` 暴露,CI 与本地共用同一命令序列。常用 verb: + +| Make 目标 | 用途 | +|---|---| +| `make fmt` | 四语言格式化(gofmt / google-java-format / clang-format / ruff) | +| `make lint` | 四语言 lint(含 checkstyle `failOnViolation=true`、`-Werror`) | +| `make test` | 全引擎测试 | +| `make bench` | 默认 `pine_bench` tag | +| `make bench-cross-runtime` | 跨引擎 fixture 驱动 benchmark(cgroup 隔离) | +| `make bench-lua-backends` | wangshu vs gopher-lua 同机串行连跑 + benchstat | +| `make differential-fuzz` | 三引擎差分 fuzz | +| `make cross-validate` | 跨引擎一致性验证 | +| `make codegen` | 从 pine-go Registry 生成 `apple_generated/` + `doc/operators/` | +| `make codegen-check` | CI 用:codegen 后 `git diff --exit-code`,确保产物新鲜 | +| `make check-pr-ci` | watch 当前分支 PR 的 CI 状态(pre-push hook 也会自动调用) | + ### 常用脚本 +`scripts/` 下的脚本是 Make 目标的具体实现,可单独调用: + | 脚本 | 用途 | |------|------| | `scripts/go-test.sh` | Go 全量测试 | @@ -183,6 +203,7 @@ pineapple/ | `scripts/go-bench.sh` | Go 性能基准 | | `scripts/java-bench.sh` | Java 性能基准 | | `scripts/bench-cross-runtime.sh` | 跨引擎 HTTP server benchmark(fixture 驱动,cgroup 资源隔离) | +| `scripts/bench-lua-backends.sh` | wangshu vs gopher-lua 后端对比(benchstat delta) | | `scripts/go-fuzz.sh` | Go fuzz 测试 | | `scripts/java-fuzz.sh` | Java fuzz 测试 | | `scripts/differential-fuzz.sh` | 三引擎差异模糊测试(随机生成 pipeline 比对输出) | @@ -193,16 +214,25 @@ pineapple/ | `scripts/render-dag.sh` | DAG 可视化(`--backend go\|java`) | | `scripts/apple-compile.sh` | Apple DSL 编译为 JSON | | `scripts/run-pipeline.sh` | 单次执行 pipeline | -| `scripts/bump-version.sh` | 版本号同步更新 | +| `scripts/bump-version.sh` | 版本号同步更新(含 pine-cpp `kVersion`) | +| `scripts/check-pr-ci.sh` | watch 当前分支 PR 的 CI 状态(pre-push hook 自动调用) | + +### 本地 Git Hooks + +仓库内置 `.githooks/` 用 `git config core.hooksPath 
.githooks` 挂载即生效(首次 clone 后建议配一次): + +- **`pre-commit`** — staged-only 格式 gate(gofmt / clang-format / ruff),不动未 staged 改动 +- **`pre-push`** — 工程级 lint(四语言 fail-on-violation)+ 自包装 CI watch(push 完成后自动起 `check-pr-ci.sh` 等终态)+ 自动 `--set-upstream` 接力(首次 push 新分支无需手动 `-u`) ### CI 流水线 CI 在每次 push/PR 时自动运行: -- **Lint** — Go (golangci-lint)、Java (checkstyle, failOnViolation=true)、Python (ruff)、C++ (-Werror) +- **Lint** — Go (golangci-lint)、Java (checkstyle, failOnViolation=true)、Python (ruff)、C++ (clang-format -Werror) - **Test** — Go/Java/Apple/C++ 全量测试 + 覆盖率 - **Sanitizer** — C++ ASan/UBSan 冒烟 + ThreadSanitizer 高并发压测 - **Fuzz** — Go/Java fuzz + 三引擎差异模糊测试 +- **Daily sanitized fuzz** — 每日(北京时间 12:00)跑 ASan/TSan 加持的差分 fuzz 3000+2000 轮,专门面向 race / memory bug 的 deep-diagnostic(独立于每次 push 的 fast 路径) - **Benchmark** — Go/Java 性能基准 - **Cross-validation** — 三引擎 schema/DAG/执行/错误/server/metrics 一致性 - **Codegen check** — 确保生成代码与源码同步 @@ -363,6 +393,7 @@ def normalize_json(text): | 算子开发 | [`doc/guide_operator.md`](doc/guide_operator.md) — Go 算子开发指南 | | 第三方扩展 | [`design_doc/12_distribution.md`](design_doc/12_distribution.md) — 不修改源码添加自定义算子 | | API 参考 | [`doc/api.md`](doc/api.md) — HTTP 接口说明 | +| LLM 检索文档 | [`llmdoc/`](llmdoc/) — 面向 AI 协作的稳定知识地图(架构 / 决策 / 反思 / 索引) | ## License From 59fc37ff7a70b62cd59b7245569b4deecd4a2312 Mon Sep 17 00:00:00 2001 From: Liam Date: Thu, 25 Jun 2026 09:31:28 +0800 Subject: [PATCH 4/4] docs(readme): refresh benchmark tables with 2026-06-25 v0.10.9 numbers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Re-ran the full 14-fixture cross-runtime bench on the standard 2C/4G cgroup (10000 req × 16 conc) on the same machine that produced the previous 2026-06-11 numbers. v0.10 series picked up wangshu CallInto / GlobalsSlot fast paths, outputPool (#119), and Redis cascade-safety (#137) — re-baseline so the README reflects measured reality. Changes worth calling out: - Three calibrated fixtures now listed instead of one. 
Until now the README collapsed the calibrated family to a single row, hiding the itemlua variant entirely. itemlua (3000 Lua calls/request) is the boundary-dominated workload that anchors the perf-evolution-roadmap "calibration fact 2 — end-to-end dilution" finding; it deserves to show up. - C++ headline lift: 1.8x → 1.9x against Go/Java on calibrated. P50 60.8ms vs 122.3/117.7ms is a more legible framing than the QPS ratio. - Synthetic small/medium movements are all within ±10% run-to-run noise; the relative shape (Go highest at small, Java reverses on large_1000+) is unchanged. - Reproduce command now lists `make bench-cross-runtime` first. Source data: bench-results/report-20260625-090834.txt --- README-en.md | 44 +++++++++++++++++++++++++------------------- README.md | 46 ++++++++++++++++++++++++++-------------------- 2 files changed, 51 insertions(+), 39 deletions(-) diff --git a/README-en.md b/README-en.md index 98749a09..39933533 100644 --- a/README-en.md +++ b/README-en.md @@ -333,39 +333,45 @@ See `scripts/cross-validate.sh` for a complete production implementation. ## Benchmark -Cross-engine performance comparison (HTTP server mode, `scripts/bench-cross-runtime.sh`, 10000 requests × 16 concurrency, server cgroup-isolated to 2C/4G). `realistic_calibrated` is a production proxy fixture calibrated against real traffic; the rest are synthetic stress tests. +Cross-engine performance comparison (HTTP server mode, `scripts/bench-cross-runtime.sh`, 10000 requests × 16 concurrency, server cgroup-isolated to 2C/4G, re-measured 2026-06-25 / v0.10.9). `realistic_*_calibrated*` fixtures are production-proxy benchmarks calibrated against real traffic; the rest are synthetic stress tests. 
### Throughput (QPS) | Fixture | Go | Java | C++ | |---|---|---|---| -| small_010 (10 items) | 37078 | 5825 | 20794 | -| small_050 (50 items) | 26976 | 5201 | 17244 | -| small_100 (100 items) | 19585 | 4748 | 13904 | -| medium_0100 (100 items) | 12025 | 3681 | 8578 | -| medium_0500 (500 items) | 2921 | 2034 | 2938 | -| medium_1000 (1000 items) | 1446 | 1360 | 1647 | -| large_0100 (100 items) | 6395 | 2855 | 4855 | -| large_0500 (500 items) | 1439 | 1439 | 1671 | -| large_1000 (1000 items) | 728 | 917 | 902 | -| large_5000 (5000 items) | 142 | 212 | 174 | -| **realistic_calibrated (production proxy)** | **120** | **124** | **221** | +| small_010 (10 items) | 36298 | 6318 | 20756 | +| small_050 (50 items) | 27270 | 5336 | 17227 | +| small_100 (100 items) | 19658 | 4607 | 13812 | +| medium_0100 (100 items) | 12514 | 3589 | 8542 | +| medium_0500 (500 items) | 3026 | 1965 | 2941 | +| medium_1000 (1000 items) | 1513 | 1295 | 1656 | +| large_0100 (100 items) | 7243 | 3064 | 5120 | +| large_0500 (500 items) | 1684 | 1508 | 1773 | +| large_1000 (1000 items) | 825 | 966 | 951 | +| large_5000 (5000 items) | 155 | 213 | 175 | +| realistic_for_you | 483 | 303 | 349 | +| realistic_for_you_latency | 250 | 141 | 212 | +| **realistic_for_you_calibrated (production proxy)** | **121** | **127** | **237** | +| **realistic_for_you_calibrated_2c4g** | **121** | **124** | **224** | +| **realistic_for_you_calibrated_itemlua** | **127** | **126** | **233** | ### P50 Latency (ms) | Fixture | Go | Java | C++ | |---|---|---|---| -| small_010 | 0.3 | 2.0 | 0.6 | -| medium_0500 | 5.0 | 6.3 | 5.2 | -| large_1000 | 20.5 | 14.8 | 16.1 | -| large_5000 | 102.2 | 67.9 | 83.9 | -| **realistic_calibrated** | **123.6** | **121.9** | **65.0** | +| small_010 | 0.4 | 1.5 | 0.6 | +| medium_0500 | 4.9 | 6.8 | 5.3 | +| large_1000 | 18.2 | 14.3 | 15.3 | +| large_5000 | 94.3 | 68.6 | 83.4 | +| **realistic_for_you_calibrated** | **122.3** | **117.7** | **60.8** | +| **realistic_for_you_calibrated_itemlua** | 
**117.1** | **119.5** | **61.5** | Highlights: -- **C++ leads by ~1.8x on the production-calibrated scenario** (QPS 221 vs 120/124; P50 65ms vs ~122ms) — this is what the "benchmark runtime" positioning means +- **C++ leads by ~1.9x on production-calibrated workloads** (calibrated QPS 237 vs 121/127; P50 61ms vs 122/118ms) — this is what the "benchmark runtime" positioning means - Go has the highest throughput on synthetic small/medium fixtures (lowest lightweight-request overhead); Java's JIT hot-loop optimization wins at large row counts (large_1000+) -- Numbers evolve with versions. Reproduce with `scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`; reports land in `bench-results/` +- itemlua (3000 Lua calls/request, boundary-dominated shape) is statistically flat against calibrated across all three engines — confirms the "per-item boundary dominates + end-to-end dilution" calibration fact (see `llmdoc/memory/decisions/perf-evolution-roadmap.md`) +- Numbers evolve with versions. 
Reproduce with `make bench-cross-runtime` or `scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`; reports land in `bench-results/` ## Documentation diff --git a/README.md b/README.md index 932275bf..8194d751 100644 --- a/README.md +++ b/README.md @@ -349,39 +349,45 @@ def normalize_json(text): ## Benchmark -跨引擎性能对比(HTTP server 模式,`scripts/bench-cross-runtime.sh`,10000 请求 × 16 并发,server 以 2C/4G cgroup 隔离)。`realistic_calibrated` 为按真实流量校准的生产 proxy fixture,其余为合成压测。 +跨引擎性能对比(HTTP server 模式,`scripts/bench-cross-runtime.sh`,10000 请求 × 16 并发,server 以 2C/4G cgroup 隔离,2026-06-25 / v0.10.9 复测)。`realistic_*_calibrated*` 系列为按真实流量校准的生产 proxy fixture,其余为合成压测。 ### 吞吐量 (QPS) | Fixture | Go | Java | C++ | |---|---|---|---| -| small_010 (10 items) | 37078 | 5825 | 20794 | -| small_050 (50 items) | 26976 | 5201 | 17244 | -| small_100 (100 items) | 19585 | 4748 | 13904 | -| medium_0100 (100 items) | 12025 | 3681 | 8578 | -| medium_0500 (500 items) | 2921 | 2034 | 2938 | -| medium_1000 (1000 items) | 1446 | 1360 | 1647 | -| large_0100 (100 items) | 6395 | 2855 | 4855 | -| large_0500 (500 items) | 1439 | 1439 | 1671 | -| large_1000 (1000 items) | 728 | 917 | 902 | -| large_5000 (5000 items) | 142 | 212 | 174 | -| **realistic_calibrated (生产校准)** | **120** | **124** | **221** | +| small_010 (10 items) | 36298 | 6318 | 20756 | +| small_050 (50 items) | 27270 | 5336 | 17227 | +| small_100 (100 items) | 19658 | 4607 | 13812 | +| medium_0100 (100 items) | 12514 | 3589 | 8542 | +| medium_0500 (500 items) | 3026 | 1965 | 2941 | +| medium_1000 (1000 items) | 1513 | 1295 | 1656 | +| large_0100 (100 items) | 7243 | 3064 | 5120 | +| large_0500 (500 items) | 1684 | 1508 | 1773 | +| large_1000 (1000 items) | 825 | 966 | 951 | +| large_5000 (5000 items) | 155 | 213 | 175 | +| realistic_for_you | 483 | 303 | 349 | +| realistic_for_you_latency | 250 | 141 | 212 | +| **realistic_for_you_calibrated (生产校准)** | **121** | **127** | **237** | +| **realistic_for_you_calibrated_2c4g** | **121** | 
**124** | **224** | +| **realistic_for_you_calibrated_itemlua** | **127** | **126** | **233** | ### P50 延迟 (ms) | Fixture | Go | Java | C++ | |---|---|---|---| -| small_010 | 0.3 | 2.0 | 0.6 | -| medium_0500 | 5.0 | 6.3 | 5.2 | -| large_1000 | 20.5 | 14.8 | 16.1 | -| large_5000 | 102.2 | 67.9 | 83.9 | -| **realistic_calibrated** | **123.6** | **121.9** | **65.0** | +| small_010 | 0.4 | 1.5 | 0.6 | +| medium_0500 | 4.9 | 6.8 | 5.3 | +| large_1000 | 18.2 | 14.3 | 15.3 | +| large_5000 | 94.3 | 68.6 | 83.4 | +| **realistic_for_you_calibrated** | **122.3** | **117.7** | **60.8** | +| **realistic_for_you_calibrated_itemlua** | **117.1** | **119.5** | **61.5** | 要点: -- **生产校准场景下 C++ 领先约 1.8x**(QPS 221 vs 120/124;P50 65ms vs ~122ms),这是"标杆运行时"定位的体现 -- 合成 small/medium 场景 Go 吞吐最高(轻量请求路径开销最低);大行数场景(large_1000+)Java 的 JIT 热循环优化使其反超 -- 各引擎数字会随版本演进,复现方式:`scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`,报告落在 `bench-results/` +- **生产校准场景下 C++ 领先约 1.9x**(calibrated QPS 237 vs 121/127;P50 61ms vs 122/118ms),这是"标杆运行时"定位的体现 +- 合成 small/medium 场景 Go 吞吐最高(轻量请求路径开销最低);大行数场景(large_1000+)Java 的 JIT 热循环优化反超 +- itemlua(3000 调用/请求的 boundary-dominated 形状)与 calibrated 在三引擎都统计持平,符合"per-item 边界主导 + 端到端稀释"的校准事实(详见 `llmdoc/memory/decisions/perf-evolution-roadmap.md`) +- 各引擎数字会随版本演进,复现方式:`make bench-cross-runtime` 或 `scripts/bench-cross-runtime.sh --requests 10000 --concurrency 16`,报告落在 `bench-results/` ## 文档