BXP - Developer Guide

For end-user documentation see resources/readme.md
For architecture diagrams see docs/architecture.md

Part 1 - Getting Started
Part 2 - Architecture and Internals

Part 1 - Getting Started

VS Code setup

Install these extensions for a productive experience:

Extension	ID	Purpose
Zig Language	`ziglang.vscode-zig`	Zig language support: syntax highlighting, ZLS integration, build tasks
Rainbow CSV	`mechatroner.rainbow-csv`	Column-aware CSV viewer - helpful when reading broker exports
JSON5	`blueglassblock.better-json5`	Syntax highlighting for `JSON5` config files
Mermaid preview	`bierner.markdown-mermaid`	Renders Mermaid diagrams in Markdown preview (useful for `architecture.md`)
Mermaid syntax	`bpruitt-goddard.mermaid-markdown-syntax-highlighting`	Syntax highlighting for Mermaid diagrams (useful for `architecture.md`)

Recent versions of the ziglang.vscode-zig extension bundle ZLS (Zig Language Server) and the Zig toolchain, so completions, go-to-definition, and inline error diagnostics work out of the box.

Verify Zig language version

Tool	Version	Notes
Zig	0.15.2	Use this exact version for development; `build.zig.zon` itself only requires `minimum_zig_version = "0.15.0"`

No other runtime dependencies. bxp-core fetches sunrise (datetime library) automatically via zig build on first run.

In VS Code terminal:

zig version
# expected: 0.15.2

Claude Code setup

BXP development works well with Claude Code. The monorepo ships CLAUDE.md files at the root and in bxp-cli/ and bxp-core/. Claude Code loads them automatically and picks up the project conventions from them.

Skills to use

Install Zig skills from https://github.com/rudedogg/zig-skills

Skill	When to use
`/zig`	Before writing any new Zig code - loads Zig 0.15.2 API patterns
`/zig-build`	Compile the project and get structured error analysis
`/zig-check`	Fast syntax/type check without full build
`/zig-test`	Run the test suite and analyze failures

Repository layout

bxp/                            # monorepo root (git root)
├── bxp-cli/                    # user-facing CLI binary
│   ├── src/
│   │   ├── main.zig            # arg parsing, config loading, dispatch
│   │   └── pipeline.zig        # processBroker(), xlsxPrePass(), Output, SectionStats
│   ├── build.zig               # imports bxp-core modules by name
│   └── build.zig.zon           # depends on bxp-core (path dep)
├── bxp-core/                   # internal shared library (no binary)
│   ├── src/
│   │   ├── csv.zig             # RFC 4180 CSV parser
│   │   ├── xlsx.zig            # .xlsx → CSV (ZIP+XML)
│   │   ├── expr.zig            # expression evaluator
│   │   ├── config.zig          # JSON5 config loader
│   │   ├── json.zig            # JSON array-of-objects → row representation
│   │   └── json5.zig           # JSON5 preprocessor (comments, unquoted keys, ...)
│   ├── build.zig               # exports named Zig modules
│   └── build.zig.zon           # depends on sunrise (url dep, auto-fetched)
├── datasets/                   # anonymized sample data + expected outputs
│   └── <template_id>/
│       ├── sample.csv          # .csv or .xlsx (then .csv is intermediate)
│       ├── sample.csvx         # final output file
│       └── sample.expected     # expected output - test.sh regression baseline (diff with csvx)
├── docs/
│   ├── devel.md                # this file
│   └── architecture.md         # Bird's-eye view, data flow, execution diagrams
├── resources/
│   ├── bxp-cli.examples.json   # example config (released alongside binary)
│   └── readme.md               # end-user documentation (released alongside binary)
├── scripts/
│   ├── test.sh                 # test suite (unit + regression)
│   └── release.sh              # cross-compile + package
└── README.md                   # basic readme about project

Clone and build

# Clone this repository
git clone https://github.com/zaxified/bxp.git

# Build bxp-cli (fetches dependencies on first run)
cd ./bxp/bxp-cli
zig build

# Run
./zig-out/bin/bxp-cli --help

Running bxp-cli without arguments processes every template defined in bxp-cli.json in the current working directory.
The typical dev workflow:

# From the monorepo root
./bxp-cli/zig-out/bin/bxp-cli --config ./datasets/anycoin_to_wealthfolio/sample.json --debug

Run the test suite

# From the monorepo root - runs unit tests + all regression tests
bash scripts/test.sh

The test script:

1. Runs zig build test in bxp-core (unit tests for csv.zig, expr.zig, json5.zig).
2. Builds bxp-cli.
3. Iterates every datasets/<id>/ directory, runs bxp-cli against the sample inputs, and diffs the output against sample.expected.

Individual unit tests only:

cd bxp-core && zig build test

Part 2 - Architecture and Internals

See docs/architecture.md for visual diagrams.

Design philosophy

BXP is a configuration-driven ETL micro-tool. The core principle is:

Adding a new data source = writing a JSON5 template. No code, no recompilation.

Consequences of this design:

- All broker-specific logic lives in bxp-cli.json (conversion_templates section).
- bxp-core is a generic engine: CSV/XLSX parser, expression evaluator, config loader.
- bxp-cli is a thin orchestrator: reads config, finds files, calls the engine.
- The expression language is intentionally limited: it handles per-row transformations, not general-purpose computation.

Package dependency graph

  bxp-cli  ── path dep ──►  bxp-core  ── url dep ──►  sunrise
  (binary)                  (library)                 (datetime)

bxp-core is referenced as a local path dependency (../bxp-core) in bxp-cli/build.zig.zon - no network fetch needed during development. sunrise is a URL dependency fetched by Zig's package manager on first build.

bxp-core modules

Module	File	Responsibility
`csv`	`csv.zig`	RFC 4180 parser. `splitRecords()` slices raw content; `splitFields()` unquotes fields. Intentional deviation: leading/trailing whitespace trimmed (broker exports pad fields).
`xlsx`	`xlsx.zig`	Converts `.xlsx` to intermediate `.csv`. Reads ZIP+XML, handles shared strings, formula results, dates (via `styles.xml` numFmtId). Max file size 10 MB.
`expr`	`expr.zig`	Expression evaluator. Recursive-descent parser → evaluator. Per-row `Context` holds field values, ticker map, lookup table. `eval()` returns `Value` (number/string/bool); `evalString()` coerces to string.
`config`	`config.zig`	Reads `bxp-cli.json` via `json5.zig` preprocessor then `std.json`. Returns `Config` owning all heap memory. `BrokerConfig.validate()` checks semantic constraints.
`json`	`json.zig`	Reads a JSON array-of-objects into a flat row representation. Builds a union of all keys across all objects; fills missing keys with empty string.
`json5`	`json5.zig`	Single-pass tokenizer that converts JSON5 → standard JSON. Strips comments, converts unquoted keys, removes trailing commas, normalizes single-quoted strings.
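To make the csv row above concrete, here is a minimal sketch of unquoting a single field while also trimming surrounding whitespace. The helper name unquoteField and its signature are assumptions for illustration, not the actual splitFields() implementation.

```zig
const std = @import("std");

/// Hypothetical helper: trims padding, removes surrounding double quotes,
/// and collapses RFC 4180 escaped quotes ("") into a single quote.
/// The caller owns the returned slice.
fn unquoteField(alloc: std.mem.Allocator, raw: []const u8) ![]u8 {
    // Intentional deviation from RFC 4180: strip leading/trailing whitespace.
    const t = std.mem.trim(u8, raw, " \t");
    if (t.len >= 2 and t[0] == '"' and t[t.len - 1] == '"') {
        return std.mem.replaceOwned(u8, alloc, t[1 .. t.len - 1], "\"\"", "\"");
    }
    return alloc.dupe(u8, t);
}

test "unquoteField sketch" {
    const alloc = std.testing.allocator;
    const field = try unquoteField(alloc, "  \"say \"\"hi\"\"\"  ");
    defer alloc.free(field);
    try std.testing.expectEqualStrings("say \"hi\"", field);
}
```

Trimming before unquoting means padding outside the quotes is dropped, while whitespace inside a quoted field is preserved.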

bxp-cli internals

main.zig - entry point:

1. Parses --config, --template, --data, --debug, --quiet, --fresh, --version flags.
2. Validates file paths (rejects shell metacharacters, limits ../ depth).
3. Loads and validates all templates in config (config.validate()).
4. Calls pipeline.xlsxPrePass() for any templates that reference .xlsx files.
5. Calls pipeline.processBroker() for each selected template.
6. Exits with code 0 (success), 1 (error), or 2 (warnings).
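The path validation step could look roughly like the sketch below. The function name, the set of rejected characters, and the ../ limit are all assumptions chosen for illustration; the real values live in main.zig.

```zig
const std = @import("std");

const PathError = error{ ShellMetacharacter, TooManyParentRefs };

// Hypothetical limit and character set, not taken from main.zig.
const max_parent_refs = 3;
const shell_metachars = ";|&$`<>*?!\"'\n";

/// Rejects paths containing shell metacharacters or too many "../" segments.
fn validatePath(path: []const u8) PathError!void {
    if (std.mem.indexOfAny(u8, path, shell_metachars) != null) {
        return error.ShellMetacharacter;
    }
    if (std.mem.count(u8, path, "../") > max_parent_refs) {
        return error.TooManyParentRefs;
    }
}

test "validatePath sketch" {
    try validatePath("./datasets/anycoin_to_wealthfolio/sample.json");
    try std.testing.expectError(error.ShellMetacharacter, validatePath("a.json; rm -rf x"));
}
```

Returning a distinct error per failure lets the caller print a specific message before exiting with code 1.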

pipeline.zig - processing engine:

xlsxPrePass() - iterates all templates with xlsx_sheet defined, converts each .xlsx file to an intermediate .csv. Templates sharing the same data_dir share the extraction pass (each file extracted once).
processBroker() - the main processing loop (intentionally monolithic):
1. Reads input files (CSV, JSON, or intermediate CSV from xlsx pre-pass).
2. Runs pre_pass if defined: one full iteration over all rows building a lookup map.
3. Main loop: evaluates input_schema expressions, matches row_rules, renders output_schema to produce output rows.
4. Writes RFC 4180-compliant CSV to .csvx output files.
Output - thin wrapper around stdout that respects --quiet and --debug flags.
SectionStats - accumulates warning/error counts across templates.

Two-pass processing pipeline

Input file (CSV/XLSX/JSON)
        │
        ▼
[xlsx_prepass]  ← if xlsx_sheet defined  → intermediate .csv
        │
        ▼
[pre_pass]      ← optional: full scan    → lookup table (keyed by expression)
        │
        ▼
[main loop - per row]
  1. Evaluate input_schema   → $variables
  2. Match row_rules         → set $action (+ overrides)
  3. Render output_schema    → output row
  4. Write to .csvx

A single input row can produce 0, 1, or N output rows, depending on the row_rules entry it matches:
- rows: [] skips the row silently.
- rows: [{...}, {...}] emits two output rows from one input row.
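The fan-out behaviour can be sketched as follows. This is a simplified model, not the processBroker() code: the Rule struct and matchRules() are hypothetical, a plain string comparison stands in for the when expression, and first-match-wins ordering is an assumption suggested by the catch-all rule in the template example.

```zig
const std = @import("std");

// Hypothetical, simplified rule: `match_type` stands in for the `when`
// expression (null = catch-all) and `actions` for the `rows` array.
const Rule = struct {
    match_type: ?[]const u8,
    actions: []const []const u8,
};

/// Returns the actions of the first matching rule, or null if no rule matches.
fn matchRules(rules: []const Rule, row_type: []const u8) ?[]const []const u8 {
    for (rules) |rule| {
        const wanted = rule.match_type orelse return rule.actions;
        if (std.mem.eql(u8, wanted, row_type)) return rule.actions;
    }
    return null;
}

test "fan-out sketch" {
    const rules = [_]Rule{
        .{ .match_type = "buy", .actions = &.{"BUY"} }, // one output row
        .{ .match_type = "sell", .actions = &.{ "SELL", "FEE" } }, // two output rows
        .{ .match_type = null, .actions = &.{} }, // silent skip
    };
    try std.testing.expectEqual(@as(usize, 2), matchRules(&rules, "sell").?.len);
    try std.testing.expectEqual(@as(usize, 0), matchRules(&rules, "dividend").?.len);
}
```

A null result (no rule matched at all) is the case that row_rules_debug_missing is meant to surface.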

Expression evaluator (expr.zig)

The evaluator is a hand-written recursive-descent parser. Operator precedence (high → low):

unary -  →  * /  →  & (concat)  →  + -  →  = != < > <= >=  →  AND  →  OR

How to add a new function: see Adding a new built-in function below.
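As an illustration of how a recursive-descent parser encodes the precedence chain above, the sketch below evaluates only the numeric subset (unary -, * /, + -). It is not the expr.zig implementation: the Parser type and evalNumeric() are hypothetical, and string concatenation, comparisons, AND/OR, fields, and functions are left out. Each precedence level is one function that calls the next-tighter level.

```zig
const std = @import("std");

const ParseError = error{ UnexpectedToken, InvalidNumber };

const Parser = struct {
    src: []const u8,
    pos: usize = 0,

    /// Skips spaces and returns the next byte without consuming it.
    fn peek(self: *Parser) ?u8 {
        while (self.pos < self.src.len and self.src[self.pos] == ' ') self.pos += 1;
        return if (self.pos < self.src.len) self.src[self.pos] else null;
    }

    // Loosest level in this subset: + and - (left-associative).
    fn parseAdditive(self: *Parser) ParseError!f64 {
        var left = try self.parseMultiplicative();
        while (self.peek()) |c| {
            if (c != '+' and c != '-') break;
            self.pos += 1;
            const right = try self.parseMultiplicative();
            left = if (c == '+') left + right else left - right;
        }
        return left;
    }

    // Tighter level: * and / (left-associative).
    fn parseMultiplicative(self: *Parser) ParseError!f64 {
        var left = try self.parseUnary();
        while (self.peek()) |c| {
            if (c != '*' and c != '/') break;
            self.pos += 1;
            const right = try self.parseUnary();
            left = if (c == '*') left * right else left / right;
        }
        return left;
    }

    // Tightest level: unary minus, then a number literal.
    fn parseUnary(self: *Parser) ParseError!f64 {
        const c = self.peek() orelse return error.UnexpectedToken;
        if (c == '-') {
            self.pos += 1;
            return -(try self.parseUnary());
        }
        return self.parseNumber();
    }

    fn parseNumber(self: *Parser) ParseError!f64 {
        const start = self.pos;
        while (self.pos < self.src.len) : (self.pos += 1) {
            const ch = self.src[self.pos];
            if (!std.ascii.isDigit(ch) and ch != '.') break;
        }
        if (start == self.pos) return error.UnexpectedToken;
        return std.fmt.parseFloat(f64, self.src[start..self.pos]) catch return error.InvalidNumber;
    }
};

/// Evaluates a numeric expression and rejects trailing input.
fn evalNumeric(src: []const u8) ParseError!f64 {
    var p = Parser{ .src = src };
    const v = try p.parseAdditive();
    if (p.peek() != null) return error.UnexpectedToken;
    return v;
}

test "precedence sketch" {
    try std.testing.expectApproxEqAbs(@as(f64, 7), try evalNumeric("1 + 2 * 3"), 1e-9);
}
```

Adding another level, such as the & concatenation operator, would mean inserting one more function between parseMultiplicative and parseAdditive.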

Key types:

pub const Value = union(enum) {
    number: f64,
    string: []const u8,
    boolean: bool,
};

pub const Context = struct {
    fields: []const []const u8,                 // raw CSV field values for current row
    col_index: std.StringHashMap(usize),        // header name → field index
    ticker_map: std.StringHashMap([]const u8),
    lookup_table: ?*LookupTable,
    alloc: std.mem.Allocator,
    decimal_sep_in: u8,                         // '.' or ','
    quote_out: u8,                              // output quoting character
};

Type coercions:

Empty string → 0 in numeric context.
Any non-empty string → true in boolean context.
Numbers are formatted as strings: trailing .0 stripped ("99.00" → "99").
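The coercion rules above can be sketched with three small helpers. The helper names are hypothetical, and several details are assumptions not stated in this guide: booleans map to 1/0, non-zero numbers are truthy, non-empty strings are parsed as floats in numeric context, and numbers are rendered with two decimals before trailing zeros are stripped.

```zig
const std = @import("std");

// Same shape as the Value union shown under "Key types".
const Value = union(enum) {
    number: f64,
    string: []const u8,
    boolean: bool,
};

/// Numeric context: the empty string counts as 0.
fn toNumber(v: Value) error{NotANumber}!f64 {
    switch (v) {
        .number => |n| return n,
        .boolean => |b| return if (b) 1.0 else 0.0, // assumption
        .string => |s| {
            if (s.len == 0) return 0.0;
            return std.fmt.parseFloat(f64, s) catch return error.NotANumber;
        },
    }
}

/// Boolean context: any non-empty string is true.
fn toBool(v: Value) bool {
    return switch (v) {
        .boolean => |b| b,
        .number => |n| n != 0, // assumption
        .string => |s| s.len != 0,
    };
}

/// String context: formats with two decimals (assumption), then strips
/// trailing zeros and a dangling decimal point, so "99.00" becomes "99".
fn formatNumber(buf: []u8, n: f64) ![]const u8 {
    const s = try std.fmt.bufPrint(buf, "{d:.2}", .{n});
    var end = s.len;
    while (end > 0 and s[end - 1] == '0') end -= 1;
    if (end > 0 and s[end - 1] == '.') end -= 1;
    return s[0..end];
}

test "coercion sketch" {
    var buf: [32]u8 = undefined;
    try std.testing.expectEqualStrings("99", try formatNumber(&buf, 99.0));
    try std.testing.expect(toBool(.{ .string = "x" }));
}
```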

Configuration system (config.zig + json5.zig)

Config loading sequence:

bxp-cli.json  →  json5.preprocess()  →  std.json.parseFromSlice()  →  Config struct

json5.zig is a pure preprocessor: it only transforms text, and its output is always valid JSON that the standard library parser consumes. The JSON5 features it handles (comments, trailing commas, unquoted keys, single-quoted strings) are therefore supported without a custom JSON parser.
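To illustrate the text-only preprocessing idea, the sketch below handles just one of those features: removing // line comments that sit outside double-quoted strings. The function name and the caller-provided output buffer are assumptions for the example; the real json5.zig covers more cases in its single pass.

```zig
const std = @import("std");

/// Copies `src` into `out`, dropping `//` line comments that appear outside
/// double-quoted strings. `out` must be at least `src.len` bytes long.
fn stripLineComments(src: []const u8, out: []u8) []const u8 {
    std.debug.assert(out.len >= src.len);
    var n: usize = 0;
    var i: usize = 0;
    var in_string = false;
    while (i < src.len) : (i += 1) {
        const c = src[i];
        if (in_string) {
            out[n] = c;
            n += 1;
            if (c == '\\' and i + 1 < src.len) {
                // Copy the escaped character verbatim so \" does not end the string.
                i += 1;
                out[n] = src[i];
                n += 1;
            } else if (c == '"') {
                in_string = false;
            }
            continue;
        }
        if (c == '/' and i + 1 < src.len and src[i + 1] == '/') {
            // Skip to the end of the line but keep the newline itself.
            while (i < src.len and src[i] != '\n') i += 1;
            if (i < src.len) {
                out[n] = '\n';
                n += 1;
            }
            continue;
        }
        if (c == '"') in_string = true;
        out[n] = c;
        n += 1;
    }
    return out[0..n];
}

test "stripLineComments sketch" {
    var buf: [64]u8 = undefined;
    const out = stripLineComments("{\"a\": 1, // note\n\"b\": \"x//y\"}", &buf);
    try std.testing.expectEqualStrings("{\"a\": 1, \n\"b\": \"x//y\"}", out);
}
```

Because the output is ordinary JSON text, it can be handed straight to std.json.parseFromSlice() as in the loading sequence above.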

Config owns all heap-allocated strings. Call cfg.deinit() to free everything. BrokerConfig (one per template) holds the parsed template fields, pre_pass config, input/output schemas, and row rules.

Memory model

Two arena allocators are used during processing:

Allocator	Lifetime	Owns
`file_alloc` (ArenaAllocator)	Reset after each input file	File content, parsed rows, expression results
`line_alloc` (ArenaAllocator)	Reset after each row	Per-row expression evaluation scratch space

The root GPA (std.heap.DebugAllocator) catches leaks in debug builds.
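The two-arena pattern can be sketched as below. This is a simplified illustration, not the pipeline code: processFiles() is a hypothetical function that only counts non-empty lines, and in-memory strings stand in for input files. It shows the reset points described in the table.

```zig
const std = @import("std");

/// Counts non-empty lines across all "files", using one arena per lifetime.
fn processFiles(backing: std.mem.Allocator, files: []const []const u8) !usize {
    var file_arena = std.heap.ArenaAllocator.init(backing);
    defer file_arena.deinit();
    var line_arena = std.heap.ArenaAllocator.init(backing);
    defer line_arena.deinit();

    var rows: usize = 0;
    for (files) |content| {
        // File-lifetime data: a private copy of the file content.
        const owned = try file_arena.allocator().dupe(u8, content);
        var it = std.mem.splitScalar(u8, owned, '\n');
        while (it.next()) |line| {
            if (line.len == 0) continue;
            // Row-lifetime scratch space, released wholesale after this row.
            const scratch = try std.fmt.allocPrint(line_arena.allocator(), "row:{s}", .{line});
            std.debug.assert(scratch.len == line.len + 4);
            rows += 1;
            _ = line_arena.reset(.retain_capacity);
        }
        // Everything allocated for this file is released in one step.
        _ = file_arena.reset(.retain_capacity);
    }
    return rows;
}

test "arena sketch" {
    const files = [_][]const u8{ "a,b\nc,d\n", "e,f\n" };
    try std.testing.expectEqual(@as(usize, 3), try processFiles(std.testing.allocator, &files));
}
```

Resetting with .retain_capacity keeps the arena's buffers for reuse, which avoids reallocating for every row. Passing a leak-checking allocator as the backing allocator, as the test does, is what makes leaks visible.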

Adding a new conversion template

No code changes required. Add an entry to bxp-cli.json:

"broker_to_tracker": {
  "data_dir": "../data/broker_to_tracker",
  "file_pattern_in": ".csv",
  "ticker_map": {},
  "input_schema": {
    "$date":      "DATE_CONVERT([Date], 'DD/MM/YYYY', 'YYYY-MM-DD')",
    "$ticker":    "TICKER([Symbol])",
    "$quantity":  "[Quantity]",
    "$unitprice": "PRICE_VALUE([Price])",
    "$currency":  "PRICE_CURRENCY([Price])",
    "$fee":       "[Fee]",
    "$amount":    "[Total]"
  },
  "row_rules_debug_missing": true,                  // false if all rows handled
  "row_rules": [
    { "when": "[Type] = 'buy'",  "rows": [ { "$action": "'BUY'" } ] },
    { "when": "[Type] = 'sell'", "rows": [ { "$action": "'SELL'" } ] },
    { "when": "1",               "rows": [] }       // skip everything else
  ],
  "output_schema": {
    "date":         "$date",
    "symbol":       "$ticker",
    "quantity":     "$quantity",
    "activityType": "$action",
    "unitPrice":    "$unitprice",
    "currency":     "$currency",
    "fee":          "$fee",
    "amount":       "$amount"
  }
}

Tips:

Start with "row_rules_debug_missing": true and run with --debug to see which rows are not matched by any rule.
Use [ColumnName] to reference raw CSV columns by header name.
Use PRICE_VALUE / PRICE_CURRENCY for columns like "24.00 CZK" or "$100.50".
pre_pass is needed when values from one row are needed in another (e.g. AnyCoin pairs a trade payment row with a trade fill row via Order ID).

Adding a new built-in function

Define the function in bxp-core/src/expr.zig:
- Find the evalFunc() helper (called from the parser when a function name is recognized).
- Add a new if (std.mem.eql(u8, name, "MY_FUNC")) { ... } branch.
- Functions receive already-evaluated Value arguments.
- Return a Value or propagate an error.
Document it in the expression reference table in bxp-cli/CLAUDE.md (and resources/readme.md if user-facing).

Add unit tests inline in expr.zig:

test "MY_FUNC basic" {
    // ... uses std.testing.expectEqualStrings / expectApproxEqAbs
}

Run tests:

cd bxp-core && zig build test --summary all
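Put together, the steps above might look like the following sketch. It is a self-contained stand-in rather than the real expr.zig: this evalFunc() signature, the EvalError set, and the behaviour chosen for MY_FUNC (absolute value) are assumptions for illustration only.

```zig
const std = @import("std");

// Same shape as the Value union in expr.zig.
const Value = union(enum) {
    number: f64,
    string: []const u8,
    boolean: bool,
};

const EvalError = error{ UnknownFunction, BadArgument };

/// Simplified, hypothetical stand-in for the evalFunc() helper.
/// Arguments arrive already evaluated.
fn evalFunc(name: []const u8, args: []const Value) EvalError!Value {
    if (std.mem.eql(u8, name, "MY_FUNC")) {
        // MY_FUNC(x): absolute value of a numeric argument (illustrative only).
        if (args.len != 1 or args[0] != .number) return error.BadArgument;
        return .{ .number = @abs(args[0].number) };
    }
    return error.UnknownFunction;
}

test "MY_FUNC sketch" {
    const args = [_]Value{.{ .number = -2.5 }};
    const v = try evalFunc("MY_FUNC", &args);
    try std.testing.expectApproxEqAbs(@as(f64, 2.5), v.number, 1e-9);
}
```

Checking the argument count and types at the top of the branch keeps malformed template expressions from reaching the function body.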

Testing

# run test
./scripts/test.sh

# what the script runs
├── bxp-core unit tests  (zig build test in bxp-core)
│   ├── csv.zig   - all test cases
│   ├── expr.zig  - all test cases
│   └── json5.zig - all test cases
└── bxp-cli regression tests
    └── datasets/<id>/  - diff output vs sample.expected

# ... output ...
Running unit tests (bxp-core)...

Building bxp-cli...

  [anycoin_to_wealthfolio]                         PASS
  [revolutx_to_wealthfolio]                        PASS
  [trading212_to_wealthfolio]                      PASS
  [xtb1_cash_to_wealthfolio]                       PASS
  [xtb1_closed_to_wealthfolio]                     PASS
  [xtb2_cash_to_wealthfolio]                       PASS
  [xtb2_closed_to_wealthfolio]                     PASS

Results: 7 passed, 0 failed
All tests passed.

Adding a regression test:
Place sample.csv (or .xlsx) + sample.expected + sample.json in datasets/<template_id>/.
The test script picks them up automatically.

Anonymizing test data:
Before committing .csv or .xlsx files to datasets/, strip real account details and any other confidential information.

Release process

# run release
./scripts/release.sh

The release script cross-compiles bxp-cli for selected targets:

Target	Output
`x86_64-linux-gnu`	`bxp-cli-linux-x86_64`
`aarch64-macos`	`bxp-cli-macos-aarch64`
`x86_64-windows`	`bxp-cli-windows-x86_64.exe`

Outputs are placed in releases/.
