
# Case Study: SynthGL - 4-9x Faster GL Exports

## Company

SynthGL generates synthetic accounting data for fintech QA teams and AI model training. Integration teams at companies like Plaid, Ramp, and Mercury use it to test accounting connectors against deterministic general ledger (GL) data - journal entries, trial balances, financial statements - without touching production systems.

## The Workload

SynthGL exports GL journal entries to Excel workbooks. Each row is a 14-column journal line: transaction IDs, dates, account codes, debit/credit amounts, memos, and policy metadata. A typical export spans 5,000 to 50,000 rows.

These exports run continuously in CI pipelines, so Excel I/O speed directly shapes developer feedback loops.

Data profile: 14 columns per row (mixed types: strings, floats, dates, IDs). Real accounting data with realistic value distributions.
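A journal line of this shape can be sketched as a plain Python structure. This is an illustrative stand-in only: the field names and values below are hypothetical, not SynthGL's actual schema.

```python
from dataclasses import astuple, dataclass
from datetime import date


@dataclass
class JournalLine:
    """Illustrative 14-column GL journal line (field names are hypothetical)."""
    line_id: str
    entry_id: str
    txn_date: date
    account_code: str
    account_name: str
    debit: float
    credit: float
    currency: str
    memo: str
    department: str
    cost_center: str
    policy_tag: str
    created_by: str
    source_system: str


line = JournalLine(
    "GL-000001", "JE-0001", date(2024, 1, 31), "1000", "Cash",
    1250.00, 0.00, "USD", "Monthly SaaS revenue", "Sales",
    "CC-100", "rev-rec", "synthgl", "billing",
)

# Flatten to a 14-element mixed-type row, the shape a spreadsheet
# append call would consume.
row = list(astuple(line))
```

The mixed types (strings, a `date`, floats) are what make these rows representative of real accounting exports rather than uniform numeric data.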

## The Challenge

SynthGL relied on openpyxl for all Excel I/O. Below 1,000 rows, performance was fine. Larger test datasets exposed the bottleneck:

- A 50,000-row GL export took 2.9s to write and 4.6s to read back for validation
- The test suite generates dozens of bundles per run, making Excel I/O a significant share of total CI time
- Reading a workbook to verify exported data took longer than generating the data itself

## The Migration

SynthGL replaced openpyxl with WolfXL in one import change:

```diff
- from openpyxl import load_workbook, Workbook
+ from wolfxl import load_workbook, Workbook
```

For write-heavy paths, they adopted WolfXL's `append()` API, which buffers rows in Python and flushes them to Rust in a single call at save time:

```python
from wolfxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "GL Journal"

ws.append(["GLID", "EntryID", "TxnDate", "AccountCode", "Debit", "Credit", ...])
for line in journal_lines:
    ws.append([line.line_id, line.entry_id, line.date, line.account_code, ...])

wb.save("gl_export.xlsx")
```

For reads, `iter_rows(values_only=True)` pulls all values in a single Rust call:

```python
wb = load_workbook("gl_export.xlsx")
ws = wb[wb.sheetnames[0]]
for row in ws.iter_rows(values_only=True):
    validate(row)
wb.close()
```
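The `validate()` call above is SynthGL's own code, which the case study does not show. A minimal stand-in that checks the row shape and a basic accounting invariant might look like this (column positions and rules are assumptions for illustration):

```python
def validate(row):
    """Hypothetical row check: 14 columns, non-negative amounts,
    and each line carries a debit or a credit, but not both."""
    assert len(row) == 14, f"expected 14 columns, got {len(row)}"
    debit, credit = row[4], row[5]  # column positions are illustrative
    assert debit >= 0 and credit >= 0, "amounts must be non-negative"
    assert not (debit > 0 and credit > 0), "line is both debit and credit"


# A well-formed line passes silently.
validate(["GL-1", "JE-1", "2024-01-31", "1000",
          1250.00, 0.00, "USD", "memo", "Sales", "CC-100",
          "rev-rec", "synthgl", "billing", "saas"])
```

Because `iter_rows(values_only=True)` yields plain tuples rather than cell objects, a check like this touches no spreadsheet API at all, which is what keeps the read path fast.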

The rest of the codebase worked unchanged. Cell-access patterns like `ws.cell()` and `ws["A1"].value` kept working as before.

## Results

Benchmarked on Apple M-series, Python 3.12, mean of 5 runs per scale. Data generated by SynthGL's BundleGenerator using the saas company profile.
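A mean-of-N timing harness along these lines is straightforward to write with the standard library. This is a generic sketch, not SynthGL's actual benchmark script; the workload lambda stands in for whatever write or read path you want to measure.

```python
import statistics
import time


def bench(fn, rounds=5):
    """Run fn `rounds` times and return the mean wall-clock seconds."""
    times = []
    for _ in range(rounds):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times)


# Example: time a trivial workload standing in for an Excel write.
mean_s = bench(lambda: sum(range(100_000)), rounds=5)
```

Using `time.perf_counter()` rather than `time.time()` matters here: it is monotonic and has the highest available resolution, so sub-millisecond timings like the 1,000-row runs below are meaningful.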

### Write Performance

| Scale | Cells | openpyxl | WolfXL | Speedup |
|-------------|---------|--------|--------|------|
| 1,000 rows | 14,000 | 0.070s | 0.015s | 4.7x |
| 5,000 rows | 70,000 | 0.278s | 0.076s | 3.7x |
| 10,000 rows | 140,000 | 0.614s | 0.151s | 4.1x |
| 20,000 rows | 280,000 | 1.219s | 0.299s | 4.1x |
| 50,000 rows | 700,000 | 2.967s | 0.761s | 3.9x |

### Read Performance

| Scale | Cells | openpyxl | WolfXL | Speedup |
|-------------|---------|--------|--------|------|
| 1,000 rows | 14,000 | 0.071s | 0.010s | 7.2x |
| 5,000 rows | 70,000 | 0.394s | 0.049s | 8.1x |
| 10,000 rows | 140,000 | 0.838s | 0.102s | 8.2x |
| 20,000 rows | 280,000 | 1.749s | 0.205s | 8.5x |
| 50,000 rows | 700,000 | 4.616s | 0.515s | 9.0x |

## Key Takeaways

- **Writes are 3.7-4.7x faster** across all dataset sizes using `append()` or `write_rows()`.
- **Reads get faster at scale:** 7.2x at 1K rows, 9.0x at 50K. WolfXL's fixed Rust initialization cost amortizes across more data, so larger files benefit most.
- **One import change, no rewrite:** batch APIs (`append`, `write_rows`, `iter_rows(values_only=True)`) delivered additional gains with minimal refactoring.
- **50K-row exports dropped from 7.6s to 1.3s** (write + read combined), cutting CI cycle time directly.
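The combined figure in the last takeaway follows directly from the 50,000-row lines of the two tables above:

```python
# 50,000-row totals, taken from the write and read tables
openpyxl_total = 2.967 + 4.616  # write + read, seconds
wolfxl_total = 0.761 + 0.515

print(round(openpyxl_total, 1))                 # 7.6 (s)
print(round(wolfxl_total, 1))                   # 1.3 (s)
print(round(openpyxl_total / wolfxl_total, 1))  # 5.9 (x, combined speedup)
```

Note the combined speedup (about 5.9x) sits between the 3.9x write and 9.0x read factors, weighted toward writes because they dominate WolfXL's share of the total.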

## Reproduce These Results

The benchmark script is open source:

```sh
pip install wolfxl openpyxl

# Clone SynthGL and run the benchmark
git clone https://github.com/SynthGL/synthgl.git
cd synthgl
uv sync --extra dev
uv run python3 scripts/benchmark_wolfxl.py \
    --scales 1000,5000,10000,20000,50000 \
    --rounds 5 \
    --output results/wolfxl_benchmark.json
```

Or benchmark WolfXL against any Excel file you already have:

```sh
pip install wolfxl
wolfxl bench your_file.xlsx
```