
Releases: sebastienrousseau/bankstatementparser

v0.0.8 — Full Platform

11 Apr 17:43
v0.0.8
b950a76


Full Platform. Closes every gap identified in the competitive analysis. Five new features: multi-currency balance verification, hledger/beancount export, bulk directory scanner, account mapping rules, and a REST API microservice.

New features

1. Multi-currency balance verification

from bankstatementparser.hybrid import verify_balance_multi_currency

results = verify_balance_multi_currency(transactions, balances={
    "GBP": (Decimal("500"), Decimal("570")),
    "EUR": (Decimal("1000"), Decimal("1150")),
})

Transactions are grouped by Transaction.currency and the Golden Rule runs independently on each group, eliminating false DISCREPANCY results on multi-currency statements.
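A minimal sketch of the per-currency grouping (illustrative only — plain dicts stand in for Transaction objects, and this is not the library's implementation):

```python
from collections import defaultdict
from decimal import Decimal

def verify_per_currency(transactions, balances):
    """Sum amounts per currency, then apply the Golden Rule
    (opening + sum(amounts) == closing) to each group on its own."""
    sums: defaultdict = defaultdict(Decimal)
    for tx in transactions:
        sums[tx["currency"]] += tx["amount"]
    return {
        ccy: "VERIFIED" if opening + sums[ccy] == closing else "DISCREPANCY"
        for ccy, (opening, closing) in balances.items()
    }

print(verify_per_currency(
    [{"currency": "GBP", "amount": Decimal("70")},
     {"currency": "EUR", "amount": Decimal("150")}],
    {"GBP": (Decimal("500"), Decimal("570")),
     "EUR": (Decimal("1000"), Decimal("1150"))},
))  # both currencies come back VERIFIED
```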

2. hledger + beancount export

from bankstatementparser.export import to_hledger, to_beancount

Path("journal.ledger").write_text(to_hledger(transactions))
Path("journal.beancount").write_text(to_beancount(transactions))

Uses Transaction.category as the contra-account when set by the enrichment module. Zero external dependencies.
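The journal shape can be sketched roughly like this (illustrative only; the bank account name, fallback bucket, and sign handling are assumptions, not the exporter's actual output):

```python
from decimal import Decimal

def to_hledger_sketch(transactions, account="assets:bank:checking"):
    """Render minimal hledger journal entries: the transaction's
    category (when set) is the contra-account, else a fallback bucket."""
    lines = []
    for tx in transactions:
        contra = tx.get("category") or "expenses:uncategorized"
        lines.append(f'{tx["date"]} {tx["description"]}')
        lines.append(f'    {contra}    {-tx["amount"]}')   # contra gets the negation
        lines.append(f'    {account}    {tx["amount"]}')   # bank leg keeps the sign
        lines.append("")
    return "\n".join(lines)

journal = to_hledger_sketch([{
    "date": "2026-04-01",
    "description": "TESCO STORES",
    "amount": Decimal("-12.99"),
    "category": "expenses:groceries",
}])
print(journal)
```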

3. Bulk directory scanner

from bankstatementparser.hybrid import scan_and_ingest

batch = scan_and_ingest("statements/", pattern="**/*.pdf")
print(f"{batch.file_count} files, {batch.total_unique} unique transactions")

Scans a folder tree, runs smart_ingest on every match, deduplicates across the entire batch. Supports seen_hashes for cross-batch persistence.
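Cross-batch persistence of seen_hashes might look like this (a sketch; the seen_hashes keyword shown in the comment is inferred from the note above, and the state-file name is an arbitrary choice):

```python
import json
from pathlib import Path

def load_seen(state: Path) -> set:
    """Restore the seen-hash set persisted by a previous batch run."""
    return set(json.loads(state.read_text())) if state.exists() else set()

def save_seen(state: Path, seen: set) -> None:
    """Persist the (mutated) hash set for the next run."""
    state.write_text(json.dumps(sorted(seen)))

# Typical cross-batch flow (seen_hashes keyword per the note above):
#   seen = load_seen(Path("seen_hashes.json"))
#   batch = scan_and_ingest("statements/", pattern="**/*.pdf", seen_hashes=seen)
#   save_seen(Path("seen_hashes.json"), seen)
```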

4. Account mapping rules

from bankstatementparser.enrichment import AccountMapper

mapper = AccountMapper.from_json("mapping.json")
accounts = mapper.map_batch(transactions)

Ordered regex rules, first match wins, loaded from JSON config. Pairs with the ledger exporter for end-to-end plaintext-accounting workflows.
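The first-match-wins semantics can be sketched as follows (illustrative, not the real AccountMapper; rule patterns and the fallback account are invented for the example):

```python
import re

class FirstMatchMapper:
    """Ordered regex rules; the first pattern that matches wins."""

    def __init__(self, rules, default="expenses:uncategorized"):
        self.rules = [(re.compile(p, re.I), account) for p, account in rules]
        self.default = default

    def map_one(self, description: str) -> str:
        for pattern, account in self.rules:
            if pattern.search(description):
                return account
        return self.default

mapper = FirstMatchMapper([
    (r"TESCO|SAINSBURY", "expenses:groceries"),
    (r"TFL|TRAINLINE", "expenses:transport"),
])
print(mapper.map_one("TESCO STORES 2041"))  # expenses:groceries
```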

5. REST API

pip install 'bankstatementparser[api]'
bankstatementparser-api --port 8000

# POST a file, get JSON back
curl -F file=@statement.pdf http://localhost:8000/ingest
curl http://localhost:8000/health

FastAPI microservice with /ingest and /health. Default bind 127.0.0.1 (safe); use --host 0.0.0.0 for containers. Gated behind [api] extra.

Install

pip install bankstatementparser                          # core
pip install 'bankstatementparser[hybrid]'                # + text-LLM
pip install 'bankstatementparser[hybrid-vision]'         # + vision
pip install 'bankstatementparser[enrichment]'            # + categorization
pip install 'bankstatementparser[api]'                   # + REST API

Test plan

  • 723 tests at 100% line + branch coverage
  • mypy --strict clean on 29 source files
  • ruff check + bandit -r clean
  • Python 3.14 asyncio compatibility fix included
  • 44 CI checks pass

Full changelog: CHANGELOG.md

Pull request: #52

v0.0.7 — Universal Vision

11 Apr 14:01
v0.0.7
c86f30e


Universal Vision. Turns the local Ollama vision path from 🔴 (600 s LiteLLM timeout, hallucinated output) to 🟢 (all 11 rows extracted in ~33 s, correct currency and balances). Three independent improvements, all verified end-to-end against real local Ollama models on Apple Silicon.

What's new

1. Direct Ollama bridge — bankstatementparser.hybrid.ollama_direct

# Auto-selected for any ollama/* model — zero opt-in needed
from bankstatementparser.hybrid import smart_ingest
result = smart_ingest("scan.pdf")  # just works, ~33s instead of 600s timeout

A ~220-line drop-in replacement for litellm.completion that targets Ollama's /api/chat endpoint via httpx. Sidesteps the upstream LiteLLM ↔ Ollama integration bug where vision calls with long structured-JSON system prompts hang at the 600 s timeout.

  • ollama_direct_completion(**kwargs) — accepts OpenAI-style messages (including multimodal image_url blocks), returns OpenAI-style response envelope
  • is_ollama_model(model) — returns True for ollama/<name> or ollama_chat/<name>
  • Auto-selection in both VisionExtractor and LLMExtractor — no user action required
  • No new dependencies — httpx is already a transitive dep of LiteLLM in [hybrid]
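For illustration, an OpenAI-style multimodal message of the kind ollama_direct_completion(**kwargs) accepts could be assembled like this (a sketch; only the message shape is taken from the notes above, the helper itself is invented):

```python
import base64

def image_message(prompt: str, png_bytes: bytes) -> list:
    """Build an OpenAI-style message list with a text block plus an
    image_url block carrying a base64 PNG data URI."""
    b64 = base64.b64encode(png_bytes).decode()
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }]

msgs = image_message("Extract all transaction rows.", b"\x89PNG\r\n")
print(msgs[0]["content"][1]["image_url"]["url"][:30])
```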

2. ollama/minicpm-v recommended default

ollama pull minicpm-v
export BSP_HYBRID_VISION_MODEL=ollama/minicpm-v

minicpm-v:8b (5.5 GB) is explicitly trained for OCR and document understanding. Replaces ollama/llava:7b which was a general-purpose multimodal model not designed for dense statement tables.

Results on a synthetic scanned PDF:

  • ollama/llava:7b — 🔴 hallucinated INR currency, fabricated rows
  • ollama/minicpm-v:8b — 🟢 all 11 transactions, GBP, balances correct, ~33 s

3. Strip mode — VisionExtractor(strip_rows=True)

from bankstatementparser.hybrid import VisionExtractor, smart_ingest

vision = VisionExtractor(strip_rows=True, n_strips=4)
result = smart_ingest("dense_statement.pdf", vision_extractor=vision)

Splits each page into N overlapping horizontal strips (default 4, 10% overlap). Header strip extracts balances; body strips extract transactions; results merged by transaction_hash. Designed for dense pages (≥15 rows) where small local models can't process the full page — CLIP's 336×336 internal downscale destroys fine table detail on a full A4 page, but preserves it on a strip.
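The strip geometry can be sketched like this (assuming the 10% overlap is a fraction of strip height — an assumption, since the notes don't pin that down):

```python
def strip_bounds(n_strips: int = 4, overlap: float = 0.10):
    """Normalized (top, bottom) extents for N horizontal page strips,
    each padded by overlap * strip-height on both sides so that
    neighbouring strips share a band of the page."""
    height = 1.0 / n_strips
    pad = height * overlap
    return [
        (max(0.0, i * height - pad), min(1.0, (i + 1) * height + pad))
        for i in range(n_strips)
    ]

for top, bottom in strip_bounds():
    print(f"{top:.3f} – {bottom:.3f}")
```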

Smoke-test results

  • Text-LLM, ollama/llama3, single-shot — ✅ all 11 rows, VERIFIED, ~25 s
  • Vision-LLM, ollama/minicpm-v:8b, single-shot — ✅ all 11 rows, GBP, ~33 s
  • Vision-LLM, ollama/minicpm-v:8b, strip_rows=True — ✅ sign convention correct, ~43 s

Install

pip install 'bankstatementparser[hybrid-vision]'

Migration from v0.0.6

Fully backwards compatible. Existing code keeps working — it just runs faster. Three opt-in upgrade patterns:

# 1. Do nothing — auto-bridge activates for ollama/* models
result = smart_ingest("scan.pdf")

# 2. Switch to minicpm-v
os.environ["BSP_HYBRID_VISION_MODEL"] = "ollama/minicpm-v"

# 3. Enable strip mode for dense pages
vision = VisionExtractor(strip_rows=True, n_strips=4)
result = smart_ingest("dense.pdf", vision_extractor=vision)

Test plan

  • 677 tests at 100% line + branch coverage (up from 649 on v0.0.6)
  • mypy --strict clean on 24 source files
  • ruff check + bandit -r clean
  • 32 docs accuracy tests all pass
  • All examples verified end-to-end
  • 44 CI checks pass

Full changelog

See CHANGELOG.md for the complete v0.0.7 entry.

Pull request: #51 (8 commits, all SSH-signed)

v0.0.6 — Intelligence Layer

10 Apr 23:15
v0.0.6
86431e1


Intelligence Layer. The full v0.0.6 milestone. Drops Python 3.9 to retire the entire transitive CVE allow-list, adds a categorization enrichment module, an interactive review mode for discrepancy resolution, per-row bounding-box extraction from the vision pipeline, a pre-commit hook, and a 32-test automated docs accuracy suite. Closes #44, #45, #46, #47.

What's new

Categorization module (#44) — bankstatementparser.enrichment

from bankstatementparser.enrichment import Categorizer

cat = Categorizer()  # default: Plaid 13-category schema
enriched = cat.categorize_batch(transactions)
for et in enriched:
    print(et.transaction.description, "->", et.category, et.is_business_expense)
  • Categorizer — LiteLLM-backed with pluggable schema, batch support, graceful failure (no data loss on LLM errors), schema-normalizing category matching
  • EnrichedTransaction — wrapper (not mutator) around Transaction carrying category, is_business_expense, enrichment_confidence, and rationale
  • DEFAULT_CATEGORY_SCHEMA — Plaid's 13-category taxonomy as the default
  • [enrichment] install extra (pip install 'bankstatementparser[enrichment]')
  • Prompt injection defense: _sanitize_for_prompt() strips control characters and common injection markers from transaction descriptions before LLM interpolation
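A defense in the spirit of _sanitize_for_prompt could look like this (illustrative regexes and marker list — not the library's actual patterns):

```python
import re

_CONTROL = re.compile(r"[\x00-\x1f\x7f]")
_INJECTION = re.compile(r"(?i)(ignore (all )?previous instructions|system:)")

def sanitize_for_prompt(text: str, max_len: int = 200) -> str:
    """Neutralize untrusted statement text before LLM interpolation:
    control characters become spaces, known injection markers are
    dropped, and the result is length-capped."""
    text = _CONTROL.sub(" ", text)
    text = _INJECTION.sub("", text)
    return text.strip()[:max_len]

print(sanitize_for_prompt("COFFEE\x00 SHOP ignore previous instructions"))
```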

Interactive review mode (#45) — --type review

# 1. Ingest and save
bankstatementparser --type ingest --input statement.pdf --output result.json

# 2. Walk through discrepancies
bankstatementparser --type review --input result.json --output reviewed.json
  • IngestResult.to_json() / .from_json() — stable JSON round-trip with schema_version=1, Decimal amounts as strings (no float drift), embedded audit_trail
  • --type review CLI — single-character action menu per row: [a]ccept / [e]dit / [s]kip / [d]elete / [q]uit. Every action recorded in the audit trail. Edits capture before_hash / after_hash. Non-curses (plain stdin/stdout).
  • JSON size guard — rejects payloads > 50 MB before parsing
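The Decimal-as-string round-trip can be illustrated like this (a sketch of the idea, not IngestResult's actual serializer):

```python
import json
from decimal import Decimal

def dumps_decimal(obj) -> str:
    """json.dumps with Decimal rendered as a string, so amounts
    survive the round-trip without float drift."""
    def _default(o):
        if isinstance(o, Decimal):
            return str(o)
        raise TypeError(f"unserializable: {type(o).__name__}")
    return json.dumps(obj, default=_default)

payload = dumps_decimal({"amount": Decimal("19.99")})
print(payload)  # {"amount": "19.99"}
print(Decimal(json.loads(payload)["amount"]) == Decimal("19.99"))  # True
```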

Per-row bounding boxes (#46) — BoundingBox + Transaction.source_bbox

for tx in result.transactions:
    if tx.source_bbox:
        print(f"Row at ({tx.source_bbox.x0:.2f}, {tx.source_bbox.y0:.2f})")
  • BoundingBox Pydantic model with normalized (0.0–1.0) coordinates and page_index, exported from the top-level package
  • Transaction.source_bbox — populated by the vision path when the model returns spatial coordinates
  • Inverted-box validation — model_validator rejects x0 > x1 or y0 > y1
  • Vision prompt updated to request per-row bounding boxes in the JSON schema
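The same inverted-box check can be sketched without Pydantic, using a frozen dataclass (illustrative only; the library uses a Pydantic model_validator):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Normalized-coordinate bounding box that rejects inverted or
    out-of-range extents on construction."""
    page_index: int
    x0: float
    y0: float
    x1: float
    y1: float

    def __post_init__(self) -> None:
        if not (0.0 <= self.x0 <= self.x1 <= 1.0
                and 0.0 <= self.y0 <= self.y1 <= 1.0):
            raise ValueError("inverted or out-of-range bounding box")

box = Box(page_index=0, x0=0.1, y0=0.2, x1=0.9, y1=0.8)
print(box)
```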

Python 3.9 retirement (#47)

  • Minimum Python bumped to 3.10 (Python 3.9 reached EOL 2025-10-31)
  • All 9 transitive CVE allow-list entries deleted — every vulnerable package now resolves to its patched series:
  • litellm 1.80.0 → 1.83.4 — closes GHSA-jjhc-v7c2-5hh6, GHSA-53mr-6c8q-9789, GHSA-69x8-hrgq-fjj8
  • cryptography 43.0.3 → 46.0.7 — closes GHSA-r6ph-v2qm-q3c2, GHSA-79v4-65xg-pq4g, GHSA-m959-cc7f-wv43
  • pillow 11.3.0 → 12.2.0 — closes GHSA-cfh3-3jmp-rvhc
  • filelock 3.19.1 → 3.25.2 — closes GHSA-w853-jp5j-5j7f, GHSA-qmgc-5h2g-mvrw
  • requests 2.32.5 → 2.33.1 — closes GHSA-gc5v-m9x4-r6x2

Security hardening

  • Prompt injection defense in enrichment categorizer (_sanitize_for_prompt)
  • JSON deserialization size guard (50 MB cap in IngestResult.from_json)
  • Frozen-dataclass immutability fix — IngestResult fields changed from list to tuple
  • BoundingBox inverted-box validation via Pydantic model_validator
  • Duplicate-index warning when the LLM returns the same row index twice

Developer experience

  • Pre-commit hook (.githooks/pre-commit) runs make verify (ruff + mypy + pytest + bandit) before every commit. Setup: make install-hooks
  • Automated docs accuracy test suite (test_docs_accuracy.py, 32 tests) validates every factual claim in README, FAQ, CHANGELOG, CONTRIBUTING, and SECURITY against the actual codebase
  • Modernised Makefile with install, install-all, install-hooks, test, lint, typecheck, security, verify, dist, release targets
  • PowerShell CLI walkthrough (06_cli_walkthrough.ps1) for native Windows users

Install

pip install bankstatementparser                          # core (deterministic parsers)
pip install 'bankstatementparser[hybrid]'                # + text-LLM for digital PDFs
pip install 'bankstatementparser[hybrid-vision]'         # + vision for scanned PDFs
pip install 'bankstatementparser[enrichment]'            # + categorization

Migration from v0.0.5

The public API is unchanged. v0.0.5 code runs on v0.0.6 without modification provided the interpreter is Python 3.10+. If you are on Python 3.9, pin to v0.0.5:

bankstatementparser==0.0.5

Test plan

  • 649 tests at 100% line + branch coverage (up from 541 on v0.0.5)
  • mypy --strict clean on 23 source files
  • ruff check + bandit -r clean
  • 44 CI checks pass on Python 3.10–3.14
  • All hybrid examples verified end-to-end
  • Deep-dive security + correctness audit completed with all findings fixed

Full changelog

See CHANGELOG.md for the complete v0.0.6 entry.

Pull request: #48 (15 commits, all SSH-signed)

v0.0.5 — Universal Extraction

08 Apr 13:35
v0.0.5
c67f507


Universal Extraction. Combines the deterministic reliability of the existing ISO/exchange-format parsers with an adaptive LLM layer for unstandardized PDFs, including a multimodal vision fallback for scanned/image-only statements. The core "data only, no inference" philosophy of the library is preserved — categorization and review-mode UI are intentionally deferred to v0.0.6.

Three extraction paths via smart_ingest()

  • Path A — Deterministic: triggers when detect_statement_format() returns a non-PDF format. Cost: $0, fastest. Module: existing parsers.
  • Path B — Text-LLM: triggers for PDFs with ≥ 50 chars of extractable text. Cost: tokens. Module: hybrid/llm_extractor.py.
  • Path C — Vision-LLM: triggers for PDFs below LOW_TEXT_DENSITY_THRESHOLD (scan/photo). Cost: tokens + compute. Module: hybrid/vision.py.

IngestResult.source_method is tagged with "deterministic" | "llm" | "vision" for full audit provenance on every row.

from bankstatementparser.hybrid import smart_ingest

result = smart_ingest("statement.pdf")
print(result.source_method)        # "deterministic" | "llm" | "vision"
print(result.verification.status)  # VERIFIED | DISCREPANCY | FAILED
for tx in result.transactions:
    print(tx.transaction_hash, tx.amount, tx.description)

Install

# Core install — deterministic parsers only (zero AI dependencies)
pip install bankstatementparser

# Add the text-LLM path for digital PDFs
pip install 'bankstatementparser[hybrid]'

# Add higher-fidelity table extraction (adds pdfplumber)
pip install 'bankstatementparser[hybrid-plus]'

# Add the multimodal vision path for scanned/photocopied PDFs
pip install 'bankstatementparser[hybrid-vision]'

Every [hybrid*] extra is opt-in and pure-Python — no poppler, no system libraries, no GPU required. Works identically on macOS, Linux, and WSL.

Highlights

New bankstatementparser.hybrid subpackage

  • smart_ingest() — single entry point that implements the three-path routing above. Auto-routes to vision when pypdf extracts fewer than LOW_TEXT_DENSITY_THRESHOLD (50) characters.
  • LLMExtractor — LiteLLM-backed text extractor with provider-agnostic configuration via BSP_HYBRID_MODEL. Default model is ollama/llama3 (local, private). Tolerant JSON parsing handles markdown fences and prose wrappers.
  • VisionExtractor — multimodal extractor for scanned/image-only PDFs. Renders pages with pypdfium2 (pure-Python wheel, no poppler dependency) and sends base64 PNGs via LiteLLM's multimodal payload. Vision model is opt-in only via BSP_HYBRID_VISION_MODEL — no surprise downloads.
  • verify_balance() — Golden Rule integrity check returning VERIFIED | DISCREPANCY | FAILED with the exact delta when mismatched.
  • Structured prompts that explicitly instruct the model to sort transactions chronologically, mitigating PDF reading-order issues.

Transaction model upgrades

  • transaction_hash — computed field, MD5 of date | normalized_description | amount. Every row carries an immutable fingerprint for idempotent re-ingestion.
  • source_method — Literal["deterministic", "llm"], audit provenance per row.
  • confidence — Optional[float], populated for LLM rows.
  • category and raw_source_text — reserved placeholders for the v0.0.6 "Intelligence Layer" release.
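The fingerprint recipe above can be sketched as follows (the exact separator and any extra normalization are assumptions — the notes only state MD5 over date | normalized_description | amount):

```python
import hashlib

def transaction_hash(date: str, normalized_description: str, amount: str) -> str:
    """MD5 fingerprint over the three fields named above; the bare '|'
    separator is an assumption for illustration."""
    payload = f"{date}|{normalized_description}|{amount}"
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

h = transaction_hash("2026-04-01", "AMZN MKTPLACE", "-12.99")
print(h)  # a stable 32-character hex digest
```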

normalize_description() noise stripping

Strips inline dates (2026-04-01), times (12:49), and long alphanumeric IDs so that recurring charges hash identically. AMZN MKTPLACE 2026-04-01 #A1B2C3 and AMZN MKTPLACE 2026-04-02 #Z9Y8X7 collapse to the same normalized form, which means dedupe_by_hash() actually catches real duplicates instead of being defeated by one rotating reference character.
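A rough approximation of the noise stripping (illustrative regexes, not the library's own):

```python
import re

def normalize_description(desc: str) -> str:
    """Drop inline dates, times, and long digit-bearing reference IDs,
    then collapse whitespace — so recurring charges normalize identically."""
    desc = re.sub(r"\b\d{4}-\d{2}-\d{2}\b", "", desc)                # ISO dates
    desc = re.sub(r"\b\d{1,2}:\d{2}\b", "", desc)                    # times
    desc = re.sub(r"#?\b(?=[A-Z0-9]*\d)[A-Z0-9]{6,}\b", "", desc)    # long IDs
    return re.sub(r"\s+", " ", desc).strip().upper()

print(normalize_description("AMZN MKTPLACE 2026-04-01 #A1B2C3"))  # AMZN MKTPLACE
print(normalize_description("AMZN MKTPLACE 2026-04-02 #Z9Y8X7"))  # AMZN MKTPLACE
```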

Deduplicator.dedupe_by_hash()

New strict identity filter using Transaction.transaction_hash, designed for incremental ingestion (syncing to Google Sheets / a database). Mutates a caller-owned seen_hashes: set[str] so consumers can persist state across batches. Coexists with the existing fuzzy/temporal deduplicate() method.
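The strict-identity semantics can be sketched as (a simplified illustration with plain dicts, not the library's method):

```python
def dedupe_by_hash(transactions, seen_hashes: set):
    """Emit only rows whose hash is unseen, mutating the caller-owned
    set so state carries across batches."""
    fresh = []
    for tx in transactions:
        h = tx["transaction_hash"]
        if h not in seen_hashes:
            seen_hashes.add(h)
            fresh.append(tx)
    return fresh

seen: set = set()
print(len(dedupe_by_hash([{"transaction_hash": "aaa"},
                          {"transaction_hash": "aaa"}], seen)))  # 1
print(len(dedupe_by_hash([{"transaction_hash": "aaa"},
                          {"transaction_hash": "bbb"}], seen)))  # 1
```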

CLI

bankstatementparser --type ingest --input statement.pdf [--output ledger.csv]

New bankstatementparser console-script entry point. Both forms work in parallel:

bankstatementparser --type ingest --input file.pdf
python -m bankstatementparser.cli --type ingest --input file.pdf

Graceful degradation when the [hybrid] extra is missing — surfaces the specific missing dependency name and prints a pip install hint.

Examples — examples/hybrid/

Eight new files including a Mermaid flow diagram, prerequisites table, 15-minute quick start, mock-vs-live mode comparison, cross-platform verification matrix, and troubleshooting table. generate_sample_pdfs.py produces reproducible synthetic UK-bank PDFs (digital + scanned) so the LLM examples are runnable without real bank PDFs. Each LLM example runs in two modes — MOCK (default, fully offline, CI-safe) and LIVE (set BSP_HYBRID_MODEL / BSP_HYBRID_VISION_MODEL).

See examples/hybrid/README.md for the full walkthrough.

Smoke-test results (real Ollama models, Apple Silicon, 2026-04-08)

  • A — Deterministic, n/a — ✅ CAMT.053 fixture, 3 transactions, all hashes computed
  • B — Text-LLM, ollama/llama3 (4.7 GB) — ✅ all 11 transactions extracted with confidence=1.00, balance VERIFIED, ~25 s end-to-end
  • C — Vision-LLM, ollama/llava:7b (4.7 GB) — ⚠️ library code verified correct, but blocked by a reproducible upstream LiteLLM ↔ Ollama hang on long system prompts. A direct Ollama call works in 18 s, but llava-7b hallucinates statement contents at any render scale. Recommended production path: hosted vision models (gpt-4o, claude-opus-4-6, gemini-2.5-pro).
  • Golden Rule, n/a — ✅ all three outcomes (VERIFIED, DISCREPANCY, FAILED) reproduce as documented
  • Dedupe, n/a — ✅ recurring Amazon duplicate caught in batch 1, both already-seen rows caught in batch 2
  • CLI --type ingest, n/a — ✅ deterministic path produces expected DataFrame with all v0.0.5 columns

Test plan

  • 541 tests pass (up from 484 on v0.0.4)
  • 100% line + branch coverage across the entire package, including the new hybrid subpackage
  • mypy --strict clean on 21 source files
  • ruff check clean on bankstatementparser/, tests/, and examples/
  • bandit -r clean
  • All optional dependencies monkeypatched in tests — CI does not require any [hybrid*] extra to be installed
  • 48 CI checks green on the merge commit

Security

Allow-listed nine transitive CVEs across litellm (3), cryptography (3), pillow (1), filelock (2), and requests (1). All nine share the same root cause: their patched versions require Python ≥ 3.10, while this release still supports Python 3.9. Each advisory is documented per-CVE with the reason its vulnerable code path is unreachable from anything we ship. The entire allow-list can be deleted in a single commit when the minimum Python is raised — see the strategic note in the v0.0.5 commit history.

Deferred to v0.0.6 — "Intelligence Layer"

  • Categorization (category field populated, is_business_expense flag) — will ship as opt-in bankstatementparser.enrichment module
  • Interactive review mode — separate --type review subcommand consuming saved IngestResult JSON
  • OCR chunk-to-row mapping — true bounding-box mapping from the vision path
  • Drop Python 3.9 support — Python 3.9 reached EOL on 2025-10-31

Full changelog

See CHANGELOG.md for the complete v0.0.5 entry.

Pull request: #43 (13 commits, all SSH-signed)

v0.0.4 — 27K tx/s streaming, parallel parsing, Python 3.14, ISO 13485

01 Apr 07:40
v0.0.4
92f2128


Performance

  • Throughput: CAMT 27,000+ tx/s, PAIN.001 52,000+ tx/s
  • Per-transaction latency: CAMT 37 µs, PAIN.001 19 µs
  • Time to first result: CAMT < 1 ms, PAIN.001 < 2 ms
  • Memory scaling: constant from 1K to 50K transactions, both formats
  • 20% CAMT streaming optimization (xpath → find/findtext)
  • True streaming for PAIN.001 files > 50 MB via chunk-based temp file
  • CI-enforced TPS minimums and latency contracts

New Features

  • parse_files_parallel() — Process multiple statement files across CPU cores using ProcessPoolExecutor
  • Deduplicator — Deterministic transaction deduplication with explainable confidence scores
  • Transaction — Pydantic model normalizing records from any parser with Decimal precision
  • to_polars() / to_polars_lazy() — Optional Polars DataFrame export (pip install bankstatementparser[polars])
  • Python 3.13 and 3.14 — Full support with CI matrix testing

Dependencies

  • lxml: 4.9.3 → 6.0.2
  • Pygments: 2.19.2 → 2.20.0 (CVE-2026-4539 fix)
  • pydantic: added (^2.11.0)
  • hypothesis: added (>=6.82,<7)
  • polars: added (^1.32.0, optional)

Documentation

  • FAQ.md — 11 questions across 3 personas (CFO/Auditor, Fintech Dev, Treasury Analyst)
  • docs/MAPPING.md — Complete XML tag to DataFrame column mapping for all 6 formats
  • README — Performance table, parallel parsing, deduplication, PII redaction, output examples

ISO 13485 Compliance Suite

  • Risk Register — 7 quantified hazards with severity/probability scoring and residual risk
  • V&V Plan — 5-phase, 19-step with pass criteria and evidence retention
  • Change Control Procedure — Change workflow, impact assessment, rollback
  • SOUP Register — 22 tracked components with risk levels and EOL
  • Traceability Matrix — 17 design inputs mapped to implementation and verification
  • Secure Path to Production — Gate criteria per stage with approval authority
  • Security Policy — Response SLAs (48h ack, 30d fix), severity classification

Quality

  • Tests: 467 passed, 0 skipped
  • Branch coverage: 100%
  • Modules: 13
  • Bandit SAST: 0 findings
  • pip-audit: 0 CVEs
  • Commits: all signed (ED25519)
  • SOUP components: 22
  • Design inputs: 17

Breaking Changes

None. All existing APIs are backward-compatible.


THE ARCHITECT ᛫ Sebastien Rousseau ᛫ https://sebastienrousseau.com
THE ENGINE ᛞ EUXIS ᛫ Enterprise Unified Execution Intelligence System ᛫ https://euxis.co

v0.0.3

23 Mar 00:47
v0.0.3
666fd79


What's Changed

Full Changelog: v0.0.2...v0.0.3

v0.0.2

22 Mar 21:14
v0.0.2
885e553


Highlights

  • Add secure in-memory CAMT parsing with CamtParser.from_string(...) and CamtParser.from_bytes(...)
  • Add hardened ZIP processing for XML statements via iter_secure_xml_entries(...)
  • Add parser support for bank CSV, OFX/QFX, and MT940 formats
  • Add automatic statement-format detection with detect_statement_format(...) and create_parser(...)
  • Add CI, security scanning, SBOM, checksum, and provenance hardening
  • Refresh docs, examples, contribution guidance, and cross-platform behavior

Verification

  • PR checks were green before merge
  • Release Integrity workflow for tag v0.0.2 passed successfully on 2026-03-22
  • Attached artifacts include the wheel, sdist, SHA256 checksums, SBOM, and dependency report

Release v0.0.1

08 Nov 21:28
36742ae


Release v0.0.1 - 2023-11-08


Bank Statement Parser v0.0.1 🐍


The Bank Statement Parser is a Python library built for Finance and Treasury Professionals.

The Bank Statement Parser is an essential Python library for financial data management. Developed for the busy finance and treasury professional, it simplifies the task of parsing bank statements.

It streamlines the analysis of CAMT and SEPA transaction files, removing cumbersome manual data review and delivering a concise, accurate report ready for further analysis.

Bank Statement Parser helps you save time by quickly and accurately processing data, allowing you to focus on your financial insights and decisions. Its reliable precision is powered by Python, making it the smarter, more efficient way to manage bank statements.

Key Features

  • Versatile Parsing: Easily handle formats like CAMT (ISO 20022) and beyond.
  • Financial Insights: Unlock detailed analysis with powerful calculation utilities.
  • Simple CLI: Automate and integrate with a straightforward command-line interface.

Why Choose the Bank Statement Parser

  • Designed for Finance: Tailored features for the finance sector's needs.
  • Efficiency at Heart: Transform complex data tasks into simple ones.
  • Community First: Built and enhanced by experts, for experts.

Functionality

  • CamtParser: Parse CAMT format files with ease.
  • Pain001Parser: Handle SEPA PAIN.001 files effortlessly.

Installation

Create a Virtual Environment

We recommend creating a virtual environment to install the Bank Statement Parser. This will ensure that the package is installed in an isolated environment and will not affect other projects.

python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Getting Started

Install bankstatementparser with just one command:

pip install bankstatementparser

Usage

CAMT Files

from bankstatementparser import CamtParser

# Initialize the parser with the CAMT file path
camt_parser = CamtParser('path/to/camt/file.xml')

# Parse the file and get the results
results = camt_parser.parse()

PAIN.001 Files

from bankstatementparser import Pain001Parser

# Initialize the parser with the PAIN.001 file path
pain_parser = Pain001Parser('path/to/pain/file.xml')

# Parse the file and get the results
results = pain_parser.parse()

Command Line Interface (CLI) Guide

Leverage the CLI for quick parsing tasks:

Basic Command

python cli.py --type <file_type> --input <input_file> [--output <output_file>]
  • --type: Type of the bank statement file. Currently supported types are "camt" and "pain001".
  • --input: Path to the bank statement file.
  • --output: (Optional) Path to save the parsed data. If not provided, data is printed to the console.
