Summary
There is a gap between what CI validates and what users actually run.
The current integration test (pytest -m integration) runs the full extraction pipeline end-to-end in Python, verifying transaction counts, file outputs, and processing summary metrics against a committed snapshot. It is intentionally excluded from the default test run and must be triggered explicitly.
The gap: CI only smoke-tests the Docker image with a Python import check (import bankstatements_free; import bankstatements_core). It never runs the actual processing pipeline inside the container against real PDFs. This means a Docker-specific regression — a broken entrypoint, a missing volume mount, a wrong env var default, or a config that behaves differently inside the container — would pass CI and only be caught by a developer running make docker-local by hand.
What is currently tested
| Layer | What is tested | Where |
| --- | --- | --- |
| Unit | Individual services and classes | pytest (default, 1395 tests) |
| Integration (Python) | Full pipeline against real PDFs, snapshot comparison | pytest -m integration (manual only) |
| Docker (CI) | Image builds, Python imports succeed | ci.yml build-docker job |
| Docker (pipeline) | Not tested | — |
What the gap looks like
A developer making a change to entrypoint.sh, docker-compose.yml, or an env-var default could:
- Break the volume mount for input/ or output/
- Set a default that silently changes filter or sort behaviour inside the container
- Introduce a startup error that the import smoke-test does not catch
None of these would fail the current CI pipeline.
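For contrast, the existing smoke test amounts to something like the following (a sketch — the exact wording of the ci.yml step may differ). It proves only that the image builds and the packages import, not that the pipeline runs:

```yaml
# Sketch of the current build-docker smoke test (exact ci.yml step may differ).
# An import check catches missing packages, but not entrypoint, volume, or
# env-var regressions.
- name: Smoke-test Docker image
  run: |
    docker run --rm bankstatementsprocessor:pr-${{ github.event.pull_request.number }} \
      python3 -c "import bankstatements_free; import bankstatements_core"
```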
Proposed solution
Add a docker-integration CI job (runs after build-docker) that:
- Mounts a small set of test PDFs from packages/parser-core/tests/integration/fixtures/ (or a dedicated tests/docker/input/ directory) into the container
- Runs the container to process them
- Asserts the output directory contains the expected files and non-zero transaction counts
This mirrors the existing Python integration test but exercises the real Docker entrypoint, volume mounts, and env-var handling.
Minimal CI step (sketch)
```yaml
- name: Run Docker integration test
  run: |
    mkdir -p /tmp/docker-test/input /tmp/docker-test/output
    cp packages/parser-core/tests/integration/fixtures/*.pdf /tmp/docker-test/input/
    docker run --rm \
      -v /tmp/docker-test/input:/app/input:ro \
      -v /tmp/docker-test/output:/app/output \
      bankstatementsprocessor:pr-${{ github.event.pull_request.number }}
    ls /tmp/docker-test/output/*.csv || (echo "No CSV output produced" && exit 1)
    python3 -c "
    import json, glob, sys
    files = glob.glob('/tmp/docker-test/output/*.json')
    total = sum(len(json.load(open(f))) for f in files if not f.endswith('_summary.json'))
    print(f'Transactions: {total}')
    sys.exit(0 if total > 0 else 1)
    "
```
Local equivalent (for developers)
```bash
# 1. Build the image
make docker-build

# 2. Run against the test fixtures
docker run --rm \
  -v $(pwd)/packages/parser-core/tests/integration/fixtures:/app/input:ro \
  -v /tmp/docker-output:/app/output \
  bankstatementsprocessor:latest

# 3. Inspect output
ls /tmp/docker-output/
```
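These steps could be wrapped in a make docker-integration target so local and CI runs stay in sync. A sketch, assuming GNU Make and the existing docker-build target; paths and the image tag mirror the commands above:

```make
# Hypothetical Makefile target wrapping the local Docker integration run.
# Adjust paths and tags to match the real Makefile.
.PHONY: docker-integration
docker-integration: docker-build
	mkdir -p /tmp/docker-output
	docker run --rm \
		-v $(CURDIR)/packages/parser-core/tests/integration/fixtures:/app/input:ro \
		-v /tmp/docker-output:/app/output \
		bankstatementsprocessor:latest
	ls /tmp/docker-output/*.csv
```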
Value to developers
- Catches entrypoint regressions — a broken CMD or missing PYTHONPATH shows up immediately
- Validates the volume mount contract — confirms /app/input and /app/output work as documented
- Exercises env-var defaults — RECURSIVE_SCAN, COLUMN_NAMES, TABLE_TOP_Y etc. are tested in their default state
- Closes the loop between unit tests and what ships — the Docker image is what users actually run; this test validates that layer
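The same harness could later be pointed at non-default env-var values as well (the variable name is taken from the list above; the value shown is illustrative):

```bash
# Illustrative only: re-run the container with an env-var override to
# exercise non-default behaviour (RECURSIVE_SCAN value is hypothetical).
docker run --rm \
  -e RECURSIVE_SCAN=false \
  -v /tmp/docker-test/input:/app/input:ro \
  -v /tmp/docker-test/output:/app/output \
  bankstatementsprocessor:latest
```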
Acceptance criteria
- The docker-integration job runs on changes to Dockerfile, entrypoint.sh, docker-compose.yml, or packages/parser-core/
- A make docker-integration target added for local use
- The job added to the ci-gate required checks