Manually-triggered GitHub Action that compares analysis results from two opentaint revisions across a fixed set of benchmark projects.

    gh workflow run regression.yaml \
      --field base_ref=main \
      --field new_ref=my-feature-branch \
      --field compare_locations=true \
      --field compare_columns=false

Inputs:
| Input | Default | Description |
|---|---|---|
| `base_ref` | (none) | Opentaint ref (branch/tag/SHA) for the baseline. |
| `new_ref` | (none) | Opentaint ref whose results are compared against base. |
| `compare_locations` | `true` | If `false`, findings are matched by `ruleId` only. |
| `compare_columns` | `false` | If `true`, column coordinates join the finding key. |
| `projects_filter` | `""` | Comma-separated substrings; restricts the run to matching projects. |
| `max_parallel` | `8` | Upper bound on concurrent analyze jobs. |
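
One way to picture how `compare_locations` and `compare_columns` change matching is as a key built per finding. The following is a minimal sketch, not the code in `scripts/compare_sarif.py`: the helper name `finding_key`, the choice of the first SARIF location, and the assumption that columns only matter when locations are compared are all illustrative.

```python
def finding_key(result, compare_locations=True, compare_columns=False):
    """Build a hashable key for one SARIF result (hypothetical helper)."""
    key = [result.get("ruleId")]
    if compare_locations:
        # Assumption: only the first reported location takes part in matching.
        locations = result.get("locations") or [{}]
        physical = locations[0].get("physicalLocation", {})
        region = physical.get("region", {})
        key.append(physical.get("artifactLocation", {}).get("uri"))
        key.append(region.get("startLine"))
        if compare_columns:
            key.append(region.get("startColumn"))
    return tuple(key)


def diff_findings(base_results, new_results, **options):
    """Return (added, removed) key sets between two lists of SARIF results."""
    base_keys = {finding_key(r, **options) for r in base_results}
    new_keys = {finding_key(r, **options) for r in new_results}
    return new_keys - base_keys, base_keys - new_keys
```

With `compare_locations=False` two findings of the same rule collapse into one key, so a finding that merely moved to another line is not reported as added or removed.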
The workflow summary (`$GITHUB_STEP_SUMMARY`) shows, per project:

- analyzer status for base and new (`complete` / `incomplete` / `oom` / `analysis_timeout` / `high_memory`),
- added and removed finding counts,
- scan time and peak memory for the new run, each with its delta against base (e.g. `412s (+282s)` / `27.6G (+15.7G)`),
- a per-project verdict.

Scan time is the wall-clock duration of the scan phase. Peak memory is the highest resident usage the analyzer logged; it is rendered as `<no data: base|new|both>` when a run finished or failed before logging a memory sample.

A project fails when:
- the analyzer regressed from `complete` on base to `incomplete` on new, or
- added/removed finding counts are non-zero, or
- the scan errored on either side.
Full diff detail is available in the regression-diff artifact.
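
The three failure rules can be read as a small decision function. This is a sketch only: the function name `project_verdict`, its parameters, and the reason strings are hypothetical and are not taken from `scripts/compare_sarif.py`; the order in which the rules are checked is also an assumption.

```python
def project_verdict(base_status, new_status, added, removed,
                    base_errored=False, new_errored=False):
    """Return ("fail", reason) or ("pass", None) for one project."""
    if base_errored or new_errored:
        return "fail", "scan errored"
    if base_status == "complete" and new_status == "incomplete":
        return "fail", "status regression"
    if added or removed:
        return "fail", "findings changed"
    return "pass", None
```

For example, `project_verdict("complete", "complete", added=0, removed=2)` fails with the reason `"findings changed"`, while identical statuses with zero added and removed findings pass.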
| Path | Purpose |
|---|---|
| `.github/workflows/regression.yaml` | Workflow: resolve → probe → build → analyze → compare. |
| `projects/repos.yaml` | Benchmark project list (name, git URL, pinned head, etc.). |
| `projects/extensions/` | Files (passthroughs, approximations, custom rules…) referenced by per-project scan-flags. |
| `scripts/build_opentaint.sh` | Build analyzer + autobuilder JARs and Go CLI from a checkout. |
| `scripts/generate_matrix.py` | Expand `repos.yaml` into a GH Actions matrix. |
| `scripts/run_analysis.py` | Run opentaint compile + scan, extract analyzer status. |
| `scripts/compare_sarif.py` | SARIF diff + status regression + verdict + markdown report. |
| `scripts/cache_key.py` | Canonical per-project cache key. |
| `tests/` | Unit tests for pure-Python logic. Run `python -m pytest tests`. |
| `test-system-design-plan.md` | Design document (authoritative spec). |
Each entry requires name, git, and head (a pinned commit/tag SHA). These
optional fields tune one project:
| Field | Default | Purpose |
|---|---|---|
| `java-version` | `17` | JDK the project is compiled against (`actions/setup-java`). |
| `max-memory` | `8G` | Analyzer scan memory ceiling. |
| `compilation-timeout` | `1200` | Wall-clock seconds for the autobuilder compile step (and, less the usual margin, the scan). Raise for large reactors that pull from slow mirrors (e.g. ruoyi-vue-pro, yudao-cloud). |
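
The required and optional fields can be checked and defaulted in one step. The sketch below assumes each entry has already been loaded from `projects/repos.yaml` into a plain dict; the helper name `normalize_entry` is hypothetical and the repo's own scripts may handle this differently.

```python
REQUIRED_FIELDS = ("name", "git", "head")
FIELD_DEFAULTS = {
    "java-version": 17,
    "max-memory": "8G",
    "compilation-timeout": 1200,
}


def normalize_entry(entry):
    """Validate required fields and fill in the documented defaults."""
    missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
    if missing:
        raise ValueError("project entry is missing: " + ", ".join(missing))
    # Values set on the entry win over the defaults.
    return {**FIELD_DEFAULTS, **entry}
```

An entry that sets only `compilation-timeout: 2400` keeps the default JDK and memory ceiling and overrides just the timeout.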
The autobuilder's Maven invocation runs with download-retry/timeout system
properties (run_analysis.py:maven_resilient_env) so a transient mirror hiccup
(e.g. an HTTP 502 from a project-pinned mirror) is retried rather than failing
the job. A project whose build is deterministically broken at its pinned head
(e.g. an inconsistent dev SNAPSHOT, or a non-Java module that fails to build) is
commented out with a QUARANTINED note explaining the cause.
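
For orientation, a download-retry/timeout environment for Maven could look roughly like the sketch below. The actual properties set by `run_analysis.py:maven_resilient_env` are not shown in this document, so everything here is an assumption: the function name, the use of `MAVEN_OPTS`, and the choice of the Maven Wagon HTTP properties `maven.wagon.http.retryHandler.count` and `maven.wagon.rto`. Which properties take effect also depends on the Maven version and the transport it uses.

```python
import os


def maven_resilient_env_sketch(base_env=None, retries=3, read_timeout_ms=60_000):
    """Return a copy of the environment with retry/timeout properties in MAVEN_OPTS."""
    env = dict(os.environ if base_env is None else base_env)
    props = [
        # Assumed property choice: Wagon HTTP retry count and read timeout (ms).
        f"-Dmaven.wagon.http.retryHandler.count={retries}",
        f"-Dmaven.wagon.rto={read_timeout_ms}",
    ]
    existing = env.get("MAVEN_OPTS", "")
    # Keep whatever the caller already put in MAVEN_OPTS and append ours.
    env["MAVEN_OPTS"] = " ".join([existing, *props]).strip()
    return env
```

The returned dict can be passed as `env=` to `subprocess.run` when launching the build, leaving the parent process environment untouched.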
Per-project results (SARIF + status.json + analyzer log) are cached in GitHub
Actions' built-in cache, keyed by:

    sarif-v1-<analyzer_sha>-<test_system_sha>-<project_name>-<project_head>

The test-system SHA (= this repo's commit) is part of the key, so any change to scripts, workflow, or project list automatically invalidates cached results. Both successful and failed runs are cached; a subsequent session at a different analyzer or test-system SHA forces a re-run.
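
As an illustration of the key format, a minimal sketch follows. The function name and signature are hypothetical; `scripts/cache_key.py` holds the canonical implementation and may normalize its inputs in ways not shown here.

```python
def cache_key(analyzer_sha, test_system_sha, project_name, project_head):
    """Compose the per-project cache key in the documented format."""
    return f"sarif-v1-{analyzer_sha}-{test_system_sha}-{project_name}-{project_head}"


# Placeholder SHAs for illustration only.
old = cache_key("a1b2c3", "1111111", "spring-petclinic", "3e1ce23")
new = cache_key("a1b2c3", "2222222", "spring-petclinic", "3e1ce23")
assert old != new  # a new test-system commit yields a different key
```

Because every component is part of the string, changing any one of them (analyzer, test system, project, or pinned head) misses the cache and triggers a fresh run.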
The workflow's probe job restores the cache before any build runs — if every
project has a hit for a given opentaint ref, the corresponding build job is
skipped entirely.
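
The skip decision amounts to an all-hits check per ref. This is a sketch under assumptions: the function name `builds_needed` and the shape of its input (a mapping from ref to per-project cache-hit booleans) are hypothetical, and the real workflow expresses this through job outputs rather than a Python call.

```python
def builds_needed(hits_by_ref):
    """Map each opentaint ref to True when at least one project missed the cache."""
    return {ref: not all(hits.values()) for ref, hits in hits_by_ref.items()}


probe = {
    "main": {"spring-petclinic": True, "ruoyi-vue-pro": True},
    "my-feature-branch": {"spring-petclinic": True, "ruoyi-vue-pro": False},
}
decision = builds_needed(probe)
# Only the ref with a cache miss needs its build job.
assert decision == {"main": False, "my-feature-branch": True}
```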

    cd new-test
    python -m pytest tests -v

Each entry in projects/repos.yaml may declare a scan-flags list whose
tokens are appended verbatim to the opentaint scan invocation. Use the
literal substring {ext} to reference files shipped in projects/extensions/
— the runner substitutes it with that directory's absolute path. Since the
substitution is plain string replacement, the resolved path may point at
either a file or a directory — whichever the underlying flag accepts
(e.g. --passthrough-approximations and --dataflow-approximations each
take a single file or a whole directory, and may be repeated):

    - name: spring-petclinic
      git: https://github.com/spring-projects/spring-petclinic.git
      head: 3e1ce239f4488f20abda24441388a515ea55a815
      scan-flags:
        - --passthrough-approximations   # single YAML file
        - "{ext}/spring-petclinic/passthroughs.yaml"
        - --passthrough-approximations   # …or repeat with a directory
        - "{ext}/spring-petclinic/passthroughs"
        - --dataflow-approximations      # directory of approximations
        - "{ext}/spring-petclinic/approximations"
        - --rule-id
        - java.taint.sql-injection

Flags reserved by the runner (--analyzer-jar, --project-model,
--output, --timeout, --max-memory, --debug, --experimental) must
not be repeated here. --ruleset is not reserved — if scan-flags
contains no --ruleset, the runner inserts a default pointing at the
staged source-tree rule pack at <build>/rules (copied from
opentaint/rules/ruleset at build time). Supplying any --ruleset value
puts the project in full control of which rule packs are loaded and in
what order.
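
The reserved-flag rule and the default `--ruleset` can be sketched as one pre-processing step. The helper name `check_and_default` is hypothetical, as are two details the text does not specify: that a `--flag=value` spelling counts as using the flag, and that the default ruleset is placed before the project's own flags.

```python
RESERVED_FLAGS = {
    "--analyzer-jar", "--project-model", "--output", "--timeout",
    "--max-memory", "--debug", "--experimental",
}


def check_and_default(scan_flags, staged_rules_dir):
    """Reject reserved flags, then insert a default --ruleset if none is given."""
    for token in scan_flags:
        flag = token.split("=", 1)[0]  # assumption: treat --flag=value like --flag
        if flag in RESERVED_FLAGS:
            raise ValueError(f"scan-flags must not repeat reserved flag {flag}")
    has_ruleset = any(
        t == "--ruleset" or t.startswith("--ruleset=") for t in scan_flags
    )
    if has_ruleset:
        return list(scan_flags)  # the project controls rule packs and their order
    return ["--ruleset", staged_rules_dir, *scan_flags]
```

A project with only `--rule-id java.taint.sql-injection` ends up scanning against the staged pack, while a project that lists any `--ruleset` receives no implicit one.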
The literal value builtin is rewritten by the runner to that same
staged pack — the CLI would otherwise try to fetch the pack from a GitHub
release that does not exist for in-development opentaint SHAs (and the
current CLI's URL is malformed, producing a 404 against
api.github.com/repos/seqra/seqra/opentaint/…). Example — stack
builtin with a custom YAML file and a custom rules directory:

    scan-flags:
      - --ruleset
      - builtin                                       # → <build>/rules (staged)
      - --ruleset
      - "{ext}/my-project/rules/sql-injection.yaml"   # custom YAML file
      - --ruleset
      - "{ext}/my-project/rules"                      # custom rules directory

Two placeholders are expanded inside scan-flags:

- `{ext}` → absolute path of `projects/extensions/`.
- `{rules}` → absolute path of the staged source-tree pack at `<build>/rules`. After the `builtin` rewrite, mostly redundant for `--ruleset` values; still useful for other flags that take a rules directory.

See projects/extensions/README.md for
the layout convention and the rationale behind the builtin rewrite.
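
Placeholder expansion and the `builtin` rewrite are both plain token rewrites, which the following sketch illustrates. The function name `expand_scan_flags` is hypothetical, and limiting the rewrite to a `builtin` token that directly follows `--ruleset` is an assumption about how the runner scopes it.

```python
def expand_scan_flags(scan_flags, ext_dir, rules_dir):
    """Expand {ext}/{rules} and rewrite a `--ruleset builtin` value to rules_dir."""
    expanded = []
    previous = None
    for token in scan_flags:
        if previous == "--ruleset" and token == "builtin":
            expanded.append(rules_dir)  # point at the staged pack, no download
        else:
            # Plain string replacement, so the result may name a file or a directory.
            expanded.append(
                token.replace("{ext}", ext_dir).replace("{rules}", rules_dir)
            )
        previous = token
    return expanded
```

Called with `["--ruleset", "builtin", "--ruleset", "{ext}/my-project/rules"]`, an extensions directory of `/abs/ext`, and a staged pack at `/abs/build/rules`, this returns `["--ruleset", "/abs/build/rules", "--ruleset", "/abs/ext/my-project/rules"]`.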
See test-system-design-plan.md §10. The exact spelling of the
opentaint {compile,scan} --experimental --analyzer-jar / --autobuilder-jar
flags must be confirmed against opentaint --help --experimental and
updated in scripts/run_analysis.py before the workflow will run green.