External reproduction #2: Ubuntu/Linux + Python 3.11 #44

@weich97

Goal

Run the TradeArena v0.2 no-key external reproduction pack on a fresh Linux environment with Python 3.11 and report whether the deterministic artifacts reproduce cleanly.

Target

  • Repository: weich97/TradeArena
  • Release tag: v0.2.0
  • Expected manifest path: outputs/reproduction/v0_2_external/manifest.json

Commands

git clone https://github.com/weich97/TradeArena.git
cd TradeArena
git checkout v0.2.0
python3.11 -m pip install -e ".[dev]"
python3.11 scripts/validate_benchmark_spec.py benchmarks/v0.2/spec.json
python3.11 scripts/run_external_reproduction_pack.py --output-dir outputs/reproduction/v0_2_external
python3.11 scripts/check_release_readiness.py
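To make the full command log easier to attach, the three script steps can be wrapped in a small runner. This is only a sketch, not part of the repository: the helper names `run_and_log` and `run_all` and the log file name are hypothetical, and it assumes it is started from the repository root after the clone, checkout, and install steps, using the same interpreter that performed the install.

```python
import shlex
import subprocess
import sys
from pathlib import Path

# The three script steps from the command list above, run with the current
# interpreter so the install and the scripts share one Python.
COMMANDS = [
    [sys.executable, "scripts/validate_benchmark_spec.py", "benchmarks/v0.2/spec.json"],
    [sys.executable, "scripts/run_external_reproduction_pack.py",
     "--output-dir", "outputs/reproduction/v0_2_external"],
    [sys.executable, "scripts/check_release_readiness.py"],
]


def run_and_log(command: list[str], log_path: Path) -> int:
    """Run one command, append its combined stdout/stderr to log_path, return the exit code."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"$ {shlex.join(command)}\n")
        log.write(result.stdout)
        log.write(f"[exit code {result.returncode}]\n\n")
    return result.returncode


def run_all(commands: list[list[str]], log_path: Path) -> int:
    """Run commands in order, stopping at the first failure; return the last exit code."""
    exit_code = 0
    for command in commands:
        exit_code = run_and_log(command, log_path)
        if exit_code != 0:
            break  # keep the failing command and its traceback as the last log entry
    return exit_code
```

Calling `run_all(COMMANDS, Path("reproduction.log"))` from the checkout would leave a single log file in which any failed command and its traceback is the final entry.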

Acceptance Criteria

Please report:

  • Linux distribution, kernel, CPU architecture, Python version, and install method.
  • Commit/tag used and whether the working tree was clean before running.
  • Full command log, including any failed command and traceback.
  • outputs/reproduction/v0_2_external/manifest.json path and key fields.
  • Benchmark spec canonical hash.
  • Trajectory reproducibility hash and file SHA-256.
  • Audit report, Agent Autopsy Dashboard, failure autopsy, and benchmark-row artifact hashes.
  • Whether live APIs, downloaded market data, private fills, or broker data were used.
  • Whether outputs/examples/audit_report.html and outputs/examples/agent_autopsy_dashboard.html opened successfully in a browser or via a local static-file preview.
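Several of the items above (distribution, kernel, architecture, Python version, file SHA-256) can be collected with the standard library alone. The following is a minimal sketch under that assumption; the function names are hypothetical, and the files to hash are passed on the command line rather than taken from any repository convention.

```python
import hashlib
import platform
import sys
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def environment_summary() -> dict:
    """Collect the platform fields requested in the report."""
    summary = {
        "kernel": platform.release(),
        "architecture": platform.machine(),
        "python_version": platform.python_version(),
        "python_executable": sys.executable,
    }
    try:
        # Reads /etc/os-release; available from Python 3.10 onward.
        os_release = platform.freedesktop_os_release()
        summary["distribution"] = os_release.get("PRETTY_NAME", "unknown")
    except (OSError, AttributeError):
        summary["distribution"] = "unknown"
    return summary


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
    # Any file paths given as arguments are hashed and printed.
    for name in sys.argv[1:]:
        path = Path(name)
        if path.is_file():
            print(f"sha256:{sha256_of_file(path)}  {path}")
        else:
            print(f"missing: {path}")
```

Saved under any name, it could be run with outputs/reproduction/v0_2_external/manifest.json, outputs/examples/audit_report.html, and outputs/examples/agent_autopsy_dashboard.html as arguments, and the printed lines pasted into the report.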

Expected Reproduction Target

The no-key trajectory reproducibility hash should match:

sha256:bf3b1084aeec89f3bf0f99ab91b6c16a989dc8c8a29d9e93c8c72109548e442f

If it does not match, attach the manifest and logs so we can classify the drift.
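One way to check the target without knowing the manifest layout is to search every string value in manifest.json for the expected digest. This is a sketch only: the manifest's field names are not stated here, so nothing about its schema is assumed, and the comparison tolerates the value being stored with or without the `sha256:` prefix. The function name `find_value_paths` is hypothetical.

```python
import json
from pathlib import Path

EXPECTED_HASH = "sha256:bf3b1084aeec89f3bf0f99ab91b6c16a989dc8c8a29d9e93c8c72109548e442f"
MANIFEST_PATH = Path("outputs/reproduction/v0_2_external/manifest.json")


def find_value_paths(node, target, prefix=""):
    """Yield the path of every string value in nested JSON data that equals target.

    Both sides are compared with any leading "sha256:" removed, since the
    manifest's exact formatting is an unknown here.
    """
    wanted = target.removeprefix("sha256:")
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from find_value_paths(value, target, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_value_paths(value, target, f"{prefix}[{index}]")
    elif isinstance(node, str) and node.removeprefix("sha256:") == wanted:
        yield prefix


if __name__ == "__main__":
    if MANIFEST_PATH.is_file():
        manifest = json.loads(MANIFEST_PATH.read_text(encoding="utf-8"))
        matches = list(find_value_paths(manifest, EXPECTED_HASH))
        if matches:
            print("expected hash found at:", ", ".join(matches))
        else:
            print("expected hash not found; attach the manifest and logs")
    else:
        print(f"manifest not found: {MANIFEST_PATH}")
```

A "not found" result only means the digest does not appear verbatim in the manifest; the manifest and logs are still what is needed to classify the drift.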
