docs: comprehensive README accuracy rewrite #30

Merged
yxbh merged 1 commit into main from docs/readme-accuracy-rewrite
Mar 6, 2026

@yxbh commented Mar 6, 2026

docs: comprehensive README accuracy rewrite

Summary

Rewrites README.md to match the actual codebase after a 6-model multi-pass review (claude-opus-4.6, claude-sonnet-4.6, gpt-5.4, gpt-5.3-codex, gpt-5.2, gemini-3-pro-preview). Two iterations were required to reach convergence (all dimensions 8+/10). Fixes 19 findings spanning factual errors, missing features, and incomplete schema documentation.

Background

The README had drifted from the implementation — CLI flags were undocumented, the JSON schema example omitted fields, thresholds were wrong, and special-feature categories were incomplete. This was identified during a structured "iterate-hardcore" review that cross-references every README claim against the source code.

Changes

CLI documentation:

  • Add --visible-only flag to remux and archive command docs
  • Add --mkvmerge-path and --ffmpeg-path options
  • Add ffmpeg to Installation requirements
  • Document BDMV parent directory auto-resolve behavior
  • Add commentary to special-feature category lists

Factual corrections:

  • Fix Play All decomposition threshold: >10 min → ≥5 min (matches _EPISODE_ITEM_MIN_S = 300)
  • Fix archive filename pattern to {stem}-{index:03d}-{clip_id}.{ext}
  • Fix fixture count: 29 → 28
  • Fix {name} source: disc title from META/DL/bdmt_eng.xml with bdmt_*.xml fallback, then folder name
  • Fix pipeline: "Cluster by duration" → "Classify playlists" (cluster_by_duration is never called)
  • Move disc title extraction into step 3 (where it actually occurs in code)
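Three of the corrections above are small enough to sketch directly. The following is a minimal Python illustration, not the project's code. The constant `_EPISODE_ITEM_MIN_S`, the filename pattern, and the title lookup order are quoted from this description. The function names `is_episode_length`, `archive_filename`, and `pick_title_xml` are hypothetical, as is the assumption that `META/DL` sits directly under the BDMV directory.

```python
from __future__ import annotations

from pathlib import Path

# Value quoted in this PR description (300 s = 5 min).
_EPISODE_ITEM_MIN_S = 300


def is_episode_length(duration_s: float) -> bool:
    """Inclusive threshold: an item of exactly 5 minutes qualifies (>=, not >)."""
    return duration_s >= _EPISODE_ITEM_MIN_S


def archive_filename(stem: str, index: int, clip_id: str, ext: str) -> str:
    """Render the documented pattern {stem}-{index:03d}-{clip_id}.{ext}."""
    return f"{stem}-{index:03d}-{clip_id}.{ext}"


def pick_title_xml(bdmv_dir: Path) -> Path | None:
    """Choose the metadata file to read the disc title from.

    Order follows the description: bdmt_eng.xml first, then any bdmt_*.xml.
    Returning None signals that the caller should fall back to the folder name.
    """
    dl_dir = bdmv_dir / "META" / "DL"  # assumed location relative to BDMV
    if not dl_dir.is_dir():
        return None
    preferred = dl_dir / "bdmt_eng.xml"
    if preferred.is_file():
        return preferred
    # Sorting makes the choice among other languages deterministic (an assumption).
    others = sorted(dl_dir.glob("bdmt_*.xml"))
    return others[0] if others else None
```

For instance, `archive_filename("disc", 7, "00012", "mkv")` yields `disc-007-00012.mkv`, and `is_episode_length(300)` is true where the old ">10 min" wording would have suggested otherwise.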

JSON schema example:

  • Add in_time, out_time, segment_key to play items
  • Add play_item_ref, duration_ms to chapters
  • Add top-level streams array on playlists
  • Add segments array on episodes (alongside scenes)
  • Add context on warnings
  • Document conditional chapter_start field on special features
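One way to keep a schema example from drifting again is to check sample objects against the field names listed above. The sketch below is hypothetical: the field names come from this description, but the object-kind labels (`play_item`, `chapter`, and so on), the helper `missing_fields`, and the sample values are illustrative assumptions and do not reflect the tool's real output.

```python
# Field names taken from the list above; the kind labels are illustrative only.
ADDED_FIELDS: dict[str, set[str]] = {
    "play_item": {"in_time", "out_time", "segment_key"},
    "chapter": {"play_item_ref", "duration_ms"},
    "playlist": {"streams"},
    "episode": {"segments"},
    "warning": {"context"},
}

# chapter_start is described as conditional, so it is not required here.
CONDITIONAL_FIELDS: dict[str, set[str]] = {
    "special_feature": {"chapter_start"},
}


def missing_fields(kind: str, obj: dict) -> set[str]:
    """Return the newly documented fields that `obj` lacks for its kind."""
    return ADDED_FIELDS.get(kind, set()) - set(obj)


# Placeholder values; the real types and formats are not stated in this PR.
sample_chapter = {"play_item_ref": 0}
print(missing_fields("chapter", sample_chapter))
```

A check like this could run over the README's JSON example in a test, so an omitted field fails loudly instead of waiting for the next manual review.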

Confidence table:

  • Add IG chapter marks boost to all strategies (code is unconditional)
  • Add Title-hint collapse (0.85) and Variant-dedup collapse (0.85) rows
  • Note all boosts capped at 1.0
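The cap on boosts can be illustrated in a few lines. This is a sketch under stated assumptions: the 0.85 collapse value and the 1.0 cap come from the list above, while the additive combination, the helper name `apply_boosts`, and the 0.1 boost magnitude are hypothetical and may differ from the actual scoring code.

```python
import math

# From the list above: both collapse paths are documented at 0.85.
COLLAPSE_CONFIDENCE = 0.85
# Hypothetical magnitude, chosen only for illustration.
EXAMPLE_IG_BOOST = 0.1


def apply_boosts(base: float, boosts: list[float]) -> float:
    """Add boosts to a base confidence, never exceeding 1.0.

    Additive combination is an assumption; only the cap is documented.
    """
    return min(1.0, base + sum(boosts))


boosted = apply_boosts(COLLAPSE_CONFIDENCE, [EXAMPLE_IG_BOOST])
assert math.isclose(boosted, 0.95)

# Two boosts would overshoot, so the result is clamped to exactly 1.0.
assert apply_boosts(COLLAPSE_CONFIDENCE, [EXAMPLE_IG_BOOST, EXAMPLE_IG_BOOST]) == 1.0
```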

Other:

  • Update explain example to show [visible]/[hidden] labels and total/visible counts
  • Add menu_visible to scan output description
  • Add digital_archive.py, remux/, util/, .github/ to project structure tree
  • Add ruff commands and version (v0.1.0) to Development section
  • Fix scene extraction description (IG chapter marks, not "title hints")

Testing

  • ruff check . — all checks passed
  • ruff format --check . — 69 files already formatted
  • pytest tests/ -q — 452 passed
  • Every README claim verified against source code by 6 independent AI models

Verify all claims against actual codebase via 6-model review (2 iterations).

Key fixes:
- Add --visible-only, --mkvmerge-path, --ffmpeg-path CLI flags
- Fix Play All threshold (>10min -> >=5min) and archive filename pattern
- Add commentary category, menu_visible field, disc title fallback chain
- Complete JSON schema example (segments, in_time, out_time, segment_key,
  play_item_ref, chapter_start, context, top-level streams)
- Fix confidence table (0.85 collapse paths, universal IG boost)
- Fix pipeline description (classify not cluster, correct step ordering)
- Fix fixture count (28 not 29), add digital_archive.py to tree
- Document BDMV parent dir auto-resolve, ffmpeg requirement

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yxbh merged commit f708598 into main on Mar 6, 2026
1 check passed
@yxbh deleted the docs/readme-accuracy-rewrite branch on March 6, 2026, 13:36