feat: replace per-repo output with a single combined markdown table #39
Produce one markdown table across all repositories, sorted by repository
name, instead of a separate block of output per repo.
Each outlier PR becomes a row: Repository | PR # | Merged | Author |
Max |Z| | Outlier Features | Title.
Repositories that do not meet the sample minimum get a single note row
in the table ("Sample limit not reached — N merged PRs, need at least M")
rather than being silently dropped or only reported on stderr.
JSON and CSV formats are also unified: results from all repositories are
combined into one array/file with a leading `repository` field, sorted
by repository name.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
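A minimal sketch of the unified JSON and CSV output described in this commit, assuming each repository's outliers arrive as a list of dicts keyed by repository name. The `combine_rows`, `to_json`, and `to_csv` names and the row keys are hypothetical and are not the functions this PR adds.

```python
import csv
import io
import json


def combine_rows(per_repo):
    """Flatten {repo_name: [row, ...]} into one list sorted by repository name.

    Hypothetical helper: every output row gets a leading `repository` key.
    """
    combined = []
    for repo_name in sorted(per_repo):
        for row in per_repo[repo_name]:
            combined.append({"repository": repo_name, **row})
    return combined


def to_json(per_repo):
    """Serialize all repositories as a single flat JSON array."""
    return json.dumps(combine_rows(per_repo), indent=2)


def to_csv(per_repo):
    """Serialize all repositories as one CSV with `repository` as the first column."""
    rows = combine_rows(per_repo)
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Sorting the repository names once before flattening keeps the ordering identical across both formats.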
Compute the maximum content width for each column across all rows, then left-pad every cell to that width. The resulting table uses standard markdown pipe syntax and aligns correctly whether viewed raw in a terminal or rendered by a markdown renderer.
Additional changes:
- Rename "Max |Z|" header to "Max Z" to avoid escaped-pipe noise
- Extract _truncate(), _fmt_row(), and _separator() helpers
- Apply per-column truncation limits (author 20, features 40, title 50, error notes 65) before width calculation so one very long value cannot make a column unreasonably wide
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f4abfdb892
    if not repo.success:
        continue
Preserve failed repositories in JSON/CSV output
`_format_combined_json` and `_format_combined_csv` silently drop every failed repository (`if not repo.success: continue`), and the new `_print_detect_results` path no longer prints a stderr failure summary for non-table formats. As a result, `--format json` and `--format csv` can return partial data with no indication of which repos failed or why. In mixed runs this is especially ambiguous: repos with zero outliers are also absent, so a failure is indistinguishable from "no outliers" in the payload itself.
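One possible way to address this, sketched under assumptions: emit one entry per repository with an explicit status, so failed, empty, and populated repositories are all distinguishable. `RepoResult` is a hypothetical stand-in for `RepoClassifyResult`; its `error` and `outliers` fields and the `format_json_with_status` function are not from the PR. This also changes the payload from a flat array of outliers to one object per repository, which may not be acceptable for existing consumers.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RepoResult:
    """Hypothetical stand-in for RepoClassifyResult (`error`, `outliers` are assumed)."""
    repo_name: str
    success: bool
    outliers: list = field(default_factory=list)
    error: str = ""


def format_json_with_status(repo_results):
    """Emit every repository, including failed ones, sorted by name."""
    payload = []
    for repo in sorted(repo_results, key=lambda r: r.repo_name):
        payload.append({
            "repository": repo.repo_name,
            "status": "ok" if repo.success else "failed",
            "error": repo.error or None,
            # A successful repo with no outliers yields an empty list,
            # which is now distinct from a failed repo.
            "outliers": repo.outliers if repo.success else [],
        })
    return json.dumps(payload, indent=2)
```

A lighter alternative that keeps the flat array would be to restore the stderr failure summary for non-table formats.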
    return RepoClassifyResult(
        repo_name=full_name,
        results=results,
        total_prs=len(results),
Avoid retaining full PR result sets for every target
`_detect_single` now stores the entire `results` list (including per-PR `z_scores`) in each `RepoClassifyResult`, and callers build `repo_results` for all targets before printing. This changes memory usage from roughly "outliers-only strings per repo" to "all analyzed PR records across all repos", which can cause large multi-repo or org-wide runs to consume excessive memory or fail before any output is produced.
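A sketch of one way to bound the retained data, assuming each PR record is a dict with a `number` and a `z_scores` mapping: keep the total count plus only the rows that qualify as outliers. `SlimRepoResult`, `summarize`, the record shape, and the `threshold=3.0` default are all hypothetical and not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class SlimRepoResult:
    """Hypothetical result type that retains outliers only."""
    repo_name: str
    total_prs: int
    outliers: list = field(default_factory=list)


def summarize(repo_name, results, threshold=3.0):
    """Count every PR but keep only those whose largest |z| exceeds `threshold`."""
    total = 0
    outliers = []
    for pr in results:  # `results` can be any iterable, including a generator
        total += 1
        max_z = max((abs(z) for z in pr["z_scores"].values()), default=0.0)
        if max_z > threshold:
            outliers.append({"number": pr["number"], "max_z": max_z})
    return SlimRepoResult(repo_name, total, outliers)
```

Because `summarize` consumes the iterable once, non-outlier records can be garbage-collected as soon as they are scored, while `total_prs` still supports the sample-limit note row.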
Summary
`repository` field.
Table format
Key changes
- `output.py`: added `RepoClassifyResult` dataclass; replaced `format_outlier_results` with `format_combined_results(repo_results, format_type)`
- `app.py`: `_detect_single` now returns `RepoClassifyResult` instead of `tuple[bool, str]`; `_print_detect_results` accepts the full list and delegates formatting to `format_combined_results`
Test plan
- `uv run pytest -m "not integration"`: 62 unit tests pass
- `--format json` → flat JSON array with `repository` field
- `--format csv` → CSV with leading `repository` column
🤖 Generated with Claude Code