
feat: replace per-repo output with a single combined markdown table#39

Merged
ghinks merged 2 commits into main from feat/combined-markdown-output
Mar 11, 2026

Conversation

@ghinks (Owner) commented Mar 11, 2026

Summary

  • Replaces the previous per-repository output blocks with a single markdown table that spans all repositories, sorted alphabetically by repository name.
  • Repositories that cannot be classified because they have too few PRs now appear as a note row in the table rather than being relegated to stderr.
  • JSON and CSV output formats are also unified: results from all repositories are combined into one payload with a leading repository field.

Table format

| Repository | PR # | Merged | Author | Max \|Z\| | Outlier Features | Title |
|---|---|---|---|---|---|---|
| `owner/repo-a` | #123 | 2024-06-15 | alice | 3.45 | additions, code_churn | Big feature |
| `owner/repo-a` | #115 | 2024-06-01 | bob | 2.12 | review_duration | Quick merge |
| `owner/repo-b` | — | — | — | — | Sample limit not reached — 18 merged PRs, need at least 30 | — |

**2 outlier(s)** found across 87 PRs in 1 classified repo(s).
**1 repo(s)** could not be classified (see above).

Key changes

  • output.py: added RepoClassifyResult dataclass; replaced format_outlier_results with format_combined_results(repo_results, format_type)
  • app.py: _detect_single now returns RepoClassifyResult instead of tuple[bool, str]; _print_detect_results accepts the full list and delegates formatting to format_combined_results
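A minimal sketch of how the pieces named above might fit together. Only `RepoClassifyResult`, `format_combined_results(repo_results, format_type)`, and the fields `repo_name`, `results`, `total_prs`, and `success` appear in this PR; the `error` field, the dict-shaped PR entries, and the defaults are assumptions made for illustration, and only the JSON branch is shown.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RepoClassifyResult:
    repo_name: str
    success: bool = True
    # Assumption: each entry is a dict describing one outlier PR.
    results: list[dict] = field(default_factory=list)
    total_prs: int = 0
    # Hypothetical field holding the note-row text for unclassified repos.
    error: str = ""


def format_combined_results(repo_results, format_type):
    """Combine results from all repositories into one payload (JSON only here)."""
    # Sort by repository name so output order is stable across runs.
    ordered = sorted(repo_results, key=lambda r: r.repo_name)
    if format_type == "json":
        rows = [
            {"repository": repo.repo_name, **pr}  # leading repository field
            for repo in ordered
            if repo.success
            for pr in repo.results
        ]
        return json.dumps(rows, indent=2)
    raise ValueError(f"unsupported format in this sketch: {format_type}")


repos = [
    RepoClassifyResult("owner/repo-b", success=False, error="Sample limit not reached"),
    RepoClassifyResult("owner/repo-a", results=[{"pr": 123, "author": "alice"}], total_prs=87),
]
payload = format_combined_results(repos, "json")
```

Returning a structured object from `_detect_single` and formatting once at the end is what makes cross-repository sorting possible; the real implementation may carry more fields than shown here.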

Test plan

  • uv run pytest -m "not integration" — 62 unit tests pass
  • Single repo with outliers → markdown table with outlier rows
  • Single repo with no outliers → table header + summary line only
  • Single repo below sample minimum → note row in table
  • Multiple repos → rows from all repos in one table, sorted by repo name
  • --format json → flat JSON array with repository field
  • --format csv → CSV with leading repository column

🤖 Generated with Claude Code

ghinks and others added 2 commits March 11, 2026 07:13
Produce one markdown table across all repositories, sorted by repository
name, instead of a separate block of output per repo.

Each outlier PR becomes a row: Repository | PR # | Merged | Author |
Max |Z| | Outlier Features | Title.

Repositories that do not meet the sample minimum get a single note row
in the table ("Sample limit not reached — N merged PRs, need at least M")
rather than being silently dropped or only reported on stderr.

JSON and CSV formats are also unified: results from all repositories are
combined into one array/file with a leading `repository` field, sorted
by repository name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Compute the maximum content width for each column across all rows, then
left-pad every cell to that width. The resulting table uses standard
markdown pipe syntax and aligns correctly whether viewed raw in a
terminal or rendered by a markdown renderer.

Additional changes:
- Rename "Max |Z|" header to "Max Z" to avoid escaped-pipe noise
- Extract _truncate(), _fmt_row(), and _separator() helpers
- Apply per-column truncation limits (author 20, features 40, title 50,
  error notes 65) before width calculation so one very long value cannot
  make a column unreasonably wide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4abfdb892

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +107 to +108
    if not repo.success:
        continue


P2: Preserve failed repositories in JSON/CSV output

_format_combined_json/_format_combined_csv silently drop every failed repository (if not repo.success: continue), and the new _print_detect_results path no longer prints a stderr failure summary for non-table formats, so --format json/csv can return partial data with no indication of which repos failed or why. In mixed runs this is especially ambiguous because repos with zero outliers are also absent, making failures indistinguishable from “no outliers” in the payload itself.
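One way to address this, sketched under assumptions: collect failed repositories into their own list instead of skipping them. The function name `format_combined_json_with_failures`, the `error` field, and the `outliers`/`failed` keys are all hypothetical and not part of this PR; the dataclass is a minimal stand-in.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RepoClassifyResult:  # minimal stand-in for the PR's dataclass
    repo_name: str
    success: bool = True
    results: list[dict] = field(default_factory=list)
    error: str = ""  # hypothetical: why classification failed


def format_combined_json_with_failures(repo_results):
    """Emit outliers and failed repositories side by side (hypothetical shape)."""
    outliers, failed = [], []
    for repo in sorted(repo_results, key=lambda r: r.repo_name):
        if not repo.success:
            # Record the failure instead of silently dropping the repo.
            failed.append({"repository": repo.repo_name, "reason": repo.error})
            continue
        outliers.extend({"repository": repo.repo_name, **pr} for pr in repo.results)
    return json.dumps({"outliers": outliers, "failed": failed}, indent=2)
```

Wrapping the payload in an object changes the flat-array shape described in the test plan, so consumers would need updating; printing a failure summary to stderr for non-table formats would be a less invasive alternative.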


Comment on lines +203 to +206
    return RepoClassifyResult(
        repo_name=full_name,
        results=results,
        total_prs=len(results),


P2: Avoid retaining full PR result sets for every target

_detect_single now stores the entire results list (including per-PR z_scores) in each RepoClassifyResult, and callers build repo_results for all targets before printing. This changes memory usage from roughly “outliers-only strings per repo” to “all analyzed PR records across all repos,” which can cause large multi-repo/org runs to consume excessive memory or fail before output is produced.
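A possible mitigation, as a sketch only: keep the total count but retain just the outlier records. The `PRResult` type, its `is_outlier` flag, and the `slim_result` helper are hypothetical; the PR does not show how a result marks itself as an outlier.

```python
from dataclasses import dataclass


@dataclass
class PRResult:  # hypothetical per-PR record
    number: int
    is_outlier: bool
    z_scores: dict[str, float]


@dataclass
class RepoClassifyResult:  # minimal stand-in for the PR's dataclass
    repo_name: str
    results: list[PRResult]
    total_prs: int


def slim_result(full_name, results):
    """Keep only outlier records, but remember how many PRs were analyzed."""
    outliers = [r for r in results if r.is_outlier]
    return RepoClassifyResult(
        repo_name=full_name,
        results=outliers,        # non-outlier records can be garbage collected
        total_prs=len(results),  # count taken before filtering
    )
```

With this shape, memory grows with the number of outliers rather than the number of analyzed PRs, while the summary line can still report the full PR count.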


@ghinks ghinks merged commit 1aca85f into main Mar 11, 2026
1 check passed
@ghinks ghinks deleted the feat/combined-markdown-output branch March 11, 2026 13:00