Merged
27 commits
e7ab8de
chore: add go.mod and go.sum for module support
fedir May 24, 2026
68fb869
fix: replace deprecated ioutil with io/os, switch auth to GH_TOKEN Be…
fedir May 24, 2026
15e67b4
refactor: remove goto, fix bool comparisons, drop redundant fmt.Sprintf
fedir May 24, 2026
49c67a7
ci: replace Travis CI with GitHub Actions
fedir May 24, 2026
a6a1f4f
chore: remove Travis CI config
fedir May 24, 2026
8bd12c8
docs: add AGENTS.md with project guide, CLAUDE.md refs it
fedir May 24, 2026
88f7816
feat: make -f output path required, auto-create stats/ dir
fedir May 24, 2026
cfae0dd
chore: add .env.sample, exclude .env from git
fedir May 24, 2026
2d171a3
feat: load .env automatically on startup if present
fedir May 24, 2026
2fd1373
fix: include URL in rate limit error message
fedir May 24, 2026
3b12f8d
chore: add Makefile with build, test, cache and run targets
fedir May 24, 2026
471f230
refactor: move go.test.sh coverage logic into Makefile test target
fedir May 24, 2026
ea6758a
docs: update README and AGENTS.md to reflect current tooling
fedir May 24, 2026
4acf421
feat: add per-repo progress logging with timing and overall counter
fedir May 24, 2026
f1b7221
fix: cap stats/contributors retries at 10, handle empty response in R…
fedir May 24, 2026
4879edd
feat: make HTTP timeout configurable via GH_HTTP_TIMEOUT, default 30s
fedir May 24, 2026
52e9f28
fix: init HTTP client lazily so GH_HTTP_TIMEOUT from .env is respected
fedir May 24, 2026
61fe484
feat: make stats retry count configurable via GH_STATS_MAX_RETRIES, d…
fedir May 24, 2026
e605f6b
feat: add GH_STATS_RETRY_INTERVAL env var to configure stats retry sleep
fedir May 24, 2026
9c61b35
chore: set default GH_STATS_RETRY_INTERVAL to 10s
fedir May 24, 2026
1adc189
perf: warm up stats/contributors early to avoid 202 retry delays
fedir May 24, 2026
15836c7
feat: hybrid analysis — always merge API data with local git stats
fedir May 24, 2026
37924a5
chore: update go_microservice_toolkits.csv with hybrid analysis data
fedir May 24, 2026
5ad963f
docs: document hybrid analysis, localstat package, and new env vars
fedir May 24, 2026
4654ff3
test: add competition, files, localstat, and github contributor tests
fedir May 24, 2026
41a6f76
chore: update all stats and modernize framework lists for Go, Rust, J…
fedir May 24, 2026
f776435
chore: add fresh API cache, gitignore test_data/projects/ and result.csv
fedir May 24, 2026
4 changes: 4 additions & 0 deletions .env.sample
@@ -0,0 +1,4 @@
GH_TOKEN=your_github_token_here
GH_HTTP_TIMEOUT=30
GH_STATS_MAX_RETRIES=5
GH_STATS_RETRY_INTERVAL=10
22 changes: 22 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,22 @@
name: CI

on:
  push:
    branches: ["**"]
  pull_request:
    branches: [master]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
      - run: go build ./...
      - run: go vet ./...
      - run: go test -race -coverprofile=coverage.txt ./...
      - uses: codecov/codecov-action@v4
        with:
          files: coverage.txt
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,5 +1,8 @@
tmp/
test_data/projects/
ghstat
result.csv
.DS_Store
*.code-workspace
.vscode/*
.env
14 changes: 0 additions & 14 deletions .travis.yml

This file was deleted.

114 changes: 114 additions & 0 deletions AGENTS.md
@@ -0,0 +1,114 @@
# ghstat — Agent & Developer Guide

## Project overview

CLI tool for multi-criteria statistical comparison of GitHub repositories. Combines GitHub REST API data with local git clone analysis, scores each repo across ~10 criteria, and writes ranked results to a CSV file.

## Tech stack

- **Language:** Go 1.26.3
- **Module:** `github.com/fedir/ghstat`
- **Dependencies:** `github.com/tidwall/gjson`, `github.com/joho/godotenv`
- **CI:** GitHub Actions (`.github/workflows/ci.yml`)

## Repository layout

```
ghstat.go                  # main: CLI flags, .env loading, goroutine fan-out
struct.go                  # Repository struct with CSV header tags
data.go                    # per-repo data fetching + hybrid API/local merge
competition.go             # scoring and ranking logic
files.go                   # CSV output, HTTP cache clearing
github/                    # GitHub API client
  repository.go
  contributors.go
  author.go
  limits.go
  statisitcs_contributors.go
  repository_language_sorting.go
localstat/                 # local git clone analysis (authoritative commit history)
  localstat.go
httpcache/                 # file-based HTTP response cache (SHA-256 keyed)
  httpcache.go
timing/                    # Unix timestamp → relative minutes helper
bin/                       # shell scripts for per-category comparisons
stats/                     # output CSV files (committed, updated by runs)
test_data/                 # cached API responses used by tests (no network)
tmp/projects/              # local git clones (gitignored, can be several GB)
Makefile                   # all developer commands
.env                       # local secrets (gitignored)
.env.sample                # template for .env
```

## Authentication

Copy `.env.sample` to `.env` and set your token:

```bash
cp .env.sample .env
# GH_TOKEN=your_github_token_here
```

The token needs `repo` scope. The app loads `.env` automatically on startup. A shell environment variable takes precedence over `.env`.

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `GH_TOKEN` | — | GitHub personal access token (required) |
| `GH_HTTP_TIMEOUT` | `30` | HTTP request timeout in seconds |
| `GH_STATS_MAX_RETRIES` | `5` | Max retries when GitHub returns 202 (computing stats) |
| `GH_STATS_RETRY_INTERVAL` | `10` | Seconds between retries for 202 responses |

## Hybrid analysis

Each repository is analysed from two sources, merged into one record:

| Data | Source | Why |
|------|--------|-----|
| Stars, forks, open issues, license | GitHub API | Real-time metadata |
| Author, location, followers | GitHub API | User profile data |
| Closed issues, tags, contributors | GitHub API | GitHub-specific concepts |
| **TotalCommits, Additions, Deletions** | **Local git** | `git rev-list` / `git log --numstat` — authoritative |
| **MediumCommitSize** | **Local git** | Derived from above |
| **AverageContributionPeriod** | **Local git** | Per-author first/last commit date span |
| **ReturningContributors** | **Local git** | Authors active in >4 distinct ISO weeks |
| **CommitsByDay** | **Local git** | More accurate commit count / repo age |

On first run, each repo is fully cloned to `tmp/projects/<owner>_<repo>/`. On subsequent runs the clone is updated (`git fetch origin` + `git reset --hard origin/HEAD`). If the clone fails, API data is kept as-is.

## Common commands

```bash
make build # compile binary
make test # run tests with race detector + coverage.txt
make vet # run go vet
make rate-limit # check GitHub API quota
make cache-clear # wipe HTTP response cache (preserves clones)
make clone-clear # remove local git clones in tmp/projects/
make run-go # Go frameworks → stats/go_frameworks.csv
make run-go-microservices # Go microservice toolkits
make run-all # all categories via bin/build_all.sh
make clean # remove binary and tmp/ (including clones in tmp/projects/)
make help # list all targets
```

Manual run:

```bash
./ghstat -r owner/repo1,owner/repo2 -f stats/output.csv -t tmp
```

## Output

- `-f` is required — no default output path
- `stats/` is created automatically if missing
- Results are written as CSV; headers come from `header` struct tags on `Repository`

## HTTP cache

Responses cached under `-t` folder (default `test_data/`), keyed by SHA-256 of the URL. Cache is permanent until cleared with `make cache-clear` or `-cc`. Error responses (403/404) are not cached. The `stats/contributors` endpoint returns 202 when GitHub is computing stats; the client retries up to `GH_STATS_MAX_RETRIES` times with `GH_STATS_RETRY_INTERVAL` second delays. Local git stats always override the API result when available, so 202-forever repos are handled correctly.

## Conventional commits

`feat:`, `fix:`, `refactor:`, `chore:`, `ci:`, `docs:`, `test:` — single-line messages only.
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -0,0 +1 @@
See [AGENTS.md](AGENTS.md) for full project documentation.
57 changes: 57 additions & 0 deletions Makefile
@@ -0,0 +1,57 @@
.PHONY: build test vet lint clean cache-clear clone-clear rate-limit run-go run-go-microservices run-all help

BINARY := ghstat
STATS_DIR := stats
CACHE_DIR := tmp

PKGS := $(shell go list ./... 2>/dev/null | grep -v '/tmp/' | grep -v '/test_data/')

## build: compile the binary
build:
	go build -o $(BINARY) .

## test: run tests with race detector and coverage
test:
	@echo "" > coverage.txt
	@for d in $(PKGS); do \
		go test -race -coverprofile=profile.out -covermode=atomic $$d; \
		if [ -f profile.out ]; then cat profile.out >> coverage.txt && rm profile.out; fi \
	done

## vet: run go vet
vet:
	go vet $(PKGS)

## clean: remove binary and tmp/ (API cache and local clones in tmp/projects/)
clean:
	rm -f $(BINARY)
	rm -rf $(CACHE_DIR)

## cache-clear: clear HTTP response cache (preserves local clones)
cache-clear: build
	./$(BINARY) -cc -t $(CACHE_DIR)

## clone-clear: remove locally cloned repositories (tmp/projects/ can be several GB)
clone-clear:
	rm -rf $(CACHE_DIR)/projects

## rate-limit: show current GitHub API rate limit status
rate-limit: build
	./$(BINARY) -l

## run-go: fetch and rank Go frameworks
run-go: build
	./$(BINARY) -f $(STATS_DIR)/go_frameworks.csv -t $(CACHE_DIR)

## run-go-microservices: fetch and rank Go microservice toolkits
run-go-microservices: build
	./$(BINARY) -r koding/kite,nytimes/gizmo,micro/go-micro,rsms/gotalk,gocircuit/circuit,go-kit/kit \
		-f $(STATS_DIR)/go_microservice_toolkits.csv -t $(CACHE_DIR)

## run-all: run all framework/CMS comparisons
run-all: build
	bash bin/build_all.sh

## help: show this help
help:
	@grep -E '^## ' Makefile | sed 's/^## //' | column -t -s ':'
105 changes: 68 additions & 37 deletions README.md
@@ -1,64 +1,95 @@
# ghstat

[![Build Status](https://travis-ci.org/fedir/ghstat.svg?branch=master)](https://travis-ci.org/fedir/ghstat)
[![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/fedir/ghstat/badges/quality-score.png?b=master)](https://scrutinizer-ci.com/g/fedir/ghstat/?branch=master)
[![CI](https://github.com/fedir/ghstat/actions/workflows/ci.yml/badge.svg)](https://github.com/fedir/ghstat/actions/workflows/ci.yml)
[![Go Report Card](https://goreportcard.com/badge/github.com/fedir/ghstat)](https://goreportcard.com/report/github.com/fedir/ghstat)
[![Maintainability](https://api.codeclimate.com/v1/badges/572b4413f5c5ebf49e36/maintainability)](https://codeclimate.com/github/fedir/go-github-statistics/maintainability)
[![codecov](https://codecov.io/gh/fedir/ghstat/branch/master/graph/badge.svg)](https://codecov.io/gh/fedir/ghstat)
[![GoDoc](https://godoc.org/github.com/fedir/ghstat?status.svg)](https://godoc.org/github.com/fedir/ghstat)
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

Statistical multi-criteria decision-making comparator for selected Github's projects.
Statistical multi-criteria decision-making comparator for GitHub projects. Combines GitHub REST API data with local git clone analysis for accurate historical commit statistics.

Project's overview was given on Open Source Summit Europe 2018 "Methodology of Multi-Criteria Comparison and Typology of Open Source Projects" - https://events.linuxfoundation.org/wp-content/uploads/2017/12/Methodology-of-Multi-Criteria-Comparison-and-Typology-of-Open-Source-Project-Fedir-Rykhtik-Stratis-1.pdf
Project overview was presented at Open Source Summit Europe 2018 — ["Methodology of Multi-Criteria Comparison and Typology of Open Source Projects"](https://events.linuxfoundation.org/wp-content/uploads/2017/12/Methodology-of-Multi-Criteria-Comparison-and-Typology-of-Open-Source-Project-Fedir-Rykhtik-Stratis-1.pdf).

## Getting started

Installation instruction:
**1. Generate a GitHub token**

* Generate a token for Your GitHub account: https://github.com/settings/tokens
* Select following scope: `repo` and all it's sub-scopes
* Build the app
* Configure the project with environment variables
* Launch
Go to https://github.com/settings/tokens and create a token with `repo` scope.

go get -u -v github.com/fedir/ghstat
cd [package location]
go build
mkdir tmp
export GH_USR="your_gh_username" && export GH_PASS="your_gh_api_token"
./ghstat
**2. Clone and configure**

The project contains already some data received from Github API for local testing and debugging, but You could update it in the following way:
```bash
git clone https://github.com/fedir/ghstat
cd ghstat
cp .env.sample .env
# edit .env and set your token
```

./ghstat --cc
bash bin/build_all.sh
**3. Build and run**

If You have timeouts, You could check the rate limit with :
```bash
make build
make run-go
```

./ghstat -l
Output is written to `stats/go_frameworks.csv`.

Usage example to compare most famous JS frameworks
## Usage

./ghstat -r angular/angular,facebook/react,vuejs/vue
```bash
make help # list all available commands
make rate-limit # check GitHub API quota
make cache-clear # wipe HTTP response cache (preserves local clones)
make clone-clear # remove local git clones in tmp/projects/
make run-go # compare Go frameworks
make run-go-microservices # compare Go microservice toolkits
make run-all # run all comparisons
make test # run tests with coverage
```

Usage example to compare most famous PHP frameworks
Custom comparison:

./ghstat -r laravel/framework,symfony/symfony,yiisoft/yii2,bcit-ci/CodeIgniter
```bash
./ghstat -r angular/angular,facebook/react,vuejs/vue -f stats/js.csv -t tmp
```

After that, `result.csv` file will be created (or updated, if it's already exists) with the statistics of selected repositories.
## Flags

## Comparaison methodology
| Flag | Default | Description |
|------|---------|-------------|
| `-r` | Go frameworks | Comma-separated list of `owner/repo` |
| `-f` | *(required)* | Output CSV file path |
| `-t` | `test_data` | Cache folder |
| `-l` | | Check GitHub rate limit |
| `-cc` | | Clear HTTP cache |
| `-ccdr` | | Dry-run cache clear |
| `-d` | | Debug mode |

At the moment We choosed following metrics, here they are, in alphabetical order :
## How it works

* Active forkers percentage - more is better
* Age in days - newest is better :)
* Closed issues, % - more is better
* Watchers - more is better
* Total commits - more is better
* More precisely, it's total commits by existing contributors, commits of deleted accounts, will not be taken in account
Each repository is analysed from two sources:

- **GitHub API** — real-time data: stars, forks, issues, license, author profile, closed issues, tags, contributors
- **Local git clone** — authoritative history: commit count, additions/deletions, commit size, contribution period, returning contributors

On first run repos are cloned to `tmp/projects/`. On subsequent runs the clones are updated. Local stats override API stats when available, so repositories where GitHub's stats API returns 202 (inactive repos) still get accurate data.

## Comparison methodology

Each repository is scored across these criteria (more is better unless noted):

- **Stargazers** — popularity
- **Age** — newest is better
- **Total commits** — activity (from local git)
- **Closed issues %** — maintenance quality
- **Commits/day** — development pace (from local git)
- **Top 10 contributors followers** — community notability
- **Active forkers %** — engagement
- **Returning contributors** — project retention (from local git)
- **Average contribution period** — contributor loyalty (from local git)
- **Total releases** — release cadence

A final overall placement is computed by summing individual rankings.
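
As a rough illustration of rank summing, the sketch below ranks repositories on two of the criteria (stargazers, where more is better, and age, where newest is better) and orders them by the sum of their positions. The `repo` type, the function names, and the tie handling (stable sort, no shared ranks) are assumptions; the actual logic lives in `competition.go` and covers all criteria.

```go
package main

import (
	"fmt"
	"sort"
)

// repo is a hypothetical, reduced record with two criteria.
type repo struct {
	Name      string
	Stars     int
	AgeDays   int
	Placement int // sum of per-criterion ranks; lower is better
}

// addRanks sorts a copy of repos with better and adds each repo's
// 1-based position to its Placement.
func addRanks(repos []*repo, better func(a, b *repo) bool) {
	sorted := append([]*repo(nil), repos...)
	sort.SliceStable(sorted, func(i, j int) bool { return better(sorted[i], sorted[j]) })
	for pos, r := range sorted {
		r.Placement += pos + 1
	}
}

// overall ranks on every criterion, then orders by summed rank.
func overall(repos []*repo) []*repo {
	addRanks(repos, func(a, b *repo) bool { return a.Stars > b.Stars })     // more is better
	addRanks(repos, func(a, b *repo) bool { return a.AgeDays < b.AgeDays }) // newest is better
	out := append([]*repo(nil), repos...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Placement < out[j].Placement })
	return out
}

func main() {
	repos := []*repo{
		{Name: "a", Stars: 100, AgeDays: 3000},
		{Name: "b", Stars: 50, AgeDays: 1000},
		{Name: "c", Stars: 10, AgeDays: 2000},
	}
	for i, r := range overall(repos) {
		fmt.Printf("%d. %s (rank sum %d)\n", i+1, r.Name, r.Placement)
	}
}
```

Here `a` ranks 1st on stars and 3rd on age (sum 4), `b` ranks 2nd and 1st (sum 3), and `c` ranks 3rd and 2nd (sum 5), so the final order is `b`, `a`, `c`.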

## Ratings

[Detailed statistics with ratings made with ghstat](https://github.com/fedir/ghstat/blob/master/ratings.md)
[Detailed statistics with ratings](https://github.com/fedir/ghstat/blob/master/ratings.md)
18 changes: 8 additions & 10 deletions bin/all_cms.sh
@@ -1,14 +1,12 @@
echo "## Cross-language CMS rating"
echo ""
./ghstat -r \
dotCMS/core,alkacon/opencms-core,gentics/mesh,Softmotions/ncms,liferay/liferay-portal,\
bogeblad/infoglue,nuxeo/nuxeo,lutece-platform/lutece-core,alkacon/opencms-core,exoplatform/ecms,\
Victoire/victoire,backbee/backbee-php,bolt/bolt,concrete5/concrete5,contao/core,\
forkcms/forkcms,getgrav/grav,joomla/joomla-cms,octobercms/october,pagekit/pagekit,redkite-labs/RedKiteCms,roadiz/roadiz,sulu/sulu-standard,\
spip/SPIP,neos/neos-development-collection,WordPress/WordPress,modxcms/revolution,novius-os/novius-os,\
LavaLite/cms,picocms/Pico,daylightstudio/FUEL-CMS,thelia/thelia,typicms/base,AsgardCms/Platform,odirleiborgert/borgert-cms,redaxscript/redaxscript,getkirby/starterkit,processwire/processwire,\
symfony-cmf/symfony-cmf,zikula/core,TYPO3/TYPO3.CMS,drupal/drupal,\
keystonejs/keystone,Dynalon/mdwiki,directus/directus,strapi/strapi,netlify/netlify-cms,apostrophecms/apostrophe\
-f stats/all_cms.csv
WordPress/WordPress,drupal/drupal,joomla/joomla-cms,getgrav/grav,craftcms/cms,statamic/cms,octobercms/october,\
TYPO3/TYPO3.CMS,concrete5/concrete5,neos/neos-development-collection,processwire/processwire,\
contao/core,modxcms/revolution,getkirby/starterkit,picocms/Pico,forkcms/forkcms,zikula/core,sulu/sulu-standard,\
keystonejs/keystone,directus/directus,strapi/strapi,netlify/netlify-cms,apostrophecms/apostrophe,\
dotCMS/core,alkacon/opencms-core,gentics/mesh,\
nuxeo/nuxeo,lutece-platform/lutece-core,exoplatform/ecms \
-f stats/all_cms.csv
echo "[Detailed cross-language CMS statistics with ratings](https://github.com/fedir/ghstat/blob/master/stats/all_cms.csv)"
echo ""
echo ""