feat: usage spend analytics, repo graph overview + TUI tabs, read mod…#34

Merged
gh-assistance[bot] merged 2 commits into main from feat/usage-graph-readmodes-filters
Jun 24, 2026

Conversation

@juninmd

@juninmd juninmd commented Jun 24, 2026

Owner

…es, filter wave 16

Pillar A — tokenix usage: absolute token spend + ≈USD cost from agent transcripts (daily/weekly/monthly/session/model/project, 5-hour blocks with burn rate, month-end forecast, --cost-mode, --statusline, --json). New src/usage.rs + shared src/transcripts.rs (conversation_audit refactored to reuse it); gain.rs ModelPrice extended with output/cache rates + price_for / usage_cost helpers.
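As a rough illustration of how a per-token price table turns token counts into an approximate USD figure, here is a minimal sketch. The struct layout, the field names, and the per-million-token unit are assumptions made for this example; the actual `ModelPrice` and `usage_cost` in gain.rs may be shaped differently.

```rust
/// Hypothetical price entry in USD per one million tokens.
/// Field names are illustrative and need not match gain.rs.
#[derive(Debug, Clone, Copy)]
pub struct ModelPrice {
    pub input: f64,
    pub output: f64,
    pub cache_read: f64,
    pub cache_write: f64,
}

/// Token counts for one assistant turn, mirroring the transcript `usage` fields.
#[derive(Debug, Clone, Copy, Default)]
pub struct TokenUsage {
    pub input: u64,
    pub output: u64,
    pub cache_read: u64,
    pub cache_write: u64,
}

/// Approximate USD cost: each token class is billed at its own rate.
pub fn usage_cost(price: &ModelPrice, usage: &TokenUsage) -> f64 {
    const PER_MILLION: f64 = 1_000_000.0;
    (usage.input as f64 * price.input
        + usage.output as f64 * price.output
        + usage.cache_read as f64 * price.cache_read
        + usage.cache_write as f64 * price.cache_write)
        / PER_MILLION
}
```

Any rates passed in are the caller's responsibility; the sketch deliberately ships no real prices.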

Pillar B — tokenix graph: repo-wide hotspots (god nodes, bottlenecks, blast-radius leaders) + Graphviz DOT export (graph.rs repo_hotspots / format_repo_report / format_edges_dot). New Usage and Graph dashboard tabs.
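To make the DOT export and the hotspot idea concrete, the sketch below renders a list of directed edges as a Graphviz digraph and ranks nodes by incoming-edge count. Both `edges_to_dot` and `top_by_in_degree` are hypothetical helpers written for this illustration; they are not the `format_edges_dot` or `repo_hotspots` functions from graph.rs, and using in-degree as the hotspot metric is an assumption.

```rust
use std::collections::BTreeMap;

/// Escape backslashes and double quotes so a name is safe inside a DOT string.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render directed edges as a Graphviz DOT digraph (hypothetical helper).
pub fn edges_to_dot(edges: &[(&str, &str)]) -> String {
    let mut out = String::from("digraph repo {\n");
    for &(from, to) in edges {
        out.push_str(&format!("    \"{}\" -> \"{}\";\n", escape(from), escape(to)));
    }
    out.push_str("}\n");
    out
}

/// Rank nodes by number of incoming edges, highest first (ties by name).
pub fn top_by_in_degree(edges: &[(&str, &str)], n: usize) -> Vec<(String, usize)> {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for &(_, to) in edges {
        *counts.entry(to).or_insert(0) += 1;
    }
    let mut ranked: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(name, count)| (name.to_string(), count))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked.truncate(n);
    ranked
}
```

The resulting string can be piped to `dot -Tsvg` to produce a picture.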

Pillar C — tokenix read --mode full|outline|signatures|diff|density:X (entropy-filtered reads).
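One way the `--mode` value could be parsed is sketched below. The `ReadMode` enum and `parse_read_mode` function are hypothetical names for this example, and treating X as a fraction in (0, 1] is an assumption inferred from the `frac: f64` parameter of `density_filter`; the real CLI may validate differently.

```rust
/// Hypothetical representation of the `--mode` argument.
#[derive(Debug, Clone, PartialEq)]
pub enum ReadMode {
    Full,
    Outline,
    Signatures,
    Diff,
    /// Fraction of the token budget to keep (assumed to lie in (0, 1]).
    Density(f64),
}

/// Parse strings such as "outline" or "density:0.3".
pub fn parse_read_mode(s: &str) -> Result<ReadMode, String> {
    match s {
        "full" => Ok(ReadMode::Full),
        "outline" => Ok(ReadMode::Outline),
        "signatures" => Ok(ReadMode::Signatures),
        "diff" => Ok(ReadMode::Diff),
        other => {
            let frac = other
                .strip_prefix("density:")
                .ok_or_else(|| format!("unknown read mode: {other}"))?
                .parse::<f64>()
                .map_err(|e| format!("invalid density fraction: {e}"))?;
            // NaN fails both comparisons and is rejected here as well.
            if frac > 0.0 && frac <= 1.0 {
                Ok(ReadMode::Density(frac))
            } else {
                Err(format!("density must be in (0, 1], got {frac}"))
            }
        }
    }
}
```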

Filter wave 16: cargo tree, npm ls, kubectl explain, ip, ss, lsof, netstat, systemctl list-* (386 filters, 800 golden cases).

Docs: README.md + AGENTS.md updated. Tests: 263 passed, fmt clean.

Claude-Session: https://claude.ai/code/session_01Vw2xCqT8ozZKw5VtWgWAAn

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces comprehensive spend analytics via the new tokenix usage command and a repo-wide symbol-graph overview with tokenix graph, both integrated as new tabs in the interactive TUI. It also enhances the tokenix read command with new modes (such as entropy-based density filtering) and adds several bundled output filters. The review feedback highlights several key improvement opportunities: supporting standard JSON files alongside JSONL in transcript parsing, robustly extracting project names from Claude paths when cwd is missing, removing a redundant day == 0 check, properly handling execution errors for git diff, and optimizing the density filter by pre-calculating line entropy to avoid redundant computations during sorting.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/usage.rs
Comment on lines +175 to +192
fn parse_file(path: &Path, out: &mut Vec<Record>, seen: &mut HashSet<String>) {
    let Ok(raw) = std::fs::read_to_string(path) else {
        return;
    };
    let session_fallback = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("?")
        .to_string();
    for line in raw.lines() {
        let Ok(v) = serde_json::from_str::<Value>(line) else {
            continue;
        };
        if let Some(rec) = record_from_value(&v, &session_fallback, seen) {
            out.push(rec);
        }
    }
}

high

Currently, parse_file reads the transcript file line-by-line and attempts to parse each line as a JSON value. While this works perfectly for JSONL files, standard .json files (which are often pretty-printed or structured as a single JSON array/object) will fail to parse entirely. Trying to parse the entire file as a single JSON value first, and falling back to line-by-line parsing if that fails, ensures robust support for both formats.

fn parse_file(path: &Path, out: &mut Vec<Record>, seen: &mut HashSet<String>) {
    let Ok(raw) = std::fs::read_to_string(path) else {
        return;
    };
    let session_fallback = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("?")
        .to_string();

    // Try parsing the entire file as a single JSON value (e.g., pretty-printed JSON or a JSON array)
    if let Ok(v) = serde_json::from_str::<Value>(&raw) {
        if let Some(arr) = v.as_array() {
            for item in arr {
                if let Some(rec) = record_from_value(item, path, &session_fallback, seen) {
                    out.push(rec);
                }
            }
        } else if let Some(rec) = record_from_value(&v, path, &session_fallback, seen) {
            out.push(rec);
        }
        return;
    }

    // Fallback to line-by-line JSONL parsing
    for line in raw.lines() {
        let Ok(v) = serde_json::from_str::<Value>(line) else {
            continue;
        };
        if let Some(rec) = record_from_value(&v, path, &session_fallback, seen) {
            out.push(rec);
        }
    }
}

Comment thread src/usage.rs
Comment on lines +194 to +266
fn record_from_value(
    v: &Value,
    session_fallback: &str,
    seen: &mut HashSet<String>,
) -> Option<Record> {
    let message = v.get("message");
    let usage = message
        .and_then(|m| m.get("usage"))
        .or_else(|| v.get("usage"))?;

    let input = u64_at(usage, "input_tokens");
    let output = u64_at(usage, "output_tokens");
    let cache_read = u64_at(usage, "cache_read_input_tokens");
    let cache_write = u64_at(usage, "cache_creation_input_tokens");
    if input + output + cache_read + cache_write == 0 {
        return None;
    }

    // Dedup replayed lines by (message id, requestId) when both are present.
    let msg_id = message
        .and_then(|m| m.get("id"))
        .and_then(|x| x.as_str())
        .unwrap_or("");
    let req_id = v.get("requestId").and_then(|x| x.as_str()).unwrap_or("");
    if !msg_id.is_empty() && !req_id.is_empty() {
        let key = format!("{msg_id}|{req_id}");
        if !seen.insert(key) {
            return None;
        }
    }

    let ts = v
        .get("timestamp")
        .and_then(|x| x.as_str())
        .and_then(parse_ts)
        .unwrap_or_else(Local::now);

    let model = message
        .and_then(|m| m.get("model"))
        .or_else(|| v.get("model"))
        .and_then(|x| x.as_str())
        .unwrap_or("unknown")
        .to_string();

    let project = v
        .get("cwd")
        .and_then(|x| x.as_str())
        .map(basename)
        .unwrap_or_else(|| "?".to_string());

    let session = v
        .get("sessionId")
        .and_then(|x| x.as_str())
        .unwrap_or(session_fallback)
        .to_string();

    let logged_cost = v
        .get("costUSD")
        .or_else(|| v.get("cost_usd"))
        .and_then(|x| x.as_f64());

    Some(Record {
        ts,
        model,
        project,
        session,
        input,
        output,
        cache_read,
        cache_write,
        logged_cost,
    })
}

high

If cwd is missing or formatted differently in the transcripts, project defaults to "?", which causes these records to be silently ignored when scoped to the current project. We can improve this by passing the transcript file's path to record_from_value and extracting the project name from the Claude projects slug (e.g., ~/.claude/projects/<slug>/...) when cwd is missing. Additionally, we can defensively parse costUSD / cost_usd from a string if it is logged as a string instead of a float.

fn record_from_value(
    v: &Value,
    path: &Path,
    session_fallback: &str,
    seen: &mut HashSet<String>,
) -> Option<Record> {
    let message = v.get("message");
    let usage = message
        .and_then(|m| m.get("usage"))
        .or_else(|| v.get("usage"))?;

    let input = u64_at(usage, "input_tokens");
    let output = u64_at(usage, "output_tokens");
    let cache_read = u64_at(usage, "cache_read_input_tokens");
    let cache_write = u64_at(usage, "cache_creation_input_tokens");
    if input + output + cache_read + cache_write == 0 {
        return None;
    }

    // Dedup replayed lines by (message id, requestId) when both are present.
    let msg_id = message
        .and_then(|m| m.get("id"))
        .and_then(|x| x.as_str())
        .unwrap_or("");
    let req_id = v.get("requestId").and_then(|x| x.as_str()).unwrap_or("");
    if !msg_id.is_empty() && !req_id.is_empty() {
        let key = format!("{msg_id}|{req_id}");
        if !seen.insert(key) {
            return None;
        }
    }

    let ts = v
        .get("timestamp")
        .and_then(|x| x.as_str())
        .and_then(parse_ts)
        .unwrap_or_else(Local::now);

    let model = message
        .and_then(|m| m.get("model"))
        .or_else(|| v.get("model"))
        .and_then(|x| x.as_str())
        .unwrap_or("unknown")
        .to_string();

    let mut project = v
        .get("cwd")
        .and_then(|x| x.as_str())
        .map(basename)
        .unwrap_or_else(|| "?".to_string());

    if project == "?" {
        // Try to extract the project name from the transcript file path
        // e.g., ~/.claude/projects/<slug>/...
        // Caveat: taking the text after the last '-' truncates project
        // directories whose own name contains a hyphen.
        let components: Vec<_> = path.components().collect();
        if let Some(pos) = components.iter().position(|c| c.as_os_str() == "projects") {
            if pos + 1 < components.len() {
                let slug = components[pos + 1].as_os_str().to_string_lossy().to_string();
                project = slug.rsplit('-').next().unwrap_or(&slug).to_string();
            }
        }
    }

    let session = v
        .get("sessionId")
        .and_then(|x| x.as_str())
        .unwrap_or(session_fallback)
        .to_string();

    let logged_cost = v
        .get("costUSD")
        .or_else(|| v.get("cost_usd"))
        .and_then(|x| {
            x.as_f64().or_else(|| {
                x.as_str().and_then(|s| s.parse::<f64>().ok())
            })
        });

    Some(Record {
        ts,
        model,
        project,
        session,
        input,
        output,
        cache_read,
        cache_write,
        logged_cost,
    })
}
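The path-based fallback can be isolated into small, testable helpers. The sketch below uses only the standard library; `project_slug` and `project_from_slug` are hypothetical names, and the assumption that the slug's last '-'-separated segment is the project name is carried over from the suggestion above rather than confirmed against the transcript layout.

```rust
use std::path::{Component, Path};

/// Return the directory name that follows a `projects` path component, if any.
pub fn project_slug(path: &Path) -> Option<String> {
    let mut comps = path.components();
    while let Some(c) = comps.next() {
        if c.as_os_str() == "projects" {
            return match comps.next() {
                Some(Component::Normal(slug)) => Some(slug.to_string_lossy().into_owned()),
                _ => None,
            };
        }
    }
    None
}

/// Guess a project name from a slug by taking the text after the last '-'.
/// Returns None when that segment is empty (for example, a trailing '-').
pub fn project_from_slug(slug: &str) -> Option<String> {
    slug.rsplit('-')
        .next()
        .filter(|s| !s.is_empty())
        .map(str::to_string)
}
```

Note the limitation this makes visible: a slug such as `-home-u-my-app` yields `app`, not `my-app`, so records from hyphenated project directories could still be scoped to the wrong name.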

Comment thread src/usage.rs
Comment on lines +361 to +375
/// Linear month-end projection from spend so far this calendar month.
fn month_forecast(records: &[Record], mode: CostMode) -> f64 {
    let now = Local::now();
    let month_cost: f64 = records
        .iter()
        .filter(|r| r.ts.year() == now.year() && r.ts.month() == now.month())
        .map(|r| r.cost(mode))
        .sum();
    let days_in_month = days_in_month(now.year(), now.month());
    let day = now.day().max(1);
    if day == 0 {
        return month_cost;
    }
    month_cost / day as f64 * days_in_month as f64
}

medium

The check if day == 0 is redundant and represents dead code because day is initialized with now.day().max(1), which guarantees that day is always at least 1.

/// Linear month-end projection from spend so far this calendar month.
fn month_forecast(records: &[Record], mode: CostMode) -> f64 {
    let now = Local::now();
    let month_cost: f64 = records
        .iter()
        .filter(|r| r.ts.year() == now.year() && r.ts.month() == now.month())
        .map(|r| r.cost(mode))
        .sum();
    let days_in_month = days_in_month(now.year(), now.month());
    let day = now.day().max(1);
    month_cost / day as f64 * days_in_month as f64
}
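The projection itself is plain arithmetic, and separating it from `Local::now()` makes it easy to check. Below is a minimal sketch under that assumption; `linear_month_forecast` is a hypothetical name, and this `days_in_month` is a stand-in written for the example, since the helper called above is not shown in the diff.

```rust
/// Days in a Gregorian calendar month, or None for an invalid month number.
pub fn days_in_month(year: i32, month: u32) -> Option<u32> {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if leap { 29 } else { 28 }),
        _ => None,
    }
}

/// Scale spend-so-far by (days in month / days elapsed).
/// The day is clamped into 1..=days_in_month so the division is always defined.
pub fn linear_month_forecast(month_cost: f64, day_of_month: u32, days_in_month: u32) -> f64 {
    let day = day_of_month.clamp(1, days_in_month.max(1));
    month_cost / f64::from(day) * f64::from(days_in_month)
}
```

For example, 30 USD spent by day 10 of a 30-day month projects to 90 USD. Early in the month the estimate is noisy, because a single expensive day is multiplied by the full month length.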

Comment thread src/main.rs
Comment on lines +2035 to +2048
        } else if m == "diff" {
            println!("{}", chunker::generate_outline(&content, &rel));
            let out = std::process::Command::new("git")
                .arg("-C")
                .arg(&repo_root)
                .args(["diff", "--", &rel])
                .output();
            match out {
                Ok(o) if !o.stdout.is_empty() => {
                    println!("\n# changed hunks");
                    println!("{}", String::from_utf8_lossy(&o.stdout));
                }
                _ => println!("\n(no uncommitted changes)"),
            }

medium

If the git command fails (e.g., if git is not installed or not in PATH), std::process::Command::output() returns an Err. Matching this as _ => println!("\n(no uncommitted changes)") is misleading as it hides the actual error. We should explicitly handle the Err and stderr cases to aid in debugging.

        } else if m == "diff" {
            println!("{}", chunker::generate_outline(&content, &rel));
            let out = std::process::Command::new("git")
                .arg("-C")
                .arg(&repo_root)
                .args(["diff", "--", &rel])
                .output();
            match out {
                Ok(o) if !o.stdout.is_empty() => {
                    println!("\n# changed hunks");
                    println!("{}", String::from_utf8_lossy(&o.stdout));
                }
                Ok(o) if !o.stderr.is_empty() => {
                    eprintln!("\n# git diff error");
                    eprintln!("{}", String::from_utf8_lossy(&o.stderr));
                }
                Err(e) => {
                    eprintln!("\n# failed to execute git: {e}");
                }
                _ => println!("\n(no uncommitted changes)"),
            }

Comment thread src/main.rs
Comment on lines +2074 to +2096
fn density_filter(lines: &[&str], frac: f64) -> String {
    let total_tokens: usize = lines.iter().map(|l| chunker::count_tokens(l)).sum();
    let budget = ((total_tokens as f64) * frac).ceil() as usize;

    let mut ranked: Vec<usize> = (0..lines.len()).collect();
    ranked.sort_by(|&a, &b| {
        let ea = line_entropy(lines[a]);
        let eb = line_entropy(lines[b]);
        eb.partial_cmp(&ea)
            .unwrap_or(std::cmp::Ordering::Equal)
            .then(lines[b].len().cmp(&lines[a].len()))
            .then(a.cmp(&b))
    });

    let mut keep = vec![false; lines.len()];
    let mut used = 0usize;
    for &i in &ranked {
        if used >= budget {
            break;
        }
        keep[i] = true;
        used += chunker::count_tokens(lines[i]).max(1);
    }

medium

In density_filter, line_entropy is called inside the sorting comparator. Sorting performs $O(N \log N)$ comparisons, and each comparison calls line_entropy twice, so the entropy of the same lines is recomputed many times (roughly $130{,}000$ comparisons, or about $260{,}000$ calls, for a 10,000-line file). Pre-calculating the entropy for each line once reduces this to exactly $N$ entropy calculations, yielding a significant speedup.

fn density_filter(lines: &[&str], frac: f64) -> String {
    let total_tokens: usize = lines.iter().map(|l| chunker::count_tokens(l)).sum();
    let budget = ((total_tokens as f64) * frac).ceil() as usize;

    let mut ranked: Vec<(usize, f64)> = lines
        .iter()
        .enumerate()
        .map(|(i, line)| (i, line_entropy(line)))
        .collect();
    ranked.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap_or(std::cmp::Ordering::Equal)
            .then(lines[b.0].len().cmp(&lines[a.0].len()))
            .then(a.0.cmp(&b.0))
    });

    let mut keep = vec![false; lines.len()];
    let mut used = 0usize;
    for &(i, _) in &ranked {
        if used >= budget {
            break;
        }
        keep[i] = true;
        used += chunker::count_tokens(lines[i]).max(1);
    }
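The diff does not show `line_entropy`, so the sketch below assumes it is Shannon entropy over the bytes of a line; the real function may use a different measure. It pairs that assumed definition with the precompute-then-sort pattern from the suggestion, in a hypothetical `rank_by_entropy` helper that returns line indices from most to least informative.

```rust
use std::cmp::Ordering;

/// Shannon entropy of the byte distribution in `line`, in bits per byte.
/// This is an assumed definition, not necessarily the one tokenix uses.
pub fn line_entropy(line: &str) -> f64 {
    if line.is_empty() {
        return 0.0;
    }
    let mut counts = [0usize; 256];
    for &b in line.as_bytes() {
        counts[b as usize] += 1;
    }
    let n = line.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Score every line once, then sort indices by the cached scores:
/// higher entropy first, longer lines next, original order last.
pub fn rank_by_entropy(lines: &[&str]) -> Vec<usize> {
    let scores: Vec<f64> = lines.iter().map(|l| line_entropy(l)).collect();
    let mut order: Vec<usize> = (0..lines.len()).collect();
    order.sort_by(|&a, &b| {
        scores[b]
            .partial_cmp(&scores[a])
            .unwrap_or(Ordering::Equal)
            .then(lines[b].len().cmp(&lines[a].len()))
            .then(a.cmp(&b))
    });
    order
}
```

Keeping the scores in a separate vector leaves the comparator as cheap indexed lookups, which is the same effect the `(usize, f64)` tuples achieve in the suggestion.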

@gh-assistance

gh-assistance Bot commented Jun 24, 2026

Contributor

Waiting for pipeline

The CI/CD pipeline is in the pending state. The merge will be performed automatically once all checks pass.

@gh-assistance

gh-assistance Bot commented Jun 24, 2026

Contributor

Automatic Review - vibe-code/opencode

Review delegated to Vibe-Code task: http://vibe-code.vibe-code.svc.cluster.local:3000/tasks/76c6f162243a4d24


🤖 Automated Origin

@gh-assistance gh-assistance Bot merged commit ad488d1 into main Jun 24, 2026
19 checks passed
@gh-assistance gh-assistance Bot deleted the feat/usage-graph-readmodes-filters branch June 24, 2026 20:45