Add argparse CLI for probe and analyze scripts #1

Open
devin-ai-integration[bot] wants to merge 1 commit into main from devin/1780137610-add-argparse-cli

@devin-ai-integration

Summary

Implements the CLI described in the README using argparse. The scripts were previously empty stubs.

Three entry points:

  • python scripts/probe.py — standalone probe CLI (--api-type, --api-key, --model, --endpoint, --claimed-model, -o, -v)
  • python scripts/analyze.py <results.json> — standalone analysis CLI (--claimed-model, -o, --json)
  • python cli.py {probe,analyze} — unified entry point with subcommands that delegates to the above
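
A minimal sketch of how the unified entry point could wire up its two subcommands with argparse. The flag names come from the list above and the commit message; the `build_parser` helper, the `dest` names, and the absence of defaults or required flags are assumptions, not the code in this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with 'probe' and 'analyze' subcommands (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    # required=True makes argparse reject an invocation with no subcommand.
    subparsers = parser.add_subparsers(dest="command", required=True)

    probe = subparsers.add_parser("probe", help="send diagnostic probes")
    probe.add_argument("--api-type")
    probe.add_argument("--api-key")
    probe.add_argument("--model")
    probe.add_argument("--endpoint")
    probe.add_argument("--claimed-model")
    probe.add_argument("-o", "--output")
    probe.add_argument("-v", "--verbose", action="store_true")

    analyze = subparsers.add_parser("analyze", help="score probe results")
    analyze.add_argument("input_file")  # positional results file
    analyze.add_argument("--claimed-model")
    analyze.add_argument("-o", "--output")
    analyze.add_argument("--json", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, `python cli.py analyze results.json --json` would parse to a namespace whose `command` is `analyze`; dispatching to the standalone scripts is left out of the sketch.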

probe.py sends 8 diagnostic probes (self-identity, knowledge cutoff, strawberry count, decimal comparison, multi-step math, logic puzzle, token awareness, coding task), measures per-probe latency, captures reasoning_tokens when present, and writes structured JSON.
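
The per-probe timing and optional reasoning_tokens capture could look roughly like the sketch below. The `run_probe` and `fake_send` names, the response shape (a dict with `text` and `usage` keys), and the output field names are all hypothetical; the real probe.py may structure its records differently.

```python
import json
import time
from typing import Any, Callable, Dict


def run_probe(name: str, prompt: str,
              send: Callable[[str], Dict[str, Any]]) -> Dict[str, Any]:
    """Send one probe, time it, and return a JSON-serializable record."""
    start = time.perf_counter()
    response = send(prompt)  # 'send' stands in for the real API call
    latency = time.perf_counter() - start

    record: Dict[str, Any] = {
        "probe": name,
        "prompt": prompt,
        "response": response.get("text", ""),
        "latency_s": round(latency, 4),
    }
    # Only record reasoning_tokens when the (assumed) usage block reports it.
    usage = response.get("usage") or {}
    if "reasoning_tokens" in usage:
        record["reasoning_tokens"] = usage["reasoning_tokens"]
    return record


def fake_send(prompt: str) -> Dict[str, Any]:
    """Stub transport so the sketch runs without network access."""
    return {"text": "3", "usage": {"reasoning_tokens": 12}}


if __name__ == "__main__":
    result = run_probe("strawberry", "How many r's are in 'strawberry'?", fake_send)
    print(json.dumps(result, indent=2))
```

Passing the transport in as a callable keeps the timing logic independent of any particular API client, which also makes it easy to test offline.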

analyze.py scores probe responses against 16+ model signatures (MODEL_SIGNATURES dict covering GPT-3.5/4/4o/o1, Claude 3/3.5/4, Gemini, Llama, Mistral), detects tier dilution, and outputs a formatted report or JSON.
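
As a rough illustration of signature scoring, the sketch below matches response text against a keyword table. The model names and keywords are invented placeholders, not entries from the PR's MODEL_SIGNATURES dict, and keyword counting is only one assumed scoring rule; tier-dilution detection is not covered here.

```python
from typing import Dict, List

# Placeholder signatures for illustration only; not real model data.
EXAMPLE_SIGNATURES: Dict[str, List[str]] = {
    "example-model-a": ["alpha", "examplecorp"],
    "example-model-b": ["beta", "samplelabs"],
}


def score_responses(responses: List[str]) -> Dict[str, float]:
    """Return, per model, the fraction of its keywords found in the responses."""
    text = " ".join(responses).lower()
    scores: Dict[str, float] = {}
    for model, keywords in EXAMPLE_SIGNATURES.items():
        hits = sum(1 for kw in keywords if kw in text)
        scores[model] = hits / len(keywords)
    return scores


def best_match(scores: Dict[str, float]) -> str:
    """Pick the model with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    scores = score_responses(["I am Alpha, built by ExampleCorp."])
    print(scores, best_match(scores))
```

A report step could then compare `best_match(scores)` with the value passed via --claimed-model, though how the PR actually makes that comparison is not stated in the description.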

Also populates references/model_signatures.md with the known model characteristics database.

Link to Devin session: https://app.devin.ai/sessions/58a92cd61d2443fc96c7612ea0273421
Requested by: @dabaibian

- Implement probe.py with argparse: --api-type, --api-key, --model,
  --endpoint, --claimed-model, --output, --verbose flags. Sends 8
  diagnostic probes (identity, cutoff, strawberry, decimal, math,
  logic, token awareness, coding) and records results as JSON.

- Implement analyze.py with argparse: positional input_file,
  --claimed-model, --output, --json flags. Scores responses against
  16+ model signatures, detects dilution, generates human-readable
  or JSON reports.

- Add cli.py as unified entry point with 'probe' and 'analyze'
  subcommands for convenience.

- Populate references/model_signatures.md with the known model
  characteristics database.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@devin-ai-integration

Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.
