Ollama Model Tester

A small, dependency-free CLI for running the same prompt against your local Ollama models and saving every response to disk — so you can compare models (or compare repeated runs of one model) side by side.

It uses only the Python standard library: no pip install required.

Requirements

- Python 3.7 or newer
- Ollama running locally (by default at http://localhost:11434)
- At least one model pulled, e.g. ollama pull llama3.1:8b
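
To check the last two requirements before running the tool, a minimal sketch like the following can ask Ollama's /api/tags endpoint which models are installed. It is an illustration, not code taken from ollama_model_test.py; the helper names model_names and list_local_models are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address listed in the requirements


def model_names(tags_json):
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_json)
    return [m["name"] for m in data.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Return installed model names; raises OSError (e.g. URLError) if Ollama is unreachable."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
        return model_names(resp.read().decode("utf-8"))
```

Calling list_local_models() with Ollama running should return names such as llama3.1:8b; an empty list suggests no model has been pulled yet.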

Quick start

Make sure Ollama is running, then:

python3 ollama_model_test.py

You'll be asked, in order:

1. Which model to use (pick a number from your installed models)
2. The prompt: type as many lines as you like, then put /done on its own line to finish
3. How many times to run the prompt
4. Temperature (0.0–2.0), or press Enter to use Ollama's default
5. Whether to stream the responses live to the terminal
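
The multi-line prompt entry described above can be sketched roughly as follows. This is one plausible way to collect lines until /done, assuming the terminator is matched after trimming whitespace; read_prompt is a hypothetical name, and the script's own implementation may differ.

```python
def read_prompt(lines, terminator="/done"):
    """Collect lines until one consists solely of the terminator; return them joined."""
    collected = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.strip() == terminator:
            break
        collected.append(line)
    return "\n".join(collected)
```

For interactive use, pass sys.stdin as the lines argument; any iterable of strings works, which keeps the function easy to test.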

It then runs the prompt the requested number of times and writes the results under ollama-runs/.
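
For orientation, a single non-streaming run against Ollama's /api/generate endpoint could look like the sketch below. It assumes the tool talks to that endpoint, which this README does not state; build_payload and generate_once are hypothetical names, and streaming is left out for brevity.

```python
import json
import urllib.request


def build_payload(model, prompt, temperature=None):
    """Build a non-streaming /api/generate request body."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if temperature is not None:
        # Omitting "options" leaves the temperature at Ollama's default.
        payload["options"] = {"temperature": temperature}
    return payload


def generate_once(payload, base_url="http://localhost:11434", timeout=600):
    """POST the payload and return Ollama's JSON reply as a dict."""
    request = urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The reply's "response" field holds the generated text; fields such as "eval_count" and "total_duration" carry the kind of run metadata described under Output. Repeating the call in a loop gives the requested number of runs.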

Command-line flags (optional)

Each of the questions above can be answered up front with a flag, which makes the tool scriptable. Anything you omit is still asked interactively.

| Flag | Description |
| --- | --- |
| `--model NAME` | Local model to use (must already be installed) |
| `--runs N` | Number of generations to run |
| `--temperature T` | Temperature, `0.0`–`2.0` |
| `--prompt-file PATH` | Read the prompt from a UTF-8 text file |
| `--stream` / `--no-stream` | Stream responses live, or don't |

Example — run a saved prompt three times, fully non-interactive:

python3 ollama_model_test.py \
  --model llama3.1:8b \
  --prompt-file prompt.txt \
  --runs 3 \
  --temperature 0.7 \
  --no-stream
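
As an illustration of how flags like these can coexist with interactive prompts, here is a sketch using argparse. It is not the script's actual parser; build_parser and temperature_type are hypothetical names. Every option defaults to None so the caller can tell "not given" apart from a real value and fall back to asking.

```python
import argparse


def temperature_type(text):
    """Parse a temperature and enforce the documented 0.0 to 2.0 range."""
    value = float(text)
    if not 0.0 <= value <= 2.0:
        raise argparse.ArgumentTypeError("temperature must be between 0.0 and 2.0")
    return value


def build_parser():
    parser = argparse.ArgumentParser(prog="ollama_model_test.py")
    parser.add_argument("--model", metavar="NAME")
    parser.add_argument("--runs", type=int, metavar="N")
    parser.add_argument("--temperature", type=temperature_type, metavar="T")
    parser.add_argument("--prompt-file", metavar="PATH")
    # Two flags share one destination; None means "ask interactively".
    stream = parser.add_mutually_exclusive_group()
    stream.add_argument("--stream", dest="stream", action="store_true", default=None)
    stream.add_argument("--no-stream", dest="stream", action="store_false", default=None)
    return parser
```

The paired store_true/store_false flags are used here instead of argparse.BooleanOptionalAction because the latter needs Python 3.9, while the requirements list Python 3.7.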

Output

Results are grouped into one folder per prompt:

ollama-runs/
  what-are-the-main-tradeoffs-between_835562a4/
    prompt.md         # the prompt, with its hash and timestamp
    metadata.json     # every run against this prompt (model, timing, options)
    llama3.1-8b.md    # responses + Ollama metadata for this model
    gemma3-1b.md

The folder name is the first few words of the prompt plus a short hash of the full prompt. Because the folder is keyed on the prompt, running the same prompt against a different model drops its output into the same folder — making model-to-model comparison easy. Each model's file records every run's response alongside Ollama's run metadata (token counts, timings, and so on).
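
The naming scheme can be approximated as in the sketch below. The word count (six), the hash algorithm (SHA-256), and the eight-character hash length are assumptions inferred from the example listing, so the hashes it produces are not expected to match the tool's; prompt_folder_name and model_file_name are hypothetical names.

```python
import hashlib
import re


def prompt_folder_name(prompt, max_words=6, hash_len=8):
    """Slug of the first few words plus a short hash of the full prompt."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())
    slug = "-".join(words[:max_words]) or "prompt"
    # Assumption: SHA-256 truncated to hash_len hex characters.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:hash_len]
    return slug + "_" + digest


def model_file_name(model):
    """Turn a model tag such as 'llama3.1:8b' into a file name like 'llama3.1-8b.md'."""
    return re.sub(r"[^A-Za-z0-9._-]+", "-", model) + ".md"
```

Because the hash covers the whole prompt, two prompts that share their opening words still land in different folders, while the same prompt always maps to the same one.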

