Model Selection Guide

Gemma CLI supports the Gemma 3 and Gemma 3n model families. Each model suits different tasks, from complex architectural reasoning to ultra-fast text processing.

You can switch models at any time during a session using the `/model` command.

🏗️ Model Tiers & Capabilities

The workstation groups models into two tiers so that each model receives an appropriate level of instruction detail (see the Tool Development Guide).

🧠 Major Models (High Logic)

Best for coding, complex system administration, and chain-of-thought reasoning.

Model ID	Label	Tool Limit	Primary Use Case
`3-27b`	Gemma 3 27B	12	The flagship model. Use for heavy logic, debugging, and complex tool chaining.
`3-12b`	Gemma 3 12B	8	The "Sweet Spot." Excellent balance of speed and reasoning depth.

⚡ Minor Models (Fast / Multimodal)

Best for quick lookups, simple file operations, and low-latency interaction.

Model ID	Label	Tool Limit	Primary Use Case
`3-4b`	Gemma 3 4B	2	Fast, multimodal-capable model. Great for image analysis and quick summaries.
`3n-e4b`	Gemma 3n E4B	2	Optimized for high-fidelity multimodal reasoning.
`3n-e2b`	Gemma 3n E2B	2	Ultra-low latency. Use when speed is more important than deep logic.
`3-1b`	Gemma 3 1B	0	Tiny, text-only model. Ideal for routing or very basic text cleanup.
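
For scripting against these tiers, the two tables can be held in a small lookup structure. The sketch below is illustrative only: `ModelInfo`, `MODELS`, and `models_with_tool_limit` are hypothetical names, not part of Gemma CLI, and the values are transcribed from the tables above. The guide does not define exactly what "Tool Limit" counts, so the code copies the number without interpreting it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    label: str
    tier: str        # "major" or "minor", matching the two tables above
    tool_limit: int  # the "Tool Limit" column, copied verbatim


# Hypothetical registry; values transcribed from the tables in this guide.
MODELS = {
    "3-27b": ModelInfo("Gemma 3 27B", "major", 12),
    "3-12b": ModelInfo("Gemma 3 12B", "major", 8),
    "3-4b": ModelInfo("Gemma 3 4B", "minor", 2),
    "3n-e4b": ModelInfo("Gemma 3n E4B", "minor", 2),
    "3n-e2b": ModelInfo("Gemma 3n E2B", "minor", 2),
    "3-1b": ModelInfo("Gemma 3 1B", "minor", 0),
}


def models_with_tool_limit(minimum: int) -> list[str]:
    """Return model IDs whose tool limit is at least `minimum`, in table order."""
    return [model_id for model_id, info in MODELS.items() if info.tool_limit >= minimum]
```

Calling `models_with_tool_limit(8)` returns `['3-27b', '3-12b']`, which matches the Major tier, and `models_with_tool_limit(1)` excludes only `3-1b`, the one model listed with a tool limit of 0.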

🛠️ How Tiers Affect Tools

Gemma CLI uses a Dynamic Tiered Guidance system. When you switch models, the workstation automatically updates the system prompt:

Major Models receive Detailed Guidance: Their system prompt covers the nuances of each tool, including recursive searching, wildcard patterns, and advanced error recovery.
Minor Models receive Simplified Guidance: They are given "Basic Use" instructions that keep the system prompt short, so they stay fast and accurate for their size.
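
The tier-to-guidance switch described above could be sketched roughly as follows. This is not the Gemma CLI implementation: `tier_of`, `build_system_prompt`, and the `GUIDANCE` strings are hypothetical stand-ins, and the real prompt text is not reproduced in this guide. Only the model IDs and tier membership come from the tables.

```python
# Tier membership as listed in this guide's tables.
MAJOR_MODELS = {"3-27b", "3-12b"}
MINOR_MODELS = {"3-4b", "3n-e4b", "3n-e2b", "3-1b"}

# Placeholder guidance text (assumption); the actual prompts live in Gemma CLI.
GUIDANCE = {
    "major": "Detailed Guidance: recursive search, wildcard patterns, error recovery.",
    "minor": "Simplified Guidance: basic use only.",
}


def tier_of(model_id: str) -> str:
    """Map a model ID to its tier, rejecting IDs this guide does not list."""
    if model_id in MAJOR_MODELS:
        return "major"
    if model_id in MINOR_MODELS:
        return "minor"
    raise ValueError(f"unknown model ID: {model_id}")


def build_system_prompt(base_prompt: str, model_id: str) -> str:
    """Append the tier-appropriate guidance to a base prompt."""
    return f"{base_prompt}\n\n{GUIDANCE[tier_of(model_id)]}"
```

Under this sketch, switching from `3-27b` to `3-4b` would only require rebuilding the prompt with the new model ID; the base prompt stays the same and the guidance section changes.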

❓ Which model should I use?

"I'm building a new tool or script": Use `3-27b`. You want maximum intelligence for writing code and understanding directory structures.
"I just want to search the web or my local files": Use `3-12b` or `3-4b`. They are faster and more than capable of these tasks.
"I need to analyze a screenshot": Use `3-4b` or `3n-e4b`. These are specifically tuned for multimodal input.
"I'm on a very limited API quota": Use `3-1b` or `3n-e2b`. They consume fewer tokens.
"I'm not sure which model to use. What's best?": Use `3-27b`.
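
The recommendations above can also be written as a lookup, which may be handy in a wrapper script that picks a model before starting a session. The need keys (`"build"`, `"search"`, and so on) and the `recommend` function are hypothetical names chosen for this sketch; only the model IDs and the fallback to `3-27b` come from this guide.

```python
# Hypothetical need keys mapped to the model IDs recommended in this guide.
RECOMMENDATIONS = {
    "build": ["3-27b"],               # building a new tool or script
    "search": ["3-12b", "3-4b"],      # web or file search
    "screenshot": ["3-4b", "3n-e4b"], # multimodal input
    "low_quota": ["3-1b", "3n-e2b"],  # limited API quota
}

# The guide's answer when the user is unsure.
DEFAULT = ["3-27b"]


def recommend(need: str) -> list[str]:
    """Return recommended model IDs for a need, falling back to the default."""
    return RECOMMENDATIONS.get(need, DEFAULT)
```

Returning a list rather than a single ID keeps the guide's "use A or B" answers intact and leaves the final choice to the caller.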
