
Add Gemma 4, LFM2, and Qwen3-Coder models to catalog #47

@weklund-agent


Summary

The catalog is missing several high-performing models that are available on mlx-community and score well on agentic coding benchmarks (per mlx_transformers_benchmark):

Model                  Params                 Quality (M4 Pro 64GB)  Gen tok/s  RAM (int4)
Qwen3-Coder-30B-A3B    30B MoE (3B active)    65.7%                  80         17.8 GB
Gemma 4 E2B-it         2.3B dense             65.3%                  121        3.5 GB
LFM2-24B-A2B           24B MoE (2.3B active)  67.3%                  117        14.2 GB

On M5 Max 128GB, Qwen3-Coder-30B-A3B is the top model at 75.5% quality / 129 tok/s.

HuggingFace sources

Qwen3-Coder-30B-A3B:

  • int4: mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit
  • int8: mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit
  • bf16: Qwen/Qwen3-Coder-30B-A3B-Instruct
  • Capabilities: tool_calling (hermes), thinking (qwen3)

Gemma 4 E2B-it:

  • int4: mlx-community/gemma-4-e2b-it-4bit
  • int8: mlx-community/gemma-4-e2b-it-8bit
  • bf16: google/gemma-4-E2B-it
  • Note: blocked by mlx_lm gemma4 arch support — see related issue

LFM2-24B-A2B:

  • int4: mlx-community/LFM2-24B-A2B-4bit
  • int8: LiquidAI/LFM2-24B-A2B-MLX-8bit (published by LiquidAI, not mlx-community)
  • bf16: LiquidAI/LFM2-24B-A2B-MLX-bf16
  • Architecture: mamba2-hybrid

Notes

  • None of these models are gated on HuggingFace
  • Gemma 4 is currently blocked by vllm-mlx's bundled mlx_lm not supporting the gemma4 architecture (separate issue)
  • All three were manually added to catalog YAML during testing and work correctly (except Gemma 4 due to the mlx_lm issue)
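For reference, the manual entry used during testing could look something like the sketch below. This is a hypothetical illustration: the field names (`sources`, `capabilities`, etc.) are assumptions, not the catalog's actual schema — adjust to whatever the real YAML format expects.

```yaml
# Hypothetical catalog entry (illustrative field names, not the real schema).
# Repo IDs and capability values are taken from the HuggingFace sources above.
- name: qwen3-coder-30b-a3b
  sources:
    int4: mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit
    int8: mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit
    bf16: Qwen/Qwen3-Coder-30B-A3B-Instruct
  capabilities:
    tool_calling: hermes
    thinking: qwen3
```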
