feat: Gemma 4 on-device inference via LiteRT-LM (Android) by sparkleMing · Pull Request #4 · memex-lab/memex

sparkleMing · 2026-04-08T04:31:08Z

Summary

Add support for running Gemma 4 models fully on-device on Android using the official LiteRT-LM Kotlin API.

What's new

Android native layer

LiteRtLmPlugin.kt: Flutter platform channel plugin wrapping LiteRT-LM. Supports queue-based inference, streaming tokens, tool calls, thinking mode, model download via OkHttp, and M4A→PCM WAV audio conversion via MediaCodec.
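To illustrate the last step of the M4A→PCM WAV conversion, here is a minimal sketch of wrapping already-decoded PCM in a WAV container. It is not taken from the PR: `wrapPcmAsWav` is a hypothetical helper, and it assumes the decoder output is 16-bit little-endian linear PCM. The MediaCodec decoding itself is omitted.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not from the PR): prepends the canonical 44-byte
// RIFF/WAVE header to raw PCM. Assumes 16-bit little-endian samples.
fun wrapPcmAsWav(pcm: ByteArray, sampleRate: Int, channels: Int): ByteArray {
    val bitsPerSample = 16
    val blockAlign = channels * bitsPerSample / 8   // bytes per sample frame
    val byteRate = sampleRate * blockAlign          // bytes per second

    val header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN)
    header.put("RIFF".toByteArray(Charsets.US_ASCII))
    header.putInt(36 + pcm.size)                    // bytes following this field
    header.put("WAVE".toByteArray(Charsets.US_ASCII))
    header.put("fmt ".toByteArray(Charsets.US_ASCII))
    header.putInt(16)                               // fmt chunk size for PCM
    header.putShort(1.toShort())                    // format tag 1 = linear PCM
    header.putShort(channels.toShort())
    header.putInt(sampleRate)
    header.putInt(byteRate)
    header.putShort(blockAlign.toShort())
    header.putShort(bitsPerSample.toShort())
    header.put("data".toByteArray(Charsets.US_ASCII))
    header.putInt(pcm.size)                         // size of the sample data

    return header.array() + pcm
}
```

The sample rate and channel count would come from the decoder's output format; the values the plugin actually uses are not stated in this description.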

Dart layer

GemmaLocalClient: LLMClient implementation using platform channels. Acquires a global inference lock before each request.
GemmaModelManager: Engine lifecycle manager. Vision/audio backends enabled strictly on demand — only when the request contains image/audio content. Engine fully torn down and rebuilt when backend config changes. Rebuild always happens after acquiring the lock to prevent teardown during active inference.
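The lock-then-rebuild ordering described above can be sketched roughly as follows. This is an illustration only: the PR's manager is written in Dart, and every name here (`BackendConfig`, `EngineManagerSketch`, `runInference`) is hypothetical. The sketch assumes that any difference in the required backends, including a return to text-only, counts as a config change.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical types; the real GemmaModelManager lives in the Dart layer.
data class BackendConfig(val vision: Boolean, val audio: Boolean)

class EngineManagerSketch {
    private val inferenceLock = ReentrantLock()
    private var active: BackendConfig? = null

    // Counts simulated teardown + rebuild cycles, for illustration only.
    var rebuildCount = 0
        private set

    fun <T> runInference(
        hasImage: Boolean,
        hasAudio: Boolean,
        infer: (BackendConfig) -> T,
    ): T = inferenceLock.withLock {
        // Backends are derived from the request content, not enabled up front.
        val needed = BackendConfig(vision = hasImage, audio = hasAudio)
        if (active != needed) {
            // Stand-in for a full engine teardown and re-initialization.
            // It runs only while the lock is held, so no other inference
            // can be in flight when the engine is replaced.
            active = needed
            rebuildCount++
        }
        infer(needed)
    }
}
```

Because the config comparison happens inside the locked region, a request that needs different backends waits for the current inference to finish before the engine is swapped.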

Provider integration

New typeGemmaLocal provider type with model list (gemma-4-e2b, gemma-4-e4b)
Model download UI in setup and settings pages (Android only)
No API key or base URL required; LLM data sharing consent skipped for on-device models

Other fixes

asset_analysis_tool: Gemma 4 uses JPEG + 896px max side to avoid LiteRT-LM patch count overflow. Non-Gemma path unchanged.
pkm_skill / timeline_card_skill: Use state.metadata factId as fallback when model-provided fact_id is unreliable.
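For the 896px cap in asset_analysis_tool, the target dimensions could be computed along these lines. This is a sketch under assumptions, not the PR's code: `fitWithinMaxSide` is a hypothetical name, and the choice to preserve aspect ratio and never upscale is assumed rather than stated in the description.

```kotlin
import kotlin.math.max
import kotlin.math.roundToInt

// Hypothetical helper: shrinks (width, height) so the longer side is at
// most maxSide, preserving aspect ratio. Smaller images are left unchanged.
fun fitWithinMaxSide(width: Int, height: Int, maxSide: Int = 896): Pair<Int, Int> {
    require(width > 0 && height > 0 && maxSide > 0) { "dimensions must be positive" }
    val longest = max(width, height)
    if (longest <= maxSide) return width to height

    val scale = maxSide.toDouble() / longest
    // Clamp to 1 so extreme aspect ratios never yield a zero-sized side.
    val scaledWidth = max(1, (width * scale).roundToInt())
    val scaledHeight = max(1, (height * scale).roundToInt())
    return scaledWidth to scaledHeight
}
```

Bounding the longer side bounds the number of image patches the vision encoder sees, which is the overflow the fix is meant to avoid.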

Dependency upgrades

drift 2.30 to 2.32.1, sqlite3_flutter_libs 0.5 to 0.6, drift_flutter 0.2 to 0.3

- Add LiteRtLmPlugin (Kotlin) wrapping official LiteRT-LM API with queue-based inference, download support, and audio PCM conversion
- Add GemmaLocalClient (Dart) with per-request engine init/teardown
- Add GemmaModelManager with on-demand backend init: vision/audio backends only enabled when request contains image/audio content, engine rebuilt (with full teardown) when config changes
- Engine rebuild happens after acquiring inference lock to prevent teardown while another inference is in progress
- Add gemma_local provider type with model list (gemma-4-e2b/e4b)
- Add download UI in model config pages (Android only)
- Skip LLM data sharing consent for on-device models
- asset_analysis_tool: use JPEG + 896px cap for Gemma 4 to avoid LiteRT-LM patch count overflow; non-Gemma path unchanged
- pkm_skill/timeline_card_skill: use state factId as fallback when model-provided fact_id is unreliable
- Upgrade drift 2.30→2.32.1, sqlite3_flutter_libs 0.5→0.6, drift_flutter 0.2→0.3

sparkleMing added 2 commits April 8, 2026 12:29

docs: add Gemma on-device provider to README tables

e5a19f8

sparkleMing wants to merge 2 commits into main from feat/gemma4-litert-lm
