TacitNode 🏗️⚡

A hybrid edge-to-cloud AI copilot for industrial field workers.

TacitNode bridges the "Great Crew Change" knowledge gap by acting as a digital mentor that looks over a junior technician's shoulder. It processes real-time camera feeds on-device for instant, offline-capable guidance — and intelligently escalates complex diagnostics to the cloud when needed.

Screenshots

TacitNode in action: Local inference with green indicators, cloud escalation with amber indicators, real-time metrics dashboard, enhanced debug console, and offline mode support.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     TacitNode App                           │
│                                                             │
│  📷 Camera Feed + 🎯 Demo Controls                          │
│       │                                                     │
│       ▼                                                     │
│  🧠 FunctionGemma (functiongemma-270m - on-device)          │
│       │                                                     │
│       ├── validate_routine_step ──▶ 👁️ Local Vision        │
│       │                             (lfm2-vl-450m)         │
│       │                             ⚡ ~45ms | 168 tok/s    │
│       │                             💾 245 MB RAM          │
│       │                                                     │
│       └── escalate_to_expert ────▶ ☁️ Gemini 2.5 Flash     │
│                                      ~1.2s | $0.0000875    │
│                                                             │
│  📊 Metrics Dashboard (cost savings, latency tracking)      │
│  🖥️ Debug Console (JSON routing decisions + filters)        │
│  ✈️ Offline Mode (local-only inference)                     │
└─────────────────────────────────────────────────────────────┘

Key Features

Core Capabilities

On-device function calling via Cactus Compute — FunctionGemma routes queries at 168 tok/s with ~45ms latency
Intelligent cloud escalation — automatically routes complex diagnostics to Gemini 2.5 Flash when local models can't handle it
Hybrid architecture — 3x cost savings vs pure cloud, with instant local responses and expert cloud analysis
Offline-capable — local inference works without any network connection, critical for remote industrial sites
Camera-first interface — full-screen live feed with glassmorphism overlays, designed for hands-free field use

Demo-Ready Features

Visual routing indicators — animated pulse showing local (green) vs cloud (amber) routing decisions in real-time
Performance metrics — live display of latency, tokens/sec, cost per query, and cumulative savings
Demo control panel — one-tap presets for reliable demos: Quick ID, Diagnose, Offline Test
Metrics dashboard — session statistics showing local vs cloud query distribution and cost comparison
Enhanced debug console — collapsible JSON viewer with syntax highlighting and filter chips (All, Routing, Warnings, Errors)
Offline mode simulation — toggle airplane mode for demos without disconnecting network

Performance Benchmarks

Metric	Local Inference	Cloud Escalation	Improvement
Latency	~45ms	~1,200ms	26x faster
Tokens/sec	168 tok/s	N/A (network bound)	—
Cost per query	$0.00	~$0.0000875	100% savings
Offline capable	✅ Yes	❌ No	—
RAM usage	~245 MB	Minimal	—

Hybrid Architecture Benefits:

3x cost reduction vs pure cloud (with typical 67% local / 33% cloud split)
26x faster for routine identification queries
100% offline capable for local queries
Automatic fallback ensures reliability
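
The 3x and 26x figures follow from the benchmark numbers above. Here is a minimal Dart sketch of that arithmetic, assuming local queries cost nothing and every cloud query costs the quoted ~$0.0000875. The helper names are illustrative and are not taken from the app.

```dart
// Figures from the benchmark table; the traffic mix is the "typical"
// 67% local / 33% cloud split quoted in the list above.
const double localLatencyMs = 45;
const double cloudLatencyMs = 1200;
const double cloudCostPerQuery = 0.0000875; // USD
const double localCostPerQuery = 0.0;

/// Average cost per query when `localShare` of queries stay on-device.
double blendedCost(double localShare) =>
    localShare * localCostPerQuery + (1 - localShare) * cloudCostPerQuery;

/// Average latency per query for the same mix (hypothetical helper).
double blendedLatencyMs(double localShare) =>
    localShare * localLatencyMs + (1 - localShare) * cloudLatencyMs;

void main() {
  const localShare = 0.67;
  final hybrid = blendedCost(localShare);
  final costRatio = cloudCostPerQuery / hybrid;
  final speedup = cloudLatencyMs / localLatencyMs;
  print('hybrid cost per query: \$${hybrid.toStringAsFixed(7)}');
  print('cloud-only / hybrid cost: ${costRatio.toStringAsFixed(2)}x');
  print('local vs cloud latency: ${speedup.toStringAsFixed(1)}x');
}
```

With these inputs the cost ratio comes out at about 3.03x and the latency ratio at about 26.7x, which the list quotes as 3x and 26x. A different local/cloud split changes the cost ratio proportionally.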

Tech Stack

Layer	Technology
Frontend	Flutter (Dart)
Local Routing Model	Cactus SDK (`functiongemma-270m`)
Local Vision Model	Cactus SDK (`lfm2-vl-450m`)
Cloud Fallback	Gemini 2.5 Flash API
Camera	`camera` package
Connectivity	`connectivity_plus`
Secrets	`flutter_dotenv` (`.env` gitignored)

Getting Started

Prerequisites

Flutter SDK ^3.10.1
A physical device with a camera (iOS or Android) for full demo
A Gemini API key for cloud escalation

Setup

# Clone the repo
git clone https://github.com/YOUR_USERNAME/tacit_node.git
cd tacit_node

# Install dependencies
flutter pub get

# Create your .env file
echo 'GEMINI_API_KEY=your_key_here' > .env

# Run on a connected device
flutter run

Note: On first launch, the app downloads the routing model (functiongemma-270m) and the vision model (lfm2-vl-450m), about 700 MB in total. This one-time download requires an internet connection.

Platform Setup

Platform	Required Config
Android	Camera + Internet permissions (pre-configured in `AndroidManifest.xml`)
iOS	`NSCameraUsageDescription` (pre-configured in `Info.plist`)
macOS	Network client entitlement (pre-configured). No camera support — runs in text-only mode.

Project Structure

lib/
├── main.dart                          # App entry, theme, .env loading
├── models/
│   ├── routing_decision.dart          # RoutingDecision with performance metrics
│   ├── session_metrics.dart           # Cumulative session statistics
│   └── demo_preset.dart               # Demo scenario presets
├── screens/
│   └── copilot_screen.dart            # Full-screen camera + overlay UI
├── services/
│   ├── camera_service.dart            # Camera lifecycle, frame capture
│   ├── cloud_service.dart             # Gemini 2.5 Flash API integration
│   ├── copilot_service.dart           # Core orchestrator (Cactus LLM + routing)
│   ├── metrics_service.dart           # Session-wide metrics tracking
│   └── connectivity_service.dart      # Network status + offline simulation
└── widgets/
    ├── debug_console.dart             # Enhanced JSON viewer with filters
    ├── model_status_bar.dart          # Status bar with offline indicator
    ├── routing_indicator.dart         # Animated routing decision display
    ├── metrics_overlay.dart           # Session statistics dashboard
    ├── demo_controls_fab.dart         # Expandable demo preset controls
    └── offline_banner.dart            # Offline mode notification

How Routing Works

TacitNode routes each query through a 7-step pipeline:

1. User asks a question (e.g., "What is this?") while pointing the camera at equipment
2. FunctionGemma analyzes intent (~168 tok/s) and selects the appropriate tool:
   - validate_routine_step → Local identification
   - escalate_to_expert → Cloud diagnosis
   - answer_query → Direct response
3. Visual feedback displays:
   - Green pulse animation: "Analyzing locally..."
   - Amber pulse animation: "Escalating to expert..."
4. Tool execution:
   - Local path: Camera frame → Vision Model (lfm2-vl-450m) → Component ID (~45ms)
   - Cloud path: Frame + query → Gemini 2.5 Flash → Expert analysis (~1.2s)
5. Response card shows:
   - Routing type (⚡ Local or ☁️ Cloud)
   - Performance metrics (latency, tokens/sec, cost)
   - Routing path taken
   - Cost savings for local queries
6. Metrics tracking:
   - All queries logged in session dashboard
   - Cumulative cost comparison (cloud-only vs hybrid)
   - Offline query counter
7. Automatic fallback: if local inference fails, the query escalates to the cloud as a safety net
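
The tool selection and fallback steps above can be sketched as a small dispatcher. This is an illustration only: the three tool names come from the list, while `Handler`, `RoutedAnswer`, `dispatch`, and the `offline` flag are hypothetical and do not reflect the actual code in copilot_service.dart or the Cactus SDK API.

```dart
/// Where a query ended up being answered.
enum Route { local, cloud, direct }

/// Hypothetical handler signature: takes the user query, returns an answer.
typedef Handler = Future<String> Function(String query);

class RoutedAnswer {
  final Route route;
  final String text;
  final Duration latency;
  const RoutedAnswer(this.route, this.text, this.latency);
}

/// Dispatches on the tool name chosen by the routing model.
Future<RoutedAnswer> dispatch({
  required String tool,
  required String query,
  required Handler local,
  required Handler cloud,
  required Handler direct,
  bool offline = false, // assumption: cloud calls are skipped when offline
}) async {
  final watch = Stopwatch()..start();
  RoutedAnswer done(Route route, String text) =>
      RoutedAnswer(route, text, watch.elapsed);

  if (tool == 'answer_query') {
    return done(Route.direct, await direct(query));
  }
  if (tool == 'escalate_to_expert') {
    if (offline) {
      throw StateError('cloud escalation needs a network connection');
    }
    return done(Route.cloud, await cloud(query));
  }
  if (tool == 'validate_routine_step') {
    try {
      return done(Route.local, await local(query));
    } catch (_) {
      // Automatic fallback: local inference failed, so try the cloud
      // unless the device is offline.
      if (offline) rethrow;
      return done(Route.cloud, await cloud(query));
    }
  }
  throw ArgumentError.value(tool, 'tool', 'unknown tool');
}

Future<void> main() async {
  final answer = await dispatch(
    tool: 'validate_routine_step',
    query: 'What is this?',
    local: (q) async => throw Exception('vision model not loaded'),
    cloud: (q) async => 'cloud answer for "$q"',
    direct: (q) async => 'direct answer',
  );
  print('${answer.route.name}: ${answer.text}');
}
```

In the usage example the local handler throws, so the query is answered by the cloud handler. Passing the handlers in as functions keeps the routing logic testable without loading a model or calling a network API.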

Usage Guide

Quick Start with Presets

1. Tap the FAB (the floating action button at the bottom right, with a flask icon)
2. Select a preset:
   - Quick ID (green) → Instant local identification with metrics
   - Diagnose (amber) → Cloud escalation for complex analysis
   - Offline Test (blue) → Simulates airplane mode, local-only inference

Basic Usage

Local inference — Point camera at component (LED, breadboard, Arduino). Type "What is this?" → watch green pulse → instant response with latency/TPS metrics ⚡
Cloud escalation — Type "Why is this circuit failing?" → watch amber pulse → Gemini analysis with cost display ☁️
Offline mode — Tap FAB → Offline Test preset → see offline banner → local inference still works ✈️
Metrics dashboard — Tap FAB → Analytics button → view session stats, cost comparison, savings 📊
Debug console — Expand console at bottom → tap routing entries → view JSON with tool calls 🖥️

Key Features to Explore

Visual routing indicators — Green vs amber pulse animations
Performance metrics — 45ms local vs 1.2s cloud latency
Cost savings — Real-time calculation of hybrid vs cloud-only cost (3x savings)
Offline capability — Works without network connection
Technical depth — JSON viewer showing exact tool calls and routing logic

Features

Visual Enhancements

✅ Animated routing indicators (green pulse for local, amber for cloud)
✅ Performance metrics badges on every response
✅ Glassmorphism UI with smooth transitions
✅ Color-coded routing decisions throughout
✅ High-resolution app icons and splash screens

Interactive Controls

✅ One-tap presets (Quick ID, Diagnose, Offline Test)
✅ Metrics reset button for fresh sessions
✅ Offline mode simulation toggle
✅ Expandable FAB with staggered animations
✅ Mutually exclusive panels (the FAB menu and the metrics dashboard can't both be open)

Metrics & Analytics

✅ Session statistics dashboard
✅ Cost comparison (cloud-only vs hybrid)
✅ Cumulative savings tracker with 5-decimal precision
✅ Offline query counter
✅ Average latency display
✅ Detailed logging for verification
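
As a rough illustration of how the statistics above could be accumulated, here is a hypothetical session tracker. The class and field names are assumptions and are not taken from session_metrics.dart or metrics_service.dart. The cloud price is the per-query figure from the benchmarks.

```dart
/// Hypothetical session accumulator for the dashboard figures.
class SessionStats {
  static const double cloudCostPerQuery = 0.0000875; // USD, from the benchmarks

  int localQueries = 0;
  int cloudQueries = 0;
  int offlineQueries = 0;
  int _totalLatencyMs = 0;

  /// Records one answered query.
  void record({
    required bool local,
    required int latencyMs,
    bool offline = false,
  }) {
    if (local) {
      localQueries++;
    } else {
      cloudQueries++;
    }
    if (offline) offlineQueries++;
    _totalLatencyMs += latencyMs;
  }

  int get totalQueries => localQueries + cloudQueries;

  double get averageLatencyMs =>
      totalQueries == 0 ? 0.0 : _totalLatencyMs / totalQueries;

  /// What the session would have cost if every query had gone to the cloud.
  double get cloudOnlyCost => totalQueries * cloudCostPerQuery;

  /// What the session actually cost: only cloud queries are billed.
  double get hybridCost => cloudQueries * cloudCostPerQuery;

  double get savings => cloudOnlyCost - hybridCost;

  /// Clears all counters, as the reset button would.
  void reset() {
    localQueries = 0;
    cloudQueries = 0;
    offlineQueries = 0;
    _totalLatencyMs = 0;
  }
}

void main() {
  final stats = SessionStats()
    ..record(local: true, latencyMs: 45)
    ..record(local: true, latencyMs: 45, offline: true)
    ..record(local: false, latencyMs: 1200);
  // Five decimals, matching the precision mentioned in the list above.
  print('saved \$${stats.savings.toStringAsFixed(5)}');
  print('average latency ${stats.averageLatencyMs.toStringAsFixed(0)} ms');
}
```

Amounts this small disappear at two decimal places, which is presumably why the savings tracker shows five.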

Developer Tools

✅ Enhanced debug console with JSON viewer
✅ Collapsible routing entries (120px → 336px)
✅ Filter chips (All, Routing, Warnings, Errors)
✅ Syntax-highlighted tool calls
✅ Full observability of routing decisions
✅ Cloud response logging
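
The filter chips and JSON viewer can be pictured with a small sketch. `LogEntry`, `applyFilter`, and the payload keys below are hypothetical and not the app's actual types. Only `JsonEncoder.withIndent` from `dart:convert` is a real library call.

```dart
import 'dart:convert';

/// Filter chips named in the list above.
enum LogFilter { all, routing, warnings, errors }

/// Hypothetical console entry; the app's real model may differ.
class LogEntry {
  final LogFilter category; // routing, warnings, or errors
  final Map<String, Object?> payload;
  const LogEntry(this.category, this.payload);
}

/// Returns the entries that should be visible for the selected chip.
List<LogEntry> applyFilter(List<LogEntry> entries, LogFilter filter) =>
    filter == LogFilter.all
        ? entries
        : entries.where((e) => e.category == filter).toList();

/// Pretty-prints a payload the way an expanded console row might show it.
String prettyJson(Map<String, Object?> payload) =>
    const JsonEncoder.withIndent('  ').convert(payload);

void main() {
  final entries = [
    LogEntry(LogFilter.routing, {
      'tool': 'validate_routine_step',
      'route': 'local',
      'latency_ms': 45,
    }),
    LogEntry(LogFilter.errors, {'message': 'cloud request timed out'}),
  ];
  for (final entry in applyFilter(entries, LogFilter.routing)) {
    print(prettyJson(entry.payload));
  }
}
```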

Troubleshooting

Models not downloading

Ensure internet connection on first launch
Check available storage (~700 MB required)
Models download automatically; progress is shown in the status bar

Camera not working

Grant camera permissions when prompted
On macOS, app runs in text-only mode (no camera support)
Physical device recommended for full demo

Offline mode not working

Tap FAB → Offline Test preset to simulate
Or use device airplane mode
Local models must be downloaded first
Tap offline banner to disable simulation
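
When debugging offline behavior, it can help to spell out the conditions involved. The sketch below is an assumption about how simulated and real offline states might combine. `OfflineState` is a hypothetical class and is not part of connectivity_service.dart.

```dart
/// Hypothetical state holder: the app is treated as offline when either the
/// simulation toggle is on or the device reports no network.
class OfflineState {
  bool simulated;
  bool hasNetwork;
  bool modelsDownloaded;

  OfflineState({
    this.simulated = false,
    this.hasNetwork = true,
    this.modelsDownloaded = false,
  });

  bool get isOffline => simulated || !hasNetwork;

  /// Local inference only needs the models to be on the device.
  bool get canAnswerLocally => modelsDownloaded;

  /// Cloud escalation needs a real, non-simulated connection.
  bool get canEscalate => !isOffline;
}

void main() {
  final state = OfflineState(simulated: true, modelsDownloaded: true);
  print('offline=${state.isOffline} '
      'local=${state.canAnswerLocally} '
      'cloud=${state.canEscalate}');
}
```

The usage example prints `offline=true local=true cloud=false`: with the simulation on and the models already downloaded, local answers keep working while cloud escalation is unavailable.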

Metrics not updating

Tap FAB → Reset Metrics to clear
Check debug console for detailed logs
Verify MetricsService initialization

Cloud escalation failing

Verify Gemini API key in .env file
Check internet connection
Review debug console for error details
The app calls the Gemini 2.5 Flash model, so confirm that your API key has access to it
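
To rule out a bad key independently of the app, a standalone Dart script can send one text-only request to the public Gemini REST endpoint. This is a sketch under assumptions: it reads the key from the `GEMINI_API_KEY` environment variable rather than from .env, the helper names are hypothetical, and the app's cloud_service.dart may call the API differently.

```dart
import 'dart:convert';
import 'dart:io';

/// Builds the JSON body for a text-only generateContent request.
String buildRequestBody(String prompt) => jsonEncode({
      'contents': [
        {
          'parts': [
            {'text': prompt}
          ]
        }
      ]
    });

/// Extracts the first text part from a generateContent response, or null.
String? firstText(Map<String, dynamic> response) {
  final candidates = response['candidates'];
  if (candidates is! List || candidates.isEmpty) return null;
  final parts = candidates.first['content']?['parts'];
  if (parts is! List || parts.isEmpty) return null;
  return parts.first['text'] as String?;
}

Future<void> main() async {
  final key = Platform.environment['GEMINI_API_KEY'];
  if (key == null || key.isEmpty) {
    stderr.writeln('Set the GEMINI_API_KEY environment variable first.');
    exit(64);
  }
  final uri = Uri.parse('https://generativelanguage.googleapis.com'
      '/v1beta/models/gemini-2.5-flash:generateContent');
  final client = HttpClient();
  try {
    final request = await client.postUrl(uri);
    request.headers.contentType = ContentType.json;
    request.headers.set('x-goog-api-key', key);
    request.write(buildRequestBody('Reply with the word ok.'));
    final response = await request.close();
    final body = await response.transform(utf8.decoder).join();
    print('HTTP ${response.statusCode}');
    if (response.statusCode == 200) {
      print(firstText(jsonDecode(body) as Map<String, dynamic>));
    } else {
      print(body); // the error payload usually explains what is wrong
    }
  } finally {
    client.close();
  }
}
```

A 200 status with a short reply suggests the key and network path are fine, which points the investigation back at the app's configuration. Any other status prints the error body returned by the API.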

Technical Documentation

For detailed technical information, see docs/technical_documentation.md. It covers:

Complete development history
All challenges encountered and their solutions
Architecture decisions and rationale
Performance characteristics
Future optimization plans

Contributing

This project demonstrates hybrid edge-to-cloud AI architecture. Key areas for contribution:

Additional presets for different industries
More sophisticated routing logic
Enhanced error handling and retry mechanisms
Additional performance optimizations
Support for more Cactus models
Multi-turn conversation support
Voice control (TTS/STT)

License

MIT License - See LICENSE file for details

Acknowledgments

Built with:

Cactus Compute - On-device AI inference SDK
Google DeepMind - Gemini API and FunctionGemma model
Flutter - Cross-platform framework
connectivity_plus - Network monitoring
Liquid AI - LFM2-VL vision model

Special thanks to the teams at Google DeepMind, Cactus Compute, and Liquid AI for providing the tools to build hybrid AI systems that work anywhere - from the factory floor to remote field sites.

Demonstrating the future of hybrid edge-to-cloud AI systems


ElikplimSunu/tacit_node