AI that enhances your story. Watch keenly. Act thoughtfully. 10x your productivity.
⌘ + ⇧ + H to observe your screen instantly
🌐 Website · 📖 Documentation · 🐛 Report Bug · 💡 Request Feature
Traditional AI waits for your commands. Hawkeye watches and helps proactively.
Hawkeye is an AI-powered desktop assistant that observes your work environment—screen, clipboard, files—and proactively offers intelligent suggestions. No prompts needed.
The AI behind Hawkeye is designed to enhance your own story — turning your screen time into meaningful personal growth by automatically mapping your goals, habits, and progress into a living Life Tree.
| Feature | Copilot / Cursor / Claude Code | Hawkeye |
|---|---|---|
| Mode | Reactive (you ask) | Proactive (it watches) |
| Scope | Code only | Everything: coding, browsing, writing |
| Privacy | Cloud-based | Local-first, your data stays local |
| Control | AI executes | You decide what to execute |
| Platform | Download |
|---|---|
⚠️ macOS: "App is damaged" fix
# Remove quarantine attribute
xattr -cr /Applications/Hawkeye.app

# 1. Clone
git clone https://github.com/tensorboy/hawkeye.git && cd hawkeye
# 2. Install
pnpm install
# 3. Run
pnpm dev

Option 1: Google Gemini (Recommended — free tier)
- Get a free API key at aistudio.google.com/apikey
- Enter your key in Settings → Gemini API Key
- Model defaults to gemini-2.0-flash (1M context window)
Option 2: OpenAI-Compatible API
Works with OpenAI, DeepSeek, Groq, Together AI, or any OpenAI-compatible endpoint.
Set your base URL, API key, and model name in Settings.
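As a sketch, the OpenAI-compatible wire format looks like this in TypeScript (the base URL, key, and model below are placeholders; Hawkeye's internal client may differ):

```typescript
// Hypothetical helper: builds a request for any OpenAI-compatible
// /chat/completions endpoint. All names here are illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

The same request shape works against OpenAI, DeepSeek, Groq, or Together AI by swapping `baseUrl` and `model`.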
Option 3: Local LLM with node-llama-cpp (100% Offline)
Download a GGUF model and set the model path in Settings. Supports Metal GPU acceleration on macOS.
Recommended models:
- Qwen 2.5 7B — general purpose (4.7 GB)
- Llama 3.2 3B — lightweight (2.0 GB)
- LLaVA 1.6 7B — vision support (4.5 GB)
Option 4: Ollama (Legacy)
brew install ollama && ollama pull qwen3:8b

Select "Ollama" in Hawkeye settings.
┌─────────────────────────────────────────────────────────────────┐
│ HAWKEYE ENGINE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ PERCEPTION │───▶│ REASONING │───▶│ EXECUTION │ │
│ │ Engine │ │ Engine │ │ Engine │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ • Screen OCR • Claude/Ollama • Shell Commands │
│ • Clipboard • Task Analysis • File Operations │
│ • File Watch • Intent Detect • App Control │
│ • Window Track • Suggestions • Browser Auto │
│ │
├─────────────────────────────────────────────────────────────────┤
│ INTERFACES │
├───────────────┬───────────────┬───────────────┬─────────────────┤
│ 🖥️ Desktop │ 🧩 VS Code │ 🌐 Chrome │ 📦 Core │
│ (Electron) │ Extension │ Extension │ (npm pkg) │
└───────────────┴───────────────┴───────────────┴─────────────────┘
Hawkeye is evolving into a full multi-modal human-computer interaction system that combines audio understanding, visual perception, and gesture control.
┌─────────────────────────────────────────────────────────────────────────────┐
│ HAWKEYE MULTI-MODAL HCI PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ INPUT LAYER │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ 📷 Camera ────▶ MediaPipe Holistic │ │
│ │ • Face: 468 landmarks │ │
│ │ • Pose: 33 keypoints │ │
│ │ • Hands: 21 × 2 keypoints │ │
│ │ │ │
│ │ 🎙️ Microphone ─▶ Silero VAD ─▶ Audio Buffer │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────┐ ┌──────────────────────────────────┐ │
│ │ VISUAL PROCESSING │ │ AUDIO PROCESSING │ │
│ ├──────────────────────────────┤ ├──────────────────────────────────┤ │
│ │ Face Tracker │ │ DiariZen / Pyannote │ │
│ │ ├─ Multi-face detection │ │ ├─ Speaker diarization │ │
│ │ ├─ Face ID assignment │ │ ├─ "Who is speaking?" │ │
│ │ └─ Lip movement analysis │ │ └─ Speaker embeddings │ │
│ │ │ │ │ │
│ │ Gesture Recognizer │ │ Whisper (smart-whisper) │ │
│ │ ├─ Hand pose classification │ │ ├─ Speech-to-text │ │
│ │ ├─ Dynamic gesture detect │ │ ├─ Language detection │ │
│ │ └─ Custom gesture mapping │ │ └─ Timestamp alignment │ │
│ └──────────────────────────────┘ └──────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ FUSION & MATCHING LAYER │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Audio-Visual Matching │ │
│  │  ├─ Lip-sync correlation (whose lips match the audio?)                │  │
│ │ ├─ Face-voice association (learn speaker identity) │ │
│ │ └─ Active speaker detection (LoCoNet / AS-Net) │ │
│ │ │ │
│ │ Context Aggregation │ │
│ │ ├─ Combine: transcription + speaker ID + face ID + gesture │ │
│ │ └─ Generate unified interaction events │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ ACTION EXECUTION │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Gesture → Command Mapping │ │
│ │ ├─ 👍 Thumbs Up → Confirm action │ │
│ │ ├─ ✋ Open Palm → Pause / Stop │ │
│ │ ├─ 👆 Point Up → Scroll up │ │
│ │ ├─ 👇 Point Down → Scroll down │ │
│ │ ├─ ✌️ Victory → Screenshot │ │
│ │ ├─ 🤏 Pinch → Zoom in/out │ │
│ │ └─ 🖐️ Swipe → Switch window / tab │ │
│ │ │ │
│ │ Voice Command + Gesture = Enhanced Control │ │
│ │ └─ "Open browser" + Point → Open browser at pointed location │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ OUTPUT │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 📝 Attributed Transcription │ │
│ │ "Alice: Let's review the code changes" │ │
│ │ "Bob: I'll share my screen [👆 pointing at screen]" │ │
│ │ │ │
│ │ 🎮 System Control │ │
│ │ Mouse movement, clicks, keyboard shortcuts, app switching │ │
│ │ │ │
│ │ 🌳 Life Tree Update │ │
│ │ Activity tracking, goal inference, habit analysis │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Key Technologies:
| Component | Technology | Status |
|---|---|---|
| Voice Activity Detection | Silero VAD | 📋 Planned |
| Speech-to-Text | Whisper (smart-whisper) | ✅ Implemented |
| Speaker Diarization | DiariZen / Pyannote | 🔄 Research |
| Active Speaker Detection | LoCoNet (CVPR 2024) | 🔄 Research |
| Body Tracking | MediaPipe Holistic | 📋 Planned |
| Gesture Recognition | MediaPipe Gesture | 📋 Planned |
| Face-Voice Matching | Custom Fusion | 🔄 Research |
hawkeye/
├── packages/
│ ├── core/ # 🧠 Core engine (local processing)
│ │ ├── perception/ # Screen, clipboard, file monitoring
│ │ ├── ai/ # AI providers (Claude, Ollama, etc.)
│ │ ├── execution/ # Action execution system
│ │ └── storage/ # Local database (SQLite)
│ │
│ ├── desktop/ # 🖥️ Electron desktop app
│ ├── vscode-extension/ # 🧩 VS Code extension
│ └── chrome-extension/ # 🌐 Chrome browser extension
│
├── docs/ # 📖 Documentation
└── website/ # 🌐 Marketing site
| Aspect | How We Protect You |
|---|---|
| Screenshots | ✅ Analyzed locally, never uploaded |
| Clipboard | ✅ Processed on-device only |
| Files | ✅ Monitored locally, paths never sent |
| AI Calls | ✅ Only minimal context text sent (or use local LLM) |
| Dangerous Ops | ✅ Always requires your confirmation |
📁 All data stored in ~/.hawkeye/ — you own your data.
import { HawkeyeEngine } from '@hawkeye/core';
const engine = new HawkeyeEngine({
provider: 'ollama',
model: 'qwen3:8b'
});
// Get AI-powered suggestions based on current context
const suggestions = await engine.observe();
// Execute a suggestion with user confirmation
await engine.execute(suggestions[0].id);

import { FileWatcher } from '@hawkeye/core';
const watcher = new FileWatcher({
paths: ['~/Downloads', '~/Documents'],
events: ['create', 'move']
});
watcher.on('change', (event) => {
console.log(`${event.type}: ${event.path}`);
});

AI provider calls use exponential backoff with jitter to handle transient failures gracefully, preventing thundering herd effects.
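A minimal sketch of that retry strategy (the base delay, cap, and attempt count here are illustrative, not Hawkeye's actual values):

```typescript
// Exponential backoff with full jitter: the delay window doubles each
// attempt, and the actual wait is a random point inside the window,
// which spreads simultaneous retries apart.
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** attempt); // 500, 1000, 2000, ...
  return Math.random() * windowMs;
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      await new Promise((r) => setTimeout(r, backoffDelay(attempt)));
    }
  }
  throw lastErr;
}
```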
Context history (window titles, clipboard, OCR text) is indexed with SQLite FTS5 for instant fuzzy search across all recorded observations.
The observation interval adjusts dynamically based on user activity — fast polling when active, slow polling when idle — saving CPU and battery.
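A sketch of how such adaptive polling might pick its interval (the thresholds and intervals are illustrative assumptions, not Hawkeye's tuned values):

```typescript
// Map time-since-last-input to a polling interval: fast while the user
// is active, progressively slower as the machine sits idle.
function nextInterval(idleMs: number): number {
  if (idleMs < 30_000) return 1_000;      // active: poll every second
  if (idleMs < 5 * 60_000) return 5_000;  // recently active: relax a bit
  return 30_000;                          // idle: back off to save CPU/battery
}
```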
A priority-based task queue with deduplication ensures that AI requests and plan executions are processed efficiently without duplicate work.
Hawkeye exposes 15+ tools via MCP (Model Context Protocol) for screen perception, window management, file organization, and automation.
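For reference, an MCP tool is declared with a name, a description, and a JSON Schema for its input; the tool below is a hypothetical example, not taken from Hawkeye's actual tool list:

```typescript
// Hypothetical MCP tool declaration following the protocol's
// name / description / inputSchema shape.
const captureScreenTool = {
  name: "capture_screen",
  description: "Capture the current screen and return OCR text",
  inputSchema: {
    type: "object",
    properties: {
      displayId: { type: "number" }, // which monitor to capture
    },
    required: [],
  },
};
```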
An agent monitor enforces cost limits, blocks dangerous operations (e.g. rm -rf /), requires confirmation for risky actions, and supports a sandbox mode.
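A toy sketch of such a guard (the regex patterns are illustrative, not Hawkeye's real blocklist):

```typescript
// Classify a shell command as allow / confirm / block before execution.
// Patterns here are examples only; a real guard needs a broader list.
const BLOCKED = [/\brm\s+-rf\s+\/(?:\s|$)/, /\bmkfs\b/, /\bdd\s+.*of=\/dev\//];
const NEEDS_CONFIRM = [/\brm\s+-rf\b/, /\bgit\s+push\s+--force\b/, /\bsudo\b/];

type Verdict = "allow" | "confirm" | "block";

function classifyCommand(cmd: string): Verdict {
  if (BLOCKED.some((re) => re.test(cmd))) return "block";       // never run
  if (NEEDS_CONFIRM.some((re) => re.test(cmd))) return "confirm"; // ask first
  return "allow";
}
```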
A macOS-style popover panel accessible from the system tray provides quick actions, recent activity feed, and real-time module status indicators.
All AI providers declare their capabilities (chat, vision, streaming, function calling), enabling intelligent routing and health monitoring across providers.
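Capability-based routing can be sketched as follows (the type and function names are assumptions, not the actual `@hawkeye/core` API):

```typescript
// Pick the first healthy provider that advertises every needed capability.
type Capability = "chat" | "vision" | "streaming" | "functionCalling";

interface Provider {
  name: string;
  capabilities: Set<Capability>;
  healthy: boolean; // maintained by periodic health checks
}

function route(providers: Provider[], needed: Capability[]): Provider | undefined {
  return providers.find(
    (p) => p.healthy && needed.every((c) => p.capabilities.has(c)),
  );
}
```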
- Core perception engine
- Desktop app (Electron)
- VS Code extension
- Chrome extension
- Local LLM support (Ollama, node-llama-cpp)
- Multi-provider AI (Gemini, OpenAI-compatible, LlamaCpp)
- Provider unified protocol with capability routing
- Streaming and health check support
- SQLite FTS5 full-text search
- Exponential backoff retry strategy
- Adaptive refresh rate
- Priority task queue
- MCP Server with 15+ tools
- Safety guardrails and agent monitoring
- Menu bar panel (macOS-style popover)
- Life Tree — AI maps your life journey and enhances your story
- Desktop ↔ Extension real-time sync
- Plugin system
- Custom workflow builder
- Mobile companion app
Contributions are what make the open source community amazing! Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
Distributed under the MIT License. See LICENSE for more information.
