A clean, modular-monolith desktop AI assistant with voice input, a chat interface, and powerful tools.
- Install dependencies:
  setup.bat                        # Windows
  pip install -r requirements.txt  # Or manually
- Run the agent:
  run.bat        # Windows
  python run.py  # Or directly
This is a modular monolith - a single-process application with clean module boundaries:
src/
├── app.py  # Main application entry point
├── config/  # Configuration management
│   ├── settings.py  # YAML-based settings with Pydantic
│   ├── agent_config.py  # Agent API parameters
│   └── prompts.py  # System prompts
├── core/  # Core agent logic
│   └── agent.py  # AI agent with streaming & tools
├── storage/  # Data persistence
│   ├── secure.py  # Encrypted storage (keyring)
│   ├── chat_history.py  # Conversation persistence
│   └── memory.py  # User memories
├── tools/  # Agent capabilities
│   ├── memory.py  # Memory management
│   ├── todos.py  # Task management
│   ├── filesystem.py  # File operations
│   ├── terminal.py  # Command execution
│   └── ...  # More tools
├── services/  # In-process services
│   ├── transcribe.py  # Voice-to-text
│   └── tts.py  # Text-to-speech
└── ui/  # PyQt6 interface
    ├── widget.py  # Floating widget
    └── components/  # Reusable UI parts
        ├── chat_window.py  # Chat interface
        ├── settings_window.py  # Settings dialog
        ├── multiline_input.py  # Text input widget
        ├── screenshot_selector.py  # Screenshot tool
        └── chat_history_json_window.py  # History viewer
- Long-press (1s) to record
- Auto-transcribe using OpenAI Whisper
- Multi-language support (en, ro, ru, de, fr, es)
- Type or speak your messages
- Real-time streaming responses
- File drag-and-drop attachment
- Screenshot sharing (up to 5)
- Syntax-highlighted code blocks with Pygments
- Encrypted, persistent history
- Token usage tracking
- Stop generation at any time
- Copy, edit, or delete user messages
- Memory: Remember user preferences
- Todos: Task management
- Files: Read, write, search, edit
- Terminal: Run commands
- Documents: Create Word files
- Charts: Generate visualizations
- Web: Search and browse
- Images: AI image generation
- Set your API key:
  - Right-click widget → Settings
  - Enter API token and base URL
  - Saved securely in OS keyring
- Customize: Edit config.yaml and prompts/system_prompt.md
# config.yaml
agent_name: Djasha
api:
  base_url: 'https://api.openai.com/v1'
agent:
  model_name: gpt-5.1
  system_prompt_path: prompts/system_prompt.md
ui:
  theme: dark
tools:
  enabled_tools: [memory, todos, filesystem, ...]

- src/ - All application code
- config.yaml - User configuration
- requirements.txt - Python dependencies
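To illustrate how the keys in config.yaml might map onto typed settings, here is a minimal sketch. It is not the project's settings.py: it uses stdlib dataclasses in place of Pydantic and starts from an already-parsed mapping rather than reading the YAML file. The class names, the settings_from_dict helper, and the fallback defaults are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSettings:
    # Mirrors the "agent:" section of config.yaml.
    model_name: str
    system_prompt_path: str


@dataclass
class Settings:
    # Hypothetical container; the real settings.py uses Pydantic models.
    agent_name: str
    base_url: str
    agent: AgentSettings
    theme: str = "dark"  # assumed default, not confirmed by the project
    enabled_tools: list[str] = field(default_factory=list)


def settings_from_dict(raw: dict) -> Settings:
    """Build Settings from the mapping a YAML parser would return."""
    return Settings(
        agent_name=raw["agent_name"],
        base_url=raw["api"]["base_url"],
        agent=AgentSettings(**raw["agent"]),
        theme=raw.get("ui", {}).get("theme", "dark"),
        enabled_tools=list(raw.get("tools", {}).get("enabled_tools", [])),
    )


# The same values as the config.yaml example, written as a parsed mapping.
raw = {
    "agent_name": "Djasha",
    "api": {"base_url": "https://api.openai.com/v1"},
    "agent": {
        "model_name": "gpt-5.1",
        "system_prompt_path": "prompts/system_prompt.md",
    },
    "ui": {"theme": "dark"},
    "tools": {"enabled_tools": ["memory", "todos", "filesystem"]},
}

settings = settings_from_dict(raw)
print(settings.agent.model_name)  # gpt-5.1
```

Missing optional sections (ui, tools) fall back to defaults in this sketch, while a missing required key raises KeyError.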
- Create a new file in src/tools/
- Define a class with schema property and run() method
- Register in src/tools/__init__.py
- Add to get_default_tools() function
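The steps above could look roughly like the following sketch. Only the schema property, the run() method, and the get_default_tools() name come from this README. WordCountTool, the OpenAI-style function-calling layout of the schema, and the shape of the return value are assumptions; check an existing tool in src/tools/ for the conventions the project actually uses.

```python
from typing import Any


class WordCountTool:
    """Hypothetical example tool: counts the words in a piece of text."""

    @property
    def schema(self) -> dict[str, Any]:
        # Assumed layout: an OpenAI-style function-calling description.
        return {
            "type": "function",
            "name": "word_count",
            "description": "Count the words in a text string.",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Text to count."}
                },
                "required": ["text"],
            },
        }

    def run(self, text: str) -> dict[str, Any]:
        # Assumed return shape; the real tools may use a different format.
        return {"status": "success", "word_count": len(text.split())}


def get_default_tools() -> list[Any]:
    # Stand-in for the real function in src/tools/__init__.py,
    # which would also return the project's built-in tools.
    return [WordCountTool()]


tool = get_default_tools()[0]
print(tool.schema["name"], tool.run(text="hello desktop agent"))
```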
- Components are in src/ui/components/
- Main widget in src/ui/widget.py
- Styles are inline using PyQt6 stylesheets
- Chat history and memories are stored encrypted at:
  - %APPDATA%/ai-agent-desktop/chat_history.enc
  - %APPDATA%/ai-agent-desktop/memories.enc
- Encryption key (data_key) is stored in Windows Credential Manager:
  - Service: ai-agent-desktop/data_key, Username: data_key
- API Token saved from Settings is stored under:
  - Service: ai-agent-desktop/api_token, Username: api_token
- See secure_storage/README.md for details.
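As a rough illustration of the layout above, this sketch resolves the encrypted file paths and the credential names for a given APPDATA value. It is not the project's storage code: the helper names are hypothetical, and the credential lookup is passed in as a callable so the example runs without extra packages. With the keyring package installed, keyring.get_password(service, username) could be supplied as that callable.

```python
from pathlib import PureWindowsPath
from typing import Callable, Mapping, Optional

APP_NAME = "ai-agent-desktop"


def data_file(env: Mapping[str, str], filename: str) -> PureWindowsPath:
    # Resolves %APPDATA%/ai-agent-desktop/<filename>.
    return PureWindowsPath(env["APPDATA"]) / APP_NAME / filename


def credential_target(name: str) -> tuple[str, str]:
    # Service and username as listed above, e.g.
    # ("ai-agent-desktop/data_key", "data_key").
    return (f"{APP_NAME}/{name}", name)


def read_secret(
    name: str,
    get_password: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    # get_password is injected; keyring.get_password has this signature.
    service, username = credential_target(name)
    return get_password(service, username)


# Usage with a fake in-memory credential store (for illustration only).
fake_store = {("ai-agent-desktop/api_token", "api_token"): "example-token"}
env = {"APPDATA": r"C:\Users\me\AppData\Roaming"}

print(data_file(env, "chat_history.enc"))
print(read_secret("api_token", lambda s, u: fake_store.get((s, u))))
```

Injecting the lookup keeps the path and naming logic testable without touching the real Windows Credential Manager.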
Contributions are welcome! Feel free to:
- Report bugs or issues
- Suggest new features
- Submit pull requests
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
destorted93
- GitHub: @destorted93
- Repository: ai-agent-desktop