These rules MUST be followed for every feature, bugfix, and code change. They are non-negotiable and are critical for project health and AI-assist reliability.
Documentation & Tracking
- `plan.md`, `architectureCODER_X_maintenance.md`, `requirements.txt`, and all other project, bug tracking, and feature tracking files MUST be kept up to date with every significant change or success.
- When a feature or bug is completed, immediately update all relevant documentation and remove obsolete/completed issues from tracking files.
- Never leave documentation or tracking out of sync with the actual codebase.
Testing Discipline
- Every new feature, bugfix, or refactor MUST include at least one relevant unit test.
- Unit tests must always exercise real, production code paths (not just mocks or stubs).
- After any change, run all unit tests to ensure nothing is broken.
- Do not proceed to the next step until all tests are passing at 100%.
Commit & Push Policy
- Once all tests are passing after a change, immediately commit all changes locally and push to the remote repository.
- Never leave uncommitted changes or failing tests in the working directory.
Goal:
Develop a Python-based, agentic coding assistant modeled after Anthropic's Claude Code. This tool will provide natural language coding assistance, codebase understanding, and workflow automation directly from the terminal. It will support both local and remote models (e.g., Ollama), allow user selection and configuration of models, support the Model Context Protocol (MCP), and enable flexible storage of models on any user-selected drive.
Approach:
- Use Python with CLI for the backend.
- Implement a CLI and interactive shell for user interaction.
- Support model selection (local/remote) and configuration.
- Integrate with MCP for context/memory sharing.
- Allow secure key management.
- Enable model storage on any user-selected location (internal or external drives).
- Keep the codebase modular, extensible, and well documented.
- Coder-X now uses `app/key_encryption.py` for secure CLI/API key storage.
- All secrets are encrypted using PBKDF2HMAC key derivation and Fernet (AES) encryption.
- Encrypted secrets are stored in `~/.coder_x_key.enc`.
- If the passphrase is lost, keys cannot be recovered (must re-create).
- Comprehensive unit tests cover encryption, decryption, file handling, and error cases.
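The key-derivation step named above can be sketched with the standard library alone. This is a minimal illustration, not the contents of `app/key_encryption.py`: the SHA-256 hash, the iteration count, the salt size, and the function names are all assumptions, and `hashlib.pbkdf2_hmac` stands in for the `PBKDF2HMAC` class from the `cryptography` package. The result is a 32-byte key in URL-safe base64, which is the key format Fernet expects.

```python
import base64
import hashlib
import os

# Assumed work factor; the plan does not state the value Coder-X uses.
PBKDF2_ITERATIONS = 600_000


def new_salt(size: int = 16) -> bytes:
    """Generate a random salt. It has to be stored alongside the ciphertext."""
    return os.urandom(size)


def derive_fernet_key(
    passphrase: str, salt: bytes, iterations: int = PBKDF2_ITERATIONS
) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and return it
    URL-safe base64 encoded (hypothetical helper name)."""
    raw = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
    return base64.urlsafe_b64encode(raw)
```

Because the key is derived from the passphrase and salt only, the same inputs always reproduce the same key, and a lost passphrase leaves nothing to recover the key from, which matches the note above.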
Current Status:
- Coder-X implements the core features of Claude Code: interactive CLI, one-off queries, scriptable output, unified configuration, model selection, session/history management, and secure key management.
- Documentation, onboarding, and test coverage are strong.
Notable Parity Achievements:
- Interactive REPL and scripting
- Unified config management (Pydantic, Typer)
- Model selection/configuration (Ollama/local/remote)
- Session/history and secure key management
- Modular, extensible subsystems
- Project initialization and onboarding docs
Partial or Missing Features (Compared to Claude Code):
- Slash command system (e.g., /clear, /doctor, /help, etc.)
- Full Model Context Protocol (MCP) integration
- Usage/cost tracking and reporting
- Health check/doctor command
- Bug reporting from CLI
- Project initialization as a command (/init)
- Memory file editing via CLI
- PR/comments/code review integration
- Terminal/editor integration (key bindings, vim mode)
- Auto-update mechanism
To match the full feature set and experience of Claude Code, prioritize:
- Implementing a robust slash command system for in-session runtime actions.
- Completing MCP integration for project/user/global memory sharing.
- Adding usage/cost tracking and reporting.
- Introducing a health check/doctor CLI command.
- Enabling bug reporting from the CLI.
- Creating a project initialization command (/init) for onboarding.
- Supporting memory file editing through the CLI.
- Integrating with PR/comments and code review tools.
- Adding terminal/editor integration features.
- Providing an update mechanism for Coder-X.
Summary:
Coder-X is well on its way to achieving parity with Anthropic's Claude Code, with all core features implemented and a clear roadmap for closing remaining gaps. This plan ensures Coder-X remains robust, user-friendly, and competitive as an agentic coding assistant.
- CLI and Interactive Shell: Natural language and slash command interface.
- Model Management: List, select, load, and switch models (local/remote).
- Model Storage Location: User can specify any directory for local model storage.
- Key Management: Secure storage, update, and retrieval of keys.
- MCP Protocol Support: Integration with MCP servers for context and memory.
- File Operations: Edit, explain, test, and lint code files.
- Shell Command Integration: Securely run shell commands from within the tool.
- Configuration Management: Load/save config files, support all settings via CLI and shell.
- Session/History Management: Track, display, clear, and export conversations and actions.
- User Management: Show user info, manage authentication.
- Feedback/Telemetry: Collect/send feedback, optionally send telemetry.
- Third-Party Integrations: Connect to VCS, cloud, or other services.
- CLI & Interactive Shell
- Model Management
- Model Storage Location Management
- Key Management
- MCP Integration
- File Operations
- Shell Integration
- Configuration Management
- Session/History Management
- User Management
- Feedback/Telemetry
- Third-Party Integrations
- Once tests pass, clean up the test code and update documentation as needed.
- All tests must exercise real, production code paths (not just mocks or stubs).
- Features must work the same way at runtime and in tests—tests should not bypass or mock out core logic.
- Configuration loading and saving must be unified: the same config file must work identically in both runtime and test contexts.
- If the config loader or CLI/API code falls back to defaults or overwrites fields, this must be refactored so that valid files are always honored as-is.
- Coverage and test quality are prioritized over simply "passing" tests.
All core features (model management, storage, shell integration, config, session/history, user management, MCP, key management, file operations) are now implemented, tested, and documented. Unit tests (see test_cli_new.py and other current tests) cover all validation and error-handling logic. Documentation is up to date in plan.md and architectureCODER_X_maintenance.md. Legacy and redundant tests have been removed for clarity.
- Implement CLI entrypoint (Typer-based)
- Implement interactive shell (prompt_toolkit)
- Add command parsing for CLI and slash commands
- CLI command parsing (Typer)
- Slash command support (via interactive shell)
- Implement help/version/config display
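As an illustration of slash command parsing inside the interactive shell, here is a small dispatcher sketch. The class name, handler signature, and reply strings are hypothetical; only the `/help` and `/clear` command names come from this plan, and the real shell is built on prompt_toolkit and Typer rather than on this code.

```python
from typing import Callable, Dict, List, Optional

# A handler receives the arguments after the command name and returns text.
SlashHandler = Callable[[List[str]], str]


class SlashCommandRegistry:
    """Maps slash command names to handlers (hypothetical helper)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, SlashHandler] = {}

    def register(self, name: str) -> Callable[[SlashHandler], SlashHandler]:
        def decorator(handler: SlashHandler) -> SlashHandler:
            self._handlers[name] = handler
            return handler

        return decorator

    def names(self) -> List[str]:
        return sorted(self._handlers)

    def dispatch(self, line: str) -> Optional[str]:
        """Return the handler output, or None if the line is not a slash command."""
        stripped = line.strip()
        if not stripped.startswith("/"):
            return None  # natural-language input goes to the model instead
        parts = stripped[1:].split()
        if not parts or parts[0] not in self._handlers:
            return f"Unknown command: {stripped}"
        return self._handlers[parts[0]](parts[1:])


registry = SlashCommandRegistry()


@registry.register("help")
def _help(args: List[str]) -> str:
    return "Available commands: " + ", ".join("/" + n for n in registry.names())


@registry.register("clear")
def _clear(args: List[str]) -> str:
    return "History cleared."
```

Returning `None` for input that does not start with `/` lets the shell loop decide whether to forward the line to the model.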
- List available models (local/remote)
- Select model
- Load/unload models dynamically (Ollama backend integration, real CLI commands: `model load/unload`, volume selection)
- Integrate with Ollama for local models
- Integrate with Anthropic API for Claude models (simulated/stubbed)
- Support user-supplied keys for remote models
- Add `model_storage_path` to config schema
- Implement CLI flag (`--model-storage-path <path>`) and slash command (`/model-storage-path <path>`)
- Validate user-supplied path (exists, writable, sufficient space)
- Update model loading/saving logic to use storage path
- Handle errors for unavailable/disconnected drives
- Document feature in help/config
Note: Validation for existence, writability, and free space is implemented and tested. Error handling for unavailable/disconnected drives is in place. Further improvements to output/UX are deferred until all basic features are complete.
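The three checks named above (existence, writability, free space) could look roughly like the following. The function name, exception class, and `min_free_bytes` parameter are assumptions made for illustration; the project's actual validation code may be organized differently.

```python
import os
import shutil
from pathlib import Path


class StoragePathError(ValueError):
    """Raised when a model storage path fails validation (hypothetical name)."""


def validate_model_storage_path(path: str, min_free_bytes: int = 0) -> Path:
    """Return the path if it is an existing, writable directory with enough space."""
    resolved = Path(path).expanduser()
    # A disconnected or unmounted drive shows up here as a missing directory.
    if not resolved.is_dir():
        raise StoragePathError(f"{resolved} does not exist or is not a directory")
    if not os.access(resolved, os.W_OK):
        raise StoragePathError(f"{resolved} is not writable")
    free = shutil.disk_usage(resolved).free
    if free < min_free_bytes:
        raise StoragePathError(
            f"{resolved} has {free} bytes free, {min_free_bytes} required"
        )
    return resolved
```

Raising a single exception type keeps the CLI flag and the slash command on the same error path.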
- Securely prompt for/store keys (unit tests written)
- Encrypt keys at rest (Fernet, unit tests written)
- Retrieve and use keys for model requests (unit tests written)
- Allow updating/removing keys (unit tests written)
- Integrate MCP protocol via official Python library (unit tests written)
- Connect to MCP servers (configurable endpoint, unit tests written)
- Fetch/send context and memory to MCP server (unit tests written)
- Expose MCP settings in config and slash commands (unit tests written)
- Integration test now uses a real public endpoint in a passive, read-only way and skips if unavailable, ensuring robust CI.
- Open, read, write, edit files
- Explain code in a file
- Run tests (pytest/unittest)
- Lint code (flake8/pylint)
- Track file changes/edits in session history
- Run shell commands securely
- Capture/display output/errors
- Block dangerous commands by default; prompt user for approval if such a command is requested (implemented and tested).
Note: Output/UX improvements for shell integration deferred until all basic features are complete.
- Restrict dangerous commands (configurable)
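A rough sketch of the block-then-ask behaviour described above follows. The denylist contents, the metacharacter check, and all names are assumptions for illustration and are not taken from `app/shell_integration.py`; a real policy would be loaded from configuration.

```python
import os
import shlex
import subprocess
from enum import Enum
from typing import Callable, FrozenSet, Optional

# Hypothetical default denylist; the plan says the real list is configurable.
DEFAULT_DANGEROUS: FrozenSet[str] = frozenset(
    {"rm", "dd", "mkfs", "shutdown", "reboot", "sudo"}
)
# Characters that chain, substitute, or redirect commands in a shell.
SHELL_METACHARACTERS = frozenset(";|&`$<>")


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"


def classify_command(
    command: str, dangerous: FrozenSet[str] = DEFAULT_DANGEROUS
) -> Verdict:
    """Classify a command line. Raises ValueError for empty or unparsable input."""
    tokens = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not tokens:
        raise ValueError("empty command")
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return Verdict.NEEDS_APPROVAL
    if os.path.basename(tokens[0]) in dangerous:
        return Verdict.NEEDS_APPROVAL
    return Verdict.ALLOW


def run_command(
    command: str, approve: Callable[[str], bool]
) -> Optional[subprocess.CompletedProcess]:
    """Run the command without a shell, asking `approve` first when it is risky.

    Returns None when the user declines.
    """
    if classify_command(command) is Verdict.NEEDS_APPROVAL and not approve(command):
        return None
    return subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=False
    )
```

Passing the approval prompt in as a callable keeps the policy testable without user interaction: a test can supply `lambda command: False` and assert that nothing ran.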
- Load/save config from file (JSON supported; YAML support planned; unit tests written)
- Refactored config subsystem: now uses Pydantic V2 schema for robust validation and type safety.
- CLI config commands (show, set, unset, setup) implemented as Typer group, all output is structured JSON for scripting and testability.
- Robust config loading: handles empty/invalid files, always returns a valid config object.
- All config commands and edge cases are covered by new tests in tests/test_config_cli.py.
- Documentation and architecture updated to reflect new design.
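The loading rule above (empty or invalid files yield a valid default config, valid files are honored as-is) can be illustrated with a small sketch. Note the swap: the project uses a Pydantic V2 schema, while this example uses a standard-library dataclass to stay dependency-free, so it performs no type validation. The `model` field and both function names are hypothetical; `model_storage_path` is the field named in this plan.

```python
import json
from dataclasses import asdict, dataclass, fields
from pathlib import Path
from typing import Optional


@dataclass
class Config:
    model: str = "default"  # hypothetical field
    model_storage_path: Optional[str] = None  # field named in the plan


def load_config(path: Path) -> Config:
    """Always return a valid Config; fall back to defaults only when the
    file is missing, empty, or not a JSON object."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return Config()
    if not text.strip():
        return Config()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return Config()
    if not isinstance(data, dict):
        return Config()
    known = {f.name for f in fields(Config)}
    # Values from a valid file are used as-is; unknown keys are ignored.
    return Config(**{k: v for k, v in data.items() if k in known})


def save_config(config: Config, path: Path) -> None:
    path.write_text(json.dumps(asdict(config), indent=2), encoding="utf-8")
```

Because runtime code and tests would both call the same `load_config`, a file written in a test round-trips exactly as it would in production.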
- Store conversation/command history
- Allow user to view, clear, export history
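A compact sketch of the store/view/clear/export operations listed above is shown here. The class layout, the `kind` values, and the JSON export format are assumptions; `app/session_history.py` may structure this differently.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import List


@dataclass
class HistoryEntry:
    kind: str  # assumed values: "conversation" or "command"
    content: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class SessionHistory:
    """In-memory history with JSON export (hypothetical shape)."""

    def __init__(self) -> None:
        self._entries: List[HistoryEntry] = []

    def add(self, kind: str, content: str) -> None:
        self._entries.append(HistoryEntry(kind=kind, content=content))

    def view(self) -> List[HistoryEntry]:
        return list(self._entries)  # copy, so callers cannot mutate the history

    def clear(self) -> None:
        self._entries.clear()

    def export(self, path: Path) -> None:
        payload = [asdict(entry) for entry in self._entries]
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```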
- Display current user info
- Support login/logout for remote APIs
- UserManager is now a singleton for correct session state persistence (login/logout tests pass)
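To show why a singleton fixes login/logout state persistence, here is one common way to write it in Python. Only the `UserManager` name comes from this plan; the methods and the `__new__`-based approach are assumptions about how it might be implemented.

```python
from typing import Optional


class UserManager:
    """Every call to UserManager() returns the same shared instance."""

    _instance: Optional["UserManager"] = None

    def __new__(cls) -> "UserManager":
        if cls._instance is None:
            instance = super().__new__(cls)
            instance._current_user = None  # state is initialised exactly once
            cls._instance = instance
        return cls._instance

    def login(self, username: str) -> None:
        self._current_user = username

    def logout(self) -> None:
        self._current_user = None

    @property
    def current_user(self) -> Optional[str]:
        return self._current_user
```

With this shape, a login performed through one `UserManager()` call is visible through any later call, which is the behaviour the login/logout tests rely on. Tests that need a clean slate should call `logout()` in their teardown, since the shared state outlives each test.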
- All telemetry and feedback code and tests fully removed as required by user plan (no data is collected or sent)
- Connect to external services (VCS, cloud, etc.)
- Expose via `/connect` command
- Use Ollama for local models; fall back to `llama-cpp-python` or `transformers` if needed.
- Secure, encrypted key storage.
- Model storage location is user-configurable and can be any attached drive.
- All config changes available via CLI, shell, and config file.
- Modular code for easy extension and maintenance.
- Every implementation step must include at least one unit test.
- After each step and its associated test(s) are implemented, all unit tests must be run to:
- Verify the new functionality is correct.
- Ensure no existing functionality is broken (non-regression).
- The project will not proceed to the next step until all tests pass.
- Before running any test that downloads or uses a local model, always prompt the user to select which volume to use for model storage to avoid disk space issues or using the wrong drive.
- Defer the implementation and testing of local model downloading/storage until after remote/free/no-API-key model functionality is implemented and tested successfully.
- For each section and step (and its unit test(s)), an accurate section must be added to architectureCODER_X_maintenance.md describing how the step is accomplished. This document must be updated at the same rate as steps are implemented and tested.
- All code stubs (placeholders, stub functions/classes, and TODOs) must be fully implemented and tested before the project is considered complete. Stub removal is a tracked deliverable.
- CLI entrypoint (Typer-based, Coder-X branding, unit tests written)
- Telemetry/feedback code and tests fully removed per user requirement
- MCP integration test made robust, passive, and read-only for public endpoint
- User management tests fixed by enforcing singleton state for UserManager
- Interactive shell (migrating to new CLI structure, unit tests pending)
- Model management
- Key management
  - Created `app/key_management.py` for secure key storage, retrieval, and removal (with placeholder encryption).
  - Stub removal required: All placeholder encryption and stubbed methods must be fully implemented and tested before completion.
- MCP integration
  - Created `app/mcp_integration.py` for connecting to MCP servers, fetching and saving context/memory.
  - Stub removal required: All stubbed methods must be fully implemented and tested before completion.
- File operations
  - Created `app/file_operations.py` for reading, writing, appending, explaining, testing, and linting files (with stubs for model integration).
  - Stub removal required: All stubs for model integration must be implemented and tested before completion.
- Shell integration
  - Created `app/shell_integration.py` for running shell commands securely, with a configurable allowlist of safe commands.
- Config management (CLI editing and persistence, YAML/JSON selection, unit tests pending)
- History/session management
  - Created `app/session_history.py` for storing, viewing, clearing, and exporting conversation and command history.
- User management
  - Created `app/user_management.py` for displaying user info and stubs for login/logout (for remote APIs or future expansion).
  - Stub removal required: Login/logout stubs must be fully implemented and tested before completion.
- Feedback/telemetry
  - Created `app/feedback_telemetry.py` for collecting user feedback and stubs for telemetry (opt-in, to be implemented).
  - Stub removal required: Telemetry stubs must be fully implemented and tested before completion.
- Third-party integrations
  - Created `app/third_party_integrations.py` for connecting to external services (VCS, cloud, etc.), with connect/disconnect/list stubs.
  - Stub removal required: All stubbed methods must be fully implemented and tested before completion.
- All integration/system tests (including those that generate, write, or execute code) must run in a temporary, isolated directory. Never run such tests in the user's real project, codebase, or home folder.
- This prevents accidental file creation, overwriting, or clutter in real user directories and is required for safety and reproducibility.
- Integration tests should always clean up after themselves and verify no files are left behind.
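One way to meet the isolation rules above is a context manager that moves the test into a throwaway directory and restores the original working directory afterwards. The helper name and directory prefix are hypothetical; pytest users could get a similar effect from the built-in `tmp_path` fixture.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Iterator


@contextlib.contextmanager
def isolated_workspace() -> Iterator[Path]:
    """Run the enclosed block inside a temporary directory that is deleted on exit."""
    previous = Path.cwd()
    with tempfile.TemporaryDirectory(prefix="coder_x_test_") as tmp:
        os.chdir(tmp)
        try:
            yield Path(tmp)
        finally:
            # Leave the directory before it is removed, even if the test failed.
            os.chdir(previous)
```

A test would wrap any code that generates, writes, or executes files in `with isolated_workspace() as workspace:` and can assert afterwards that `workspace` no longer exists.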
- Add 'rich' as a dependency for console UI.
- Refactor CLI to render all input/output inside a persistent frame using rich.
- Refactor interactive shell to use rich framing for all prompts and outputs.
- Add integration/system tests to ensure all shell/CLI output and input is always inside a frame.
- Document the framing behavior and test requirements in architectureCODER_X_maintenance.md.
- Mark as required for all future features and output improvements.
- Implement and test telemetry (remove stubs in `app/feedback_telemetry.py`)
- Implement and test third-party integrations (remove stubs in `app/third_party_integrations.py`)
- Final audit: Remove all stubs, placeholders, and TODOs; ensure 100% implementation and test coverage
- Plan and begin next-phase improvements (output/UX, new features, etc.)