116 changes: 114 additions & 2 deletions codebase_rag/mcp/tools.py
@@ -4,26 +4,37 @@
"""

import itertools
import sys
from collections.abc import Callable
from dataclasses import dataclass
from pathlib import Path
from typing import Any, cast

from loguru import logger
from rich.console import Console

from codebase_rag.graph_updater import GraphUpdater
from codebase_rag.parser_loader import load_parsers
from codebase_rag.services.graph_service import MemgraphIngestor
-from codebase_rag.services.llm import CypherGenerator
+from codebase_rag.services.llm import CypherGenerator, create_rag_orchestrator
from codebase_rag.tools.code_retrieval import CodeRetriever, create_code_retrieval_tool
from codebase_rag.tools.codebase_query import create_query_tool
from codebase_rag.tools.directory_lister import (
DirectoryLister,
create_directory_lister_tool,
)
from codebase_rag.tools.document_analyzer import (
DocumentAnalyzer,
create_document_analyzer_tool,
)
from codebase_rag.tools.file_editor import FileEditor, create_file_editor_tool
from codebase_rag.tools.file_reader import FileReader, create_file_reader_tool
from codebase_rag.tools.file_writer import FileWriter, create_file_writer_tool
from codebase_rag.tools.semantic_search import (
create_get_function_source_tool,
create_semantic_search_tool,
)
from codebase_rag.tools.shell_command import ShellCommander, create_shell_command_tool


@dataclass
@@ -66,10 +77,14 @@ def __init__(
self.file_reader = FileReader(project_root=project_root)
self.file_writer = FileWriter(project_root=project_root)
self.directory_lister = DirectoryLister(project_root=project_root)
self.shell_commander = ShellCommander(project_root=project_root)
Contributor comment (security-high):

The ShellCommander tool (initialized here) allows arbitrary file access. While it checks that the command name is in an allowlist (e.g., cat, ls), it does not validate or sanitize the arguments. An attacker can provide absolute paths as arguments to read or list files outside the project root (e.g., execute_shell_command(command="cat /etc/passwd")).
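A minimal sketch of the argument validation this comment asks for, assuming a Python 3.9+ runtime; `validate_args` and its flag-skipping heuristic are illustrative, not part of the actual ShellCommander API:

```python
# Illustrative sketch, not the real ShellCommander API: reject any
# path-like argument that resolves outside the project root before
# the allowlisted command is executed.
from pathlib import Path


def validate_args(project_root: Path, args: list[str]) -> None:
    """Raise ValueError if any non-flag argument escapes project_root."""
    root = project_root.resolve()
    for arg in args:
        if arg.startswith("-"):
            continue  # treat option flags (-l, --all) as non-paths
        p = Path(arg)
        resolved = p.resolve() if p.is_absolute() else (root / p).resolve()
        # Path.is_relative_to requires Python 3.9+
        if not resolved.is_relative_to(root):
            raise ValueError(f"argument escapes project root: {arg}")
```

With a check like this in place, execute_shell_command(command="cat /etc/passwd") would be rejected before any subprocess is spawned.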

self.document_analyzer = DocumentAnalyzer(project_root=project_root)
Contributor comment (security-high):

The DocumentAnalyzer tool (initialized here) contains a path traversal vulnerability in its analyze method. It explicitly handles absolute paths by copying the target file to a .tmp directory without verifying that the path is within the project root. This allows an attacker to read any file on the system that the process has access to by providing an absolute path.
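A sketch of the containment check this comment calls for; `resolve_in_root` is a hypothetical helper, not the actual DocumentAnalyzer method:

```python
# Hypothetical helper, not the actual DocumentAnalyzer API: resolve the
# user-supplied path and refuse it if it escapes the project root.
from pathlib import Path


def resolve_in_root(project_root: Path, user_path: str) -> Path:
    """Return the resolved path, raising PermissionError if it leaves root."""
    root = project_root.resolve()
    # Joining with an absolute path yields that absolute path unchanged,
    # so this single check also covers inputs like "/etc/passwd".
    target = (root / user_path).resolve()
    if not target.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path escapes project root: {user_path}")
    return target
```

Calling this before the copy-to-.tmp step would close both the absolute-path and the `../` traversal variants in one place.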


# Create pydantic-ai tools - we'll call the underlying functions directly
# Use a Console that outputs to stderr to avoid corrupting JSONRPC on stdout
Contributor comment:

**Inline comments violate the comment policy**

The project's comment policy only permits inline comments that are: (1) top-of-file, (2) prefixed with `(H)`, or (3) type annotations. None of the newly added inline comments qualify.

This applies to every inline comment introduced in this PR:

- `codebase_rag/mcp/tools.py:84`: `# Use a Console that outputs to stderr to avoid corrupting JSONRPC on stdout`
- `codebase_rag/mcp/tools.py:105`: `# Create RAG orchestrator agent (lazy initialization for testing)`
- `codebase_rag/mcp/tools.py:534`: `# Suppress all logging output during agent execution`
- `codebase_rag/mcp/tools.py:536`: `# Temporarily redirect stdout and stderr to suppress all output`
- `codebase_rag/mcp/tools.py:538`: `# Temporarily disable loguru logging`
- `codebase_rag/mcp/tools.py:541`: `# Run the query using the RAG agent`
- `codebase_rag/mcp/tools.py:544`: `# Re-enable logging`
- `codebase_rag/mcp/tools.py:547`: `# Fail silently without logging or printing error details`

Remove or prefix all of these with `(H)` if the intent is genuinely human-authored explanation.

Context Used: Rule from `dashboard` - ## Technical Requirements / Agentic Framework / PydanticAI Only: This project uses PydanticAI... ([source](https://app.greptile.com/review/custom-context?memory=d4240b05-b763-467a-a6bf-94f73e8b6859))

stderr_console = Console(file=sys.stderr, width=None, force_terminal=True)
self._query_tool = create_query_tool(
-        ingestor=ingestor, cypher_gen=cypher_gen, console=None
+        ingestor=ingestor, cypher_gen=cypher_gen, console=stderr_console
)
self._code_tool = create_code_retrieval_tool(code_retriever=self.code_retriever)
self._file_editor_tool = create_file_editor_tool(file_editor=self.file_editor)
@@ -78,6 +93,17 @@ def __init__(
self._directory_lister_tool = create_directory_lister_tool(
directory_lister=self.directory_lister
)
self._shell_command_tool = create_shell_command_tool(
shell_commander=self.shell_commander
)
self._document_analyzer_tool = create_document_analyzer_tool(
self.document_analyzer
)
self._semantic_search_tool = create_semantic_search_tool()
self._function_source_tool = create_get_function_source_tool()

# Create RAG orchestrator agent (lazy initialization for testing)
self._rag_agent: Any = None
Contributor comment:

**Use `Agent` type instead of `Any`**

`self._rag_agent: Any = None` loses all type information. `create_rag_orchestrator` returns `Agent` (from `pydantic_ai`), so the field and the property should be typed accordingly:

```suggestion
        self._rag_agent: Agent | None = None
```

The import `from pydantic_ai import Agent` should be added at the top of the file, the `rag_agent` property signature updated to `-> Agent:`, and the setter to `value: Agent | None`.

This also applies to the property and setter signatures on lines 267 and 290.

Context Used: Rule from `dashboard` - ## Technical Requirements / Agentic Framework / PydanticAI Only: This project uses PydanticAI... ([source](https://app.greptile.com/review/custom-context?memory=d4240b05-b763-467a-a6bf-94f73e8b6859))


# Build tool registry - single source of truth for all tool metadata
self._tools: dict[str, ToolMetadata] = {
@@ -214,8 +240,57 @@ def __init__(
handler=self.list_directory,
returns_json=False,
),
"ask_agent": ToolMetadata(
name="ask_agent",
description="Ask the Code Graph RAG agent a question about the codebase. "
"Use this tool for general questions about the codebase, architecture, functionality, and code relationships. "
"Examples: 'How is the authentication implemented?', "
"'What are the main components of the system?', 'Where is the database connection configured?'",
input_schema={
"type": "object",
"properties": {
"question": {
"type": "string",
"description": "A question about the codebase, architecture, functionality, and code relationships. "
"Examples: 'What functions call UserService.create_user?', "
"'How is error handling implemented?', 'What are the main entry points?'",
}
},
"required": ["question"],
},
handler=self.ask_agent,
returns_json=True,
),
}

@property
def rag_agent(self) -> Any:
"""Lazy-initialize the RAG orchestrator agent on first access.

This allows tests to mock the agent without triggering LLM initialization.
"""
if self._rag_agent is None:
self._rag_agent = create_rag_orchestrator(
tools=[
self._query_tool,
self._code_tool,
self._file_reader_tool,
self._file_writer_tool,
self._file_editor_tool,
self._shell_command_tool,
self._directory_lister_tool,
self._document_analyzer_tool,
self._semantic_search_tool,
self._function_source_tool,
]
Contributor comment on lines +274 to +285 (security-high):

The ask_agent tool uses a nested RAG orchestrator agent that is configured with powerful, state-changing tools like execute_shell_command, write_file, and replace_code_surgically. While ShellCommander has a confirmation mechanism, ask_agent is non-interactive and suppresses logging. A malicious user can craft a question that tricks the nested agent into executing these tools by autonomously setting user_confirmed=True, effectively bypassing intended security controls without the actual user's oversight. It is recommended to restrict the rag_agent to read-only tools for non-interactive execution:

                tools=[
                    self._query_tool,
                    self._code_tool,
                    self._file_reader_tool,
                    self._directory_lister_tool,
                    self._document_analyzer_tool,
                    self._semantic_search_tool,
                    self._function_source_tool,
                ]

References:

1. For security-sensitive features like shell command execution by an LLM agent, prioritize fail-safe behavior. Prefer false positives (blocking a safe command) over false negatives (allowing a dangerous one), especially if the fix for the false positive adds complexity.
)
return self._rag_agent

@rag_agent.setter
def rag_agent(self, value: Any) -> None:
"""Allow setting the RAG agent (useful for testing)."""
self._rag_agent = value

async def index_repository(self) -> str:
"""Parse and ingest the repository into the Memgraph knowledge graph.

@@ -439,6 +514,43 @@ async def list_directory(self, directory_path: str = ".") -> str:
logger.error(f"[MCP] Error listing directory: {e}")
return f"Error: {str(e)}"

async def ask_agent(self, question: str) -> dict[str, Any]:
"""Ask a single question about the codebase and get an answer.

This tool executes the question using the RAG agent and returns the response
in a structured format suitable for MCP clients.

Logging is suppressed during execution to prevent token waste in LLM context.

Args:
question: The question to ask about the codebase

Returns:
Dictionary with 'output' key containing the answer
"""
import io
from contextlib import redirect_stderr, redirect_stdout
Contributor comment on lines +531 to +532:

**Move imports to module level**

`import io` and `from contextlib import redirect_stderr, redirect_stdout` are placed inside the function body. Per Python convention (PEP 8), imports belong at the top of the module. Move both to the top-level import block alongside the other stdlib imports:

```suggestion
        # Suppress all logging output during agent execution
        try:
```

The two import lines can be removed from here once they are moved to the module-level import section.

Context Used: Rule from `dashboard` - ## Technical Requirements / Agentic Framework / PydanticAI Only: This project uses PydanticAI... ([source](https://app.greptile.com/review/custom-context?memory=d4240b05-b763-467a-a6bf-94f73e8b6859))


# Suppress all logging output during agent execution
try:
# Temporarily redirect stdout and stderr to suppress all output
with redirect_stdout(io.StringIO()), redirect_stderr(io.StringIO()):
# Temporarily disable loguru logging
logger.disable("codebase_rag")
try:
# Run the query using the RAG agent
response = await self.rag_agent.run(question, message_history=[])
return {"output": response.output}
finally:
# Re-enable logging
logger.enable("codebase_rag")
Contributor comment on lines +537 to +546:

**Global logger suppression is not async-safe**

`logger.disable("codebase_rag")` modifies loguru's global state, which is shared across all coroutines. In a concurrent async server, if two `ask_agent` calls are in flight simultaneously, coroutine A's `logger.disable(...)` will suppress logging for coroutine B mid-execution, and coroutine B's `logger.enable(...)` in its `finally` block will re-enable logging while coroutine A still expects it to be disabled.

The same applies to `redirect_stdout(io.StringIO())` and `redirect_stderr(io.StringIO())`: these patch the global `sys.stdout`/`sys.stderr`, which are shared across all concurrently running coroutines. Output from unrelated coroutines running while these redirects are active will be silently dropped into the `StringIO` buffer.

A per-handler loguru sink approach or structlog's contextvars-based filtering would be safe for async; the global mutable approach used here is not.

except Exception:
# Fail silently without logging or printing error details
return {
"output": "There was an error processing your question",
"error": True,
}
Contributor comment on lines +547 to +552:

**Errors swallowed silently after logging is re-enabled**

By the time the outer `except Exception:` block is reached, the `finally` clause on line 544 has already re-enabled logging via `logger.enable("codebase_rag")`. The exception is therefore catchable and loggable, but the block discards it entirely, returning only a generic "There was an error processing your question" string with no trace of what went wrong.

This makes production debugging extremely difficult. The exception should at minimum be logged before returning the fallback response (note that loguru attaches the traceback via `logger.exception`, not the stdlib-style `exc_info=True` keyword):

```python
except Exception as e:
    logger.exception(f"[MCP] ask_agent failed: {e}")
    return {
        "output": "There was an error processing your question",
        "error": True,
    }
```

Contributor comment on lines +547 to +552 (high):

While it's important to return a generic error to the client to avoid leaking implementation details or corrupting the JSON-RPC stream, swallowing the exception without logging it on the server side makes debugging failures nearly impossible. The try...finally block for logger.enable ensures that logging is re-enabled before this except block is reached. Therefore, the exception should be logged here to aid maintainability and troubleshooting. Additionally, if a broad Exception is caught due to unpredictable errors from external libraries, a comment explaining this rationale should be added for maintainability, as per repository guidelines. (Using loguru's `logger.exception` captures the traceback, which stdlib-style `exc_info=True` would not do under loguru.)

        except Exception as e:
            # Log the error for debugging, but return a generic message to the client.
            # Catching a broad Exception here is pragmatic due to the wide and unpredictable
            # variety of exceptions that can be raised by external libraries.
            logger.exception(f"[MCP] Error in ask_agent: {e}")
            return {
                "output": "There was an error processing your question",
                "error": True,
            }

References:

1. When handling errors from external libraries that may raise multiple types of exceptions for configuration issues (e.g., ValueError for invalid formats and AssertionError for missing keys), catch a tuple of these specific exceptions, such as (ValueError, AssertionError), instead of a generic Exception.
2. When interacting with external libraries (like pydantic-ai) that can raise non-standard exceptions (e.g., AssertionError for configuration issues), it may be necessary to catch a broader Exception or a tuple of specific exceptions (ValueError, AssertionError) to ensure all user-facing errors are handled gracefully.
3. If catching a broad Exception is a pragmatic choice due to a wide and unpredictable variety of exceptions from external libraries, add a comment explaining the rationale to improve code maintainability.


def get_tool_schemas(self) -> list[dict[str, Any]]:
"""Get MCP tool schemas for all registered tools.
