fix: improve Cypher query generation accuracy #294
base: main
@@ -26,9 +26,30 @@ def _create_provider_model(config: ModelConfig) -> Model:


 def _clean_cypher_response(response_text: str) -> str:
-    query = response_text.strip().replace(cs.CYPHER_BACKTICK, "")
-    if query.startswith(cs.CYPHER_PREFIX):
-        query = query[len(cs.CYPHER_PREFIX) :].strip()
+    """Clean LLM response to extract pure Cypher query.
+
+    Handles markdown formatting that models sometimes output:
+    - Triple backticks (```cypher ... ```)
+    - Bold text (**Cypher Query:**)
+    - Headers and other markdown
+    """
+    import re
Contributor
Per PEP 8, imports should be at the top of the file. Please remove this import from here and add it at the top of the file instead.
Contributor
Path: codebase_rag/services/llm.py
Line: 36:36
Comment:
Move `import re` to the top-level imports (after line 1). Imports belong with the other stdlib imports at the top of the file, not inside functions.
```suggestion
"""Clean LLM response to extract pure Cypher query.
Handles markdown formatting that models sometimes output:
- Triple backticks (```cypher ... ```)
- Bold text (**Cypher Query:**)
- Headers and other markdown
"""
```
+
+    query = response_text.strip()
+
+    # Extract content from code blocks if present (```cypher ... ``` or ``` ... ```)
+    code_block_match = re.search(r"```(?:cypher)?\s*(.*?)```", query, re.DOTALL | re.IGNORECASE)
+    if code_block_match:
+        query = code_block_match.group(1).strip()
+    else:
+        # Remove markdown bold/headers (e.g., **Cypher Query:**)
+        query = re.sub(r"\*\*[^*]+\*\*:?\s*", "", query)
+        # Remove single backticks
+        query = query.replace(cs.CYPHER_BACKTICK, "")
+        # Remove "cypher" prefix if present
+        if query.lower().startswith(cs.CYPHER_PREFIX):
+            query = query[len(cs.CYPHER_PREFIX):].strip()
Comment on lines +44 to +51
Contributor
The current logic in the
Comment on lines +50 to +51
Contributor
Path: codebase_rag/services/llm.py
Line: 50:51
Comment:
Case mismatch: the check calls `query.lower().startswith()` but passes `cs.CYPHER_PREFIX` unchanged. If `cs.CYPHER_PREFIX = "cypher"` and the query is `"CYPHER MATCH..."`, slicing by `len("cypher")` (6 chars) works correctly. However, for safety and clarity, use consistent casing on both sides of the comparison.
```suggestion
if query.lower().startswith(cs.CYPHER_PREFIX.lower()):
    query = query[len(cs.CYPHER_PREFIX):].strip()
```
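To make the case-mismatch comment concrete, here is a minimal sketch comparing the check from the diff with the suggested one. The value of `cs.CYPHER_PREFIX` is not shown in this PR, so the prefix strings below are assumptions, and `strip_prefix_original` and `strip_prefix_suggested` are hypothetical helper names used only for illustration.

```python
def strip_prefix_original(query: str, prefix: str) -> str:
    # Mirrors the check in the diff: only the query is lowercased.
    if query.lower().startswith(prefix):
        query = query[len(prefix):].strip()
    return query


def strip_prefix_suggested(query: str, prefix: str) -> str:
    # Mirrors the suggestion: both sides are lowercased before comparing.
    if query.lower().startswith(prefix.lower()):
        query = query[len(prefix):].strip()
    return query


# With an all-lowercase prefix (assumed value) the two versions agree.
print(strip_prefix_original("CYPHER MATCH (n) RETURN n", "cypher"))
print(strip_prefix_suggested("CYPHER MATCH (n) RETURN n", "cypher"))

# With a hypothetical mixed-case prefix, only the suggested version matches.
print(strip_prefix_original("cypher MATCH (n) RETURN n", "Cypher"))
print(strip_prefix_suggested("cypher MATCH (n) RETURN n", "Cypher"))
```

If the constant is already lowercase, the two checks are equivalent; the suggested form simply stops depending on that.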
+
     if not query.endswith(cs.CYPHER_SEMICOLON):
         query += cs.CYPHER_SEMICOLON
     return query
According to the project's general rules, docstrings are not allowed. Please remove this docstring to adhere to the project's coding standards.
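Taken together, the review feedback on this function (module-level `import re`, no docstring, case-insensitive prefix check) could be applied roughly as in the sketch below. This is not the merged code. The three constants are assumed stand-ins for `cs.CYPHER_BACKTICK`, `cs.CYPHER_PREFIX` and `cs.CYPHER_SEMICOLON`, whose real values are not shown in this PR. `clean_cypher_response`, `_CODE_BLOCK_RE` and `_BOLD_RE` are hypothetical names, and placing the prefix check inside the `else` branch is an assumption about the original indentation.

```python
import re

# Assumed stand-ins for the `cs` constants; the real values are not shown in the PR.
CYPHER_BACKTICK = "`"
CYPHER_PREFIX = "cypher"
CYPHER_SEMICOLON = ";"

# Matches a fenced block with or without a "cypher" tag and captures its body.
_CODE_BLOCK_RE = re.compile(r"```(?:cypher)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)
# Matches markdown bold labels such as **Cypher Query:** plus trailing whitespace.
_BOLD_RE = re.compile(r"\*\*[^*]+\*\*:?\s*")


def clean_cypher_response(response_text: str) -> str:
    # Comments replace the docstring, following the reviewer's note on project rules.
    query = response_text.strip()
    match = _CODE_BLOCK_RE.search(query)
    if match:
        # Prefer the contents of a fenced code block when one is present.
        query = match.group(1).strip()
    else:
        # Otherwise strip bold labels, stray backticks, and a leading "cypher" tag.
        query = _BOLD_RE.sub("", query)
        query = query.replace(CYPHER_BACKTICK, "")
        if query.lower().startswith(CYPHER_PREFIX.lower()):
            query = query[len(CYPHER_PREFIX):].strip()
    # Ensure the query is terminated exactly as the original function does.
    if not query.endswith(CYPHER_SEMICOLON):
        query += CYPHER_SEMICOLON
    return query


print(clean_cypher_response("```cypher\nMATCH (n) RETURN n\n```"))
print(clean_cypher_response("**Cypher Query:** MATCH (n) RETURN n;"))
print(clean_cypher_response("CYPHER MATCH (n) RETURN n"))
```

One behavior the sketch inherits from the diff: outside a fenced block every backtick is removed, which would also strip backtick-quoted identifiers if a query used them.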