Here are some key observations to aid the review process:
⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 Security concerns
Path traversal: Upload and deletion endpoints in graphrag/app/routers/ui.py use user-provided filenames directly in filesystem paths (e.g., os.path.join(upload_dir, file.filename) and os.path.join(upload_dir, filename)). This allows attackers to traverse directories or overwrite arbitrary files. Sanitize via os.path.basename, restrict allowed characters, and reject paths containing path separators.
Potential command injection in GSQL: Endpoints constructing GSQL with f"CREATE GRAPH {graphname}()" accept unvalidated path params. Validate graph names against a strict regex (e.g., ^[A-Za-z_][A-Za-z0-9_]*$) or use APIs that avoid string interpolation.
Sensitive information exposure in logs: Cloud download handlers may log exceptions that include credentials (e.g., gcs_credentials_json parsing errors, SDK ClientError). Ensure logs redact secrets and remove full trace logs containing credentials (logger.debug_pii).
Minor info exposure: init_graph returns host_name which might reveal internal topology. Consider omitting or gating behind debug.
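The path traversal and GSQL findings above both come down to validating untrusted names before they reach the filesystem or a query string. Below is a minimal sketch of one way to do that, not the project's actual code: `safe_upload_path`, `validate_graphname`, and both allow-list patterns are hypothetical names introduced for illustration. It rejects suspicious input outright rather than silently rewriting it with `os.path.basename`.

```python
import os
import re

# Hypothetical allow-lists (assumptions); tune them to the project's real naming rules.
_SAFE_FILENAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,254}")
_SAFE_GRAPHNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_upload_path(upload_dir: str, filename: str) -> str:
    """Return a path inside upload_dir for filename, or raise ValueError."""
    # Reject path separators and NUL bytes before doing anything else.
    if "/" in filename or "\\" in filename or "\x00" in filename:
        raise ValueError("filename must not contain path separators")
    # Restrict to a conservative character set; this also rejects "" and "..".
    if not _SAFE_FILENAME.fullmatch(filename):
        raise ValueError("filename contains disallowed characters")
    base = os.path.realpath(upload_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # Defense in depth: the resolved path must still sit under upload_dir.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("resolved path escapes the upload directory")
    return candidate


def validate_graphname(graphname: str) -> str:
    """Return graphname unchanged if it is a plain identifier, else raise."""
    if not _SAFE_GRAPHNAME.fullmatch(graphname):
        raise ValueError("invalid graph name")
    return graphname
```

In a router, the validated values would then replace the raw inputs, for example `f"CREATE GRAPH {validate_graphname(graphname)}()"`, with the `ValueError` mapped to a 4xx response. Raising on bad input instead of normalizing it keeps a request for `../../etc/passwd` from quietly becoming a write to `passwd` in the upload directory.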
Small image filter currently does nothing (pass) and continues processing, likely unintended. Either remove the check or explicitly skip/return to avoid processing tiny images.
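To illustrate the difference between `pass` and an explicit skip, here is a small sketch under stated assumptions: `iter_processable_images`, the `(path, width, height)` tuple shape, and the 100-pixel threshold are all hypothetical and not taken from the PR.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical threshold in pixels (assumption; the PR's real value is not shown).
MIN_IMAGE_SIDE = 100

# Hypothetical record shape: (path, width, height).
ImageInfo = Tuple[str, int, int]


def iter_processable_images(
    images: Iterable[ImageInfo], min_side: int = MIN_IMAGE_SIDE
) -> Iterator[ImageInfo]:
    """Yield only images whose width and height both reach min_side."""
    for path, width, height in images:
        if width < min_side or height < min_side:
            # `continue` moves on to the next image; a bare `pass` here would
            # fall through and the tiny image would still be yielded below.
            continue
        yield (path, width, height)
```

With this shape, downstream code such as the vision-description step only ever sees images that passed the size check.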
PR Type
Enhancement, Documentation, Other
Description
Graph lifecycle and ingestion REST APIs
ECC rebuild tracking and status endpoint
PDF extraction via `pymupdf4llm` with `tg://` images
UI setup, uploads, and cloud downloads
Diagram Walkthrough
File Walkthrough
13 files
Add graph APIs, ingestion, uploads, and cloud download
Use pymupdf4llm; emit `tg://` image references
Path-based LLM vision; remove legacy saver
New Markdown image parser and replacer
Track rebuild tasks; add rebuild status endpoint
Rename forceupdate handler; add create graph API
Convert `tg://` image links to UI endpoint
Convert `tg://` image links in responses
Add Setup button and logout confirmation dialog
Refresh graph list on focus/navigation
New reusable confirmation dialog component
New hook to request user confirmation
Register `/setup` route and layout integration
6 files
Update comments for markdown image handling
Clarify preserving markdown image references
Note single chunker preserves image references
Update OpenAI/Bedrock config and prompts path
Instruct preserving markdown image references
Add pymupdf4llm AGPL-3.0 license file
2 files
Proxy `/{graph}/graphrag/*` to backend
Increase upload size and timeouts for `/ui`
1 file
Replace PyMuPDF with pymupdf4llm; add multipart
5 files