Fix path traversal and unit test gate stability#162
Fix path traversal and unit test gate stability#162heidi-dang wants to merge 1 commit intofeat/bootstrap-scaffoldfrom
Conversation
This commit addresses several security and stability issues: 1. Fixes a path traversal vulnerability in telemetry and dashboard by sanitizing `run_id` inputs. 2. Fixes a SyntaxError in the unit test gate caused by improper indentation of multi-line code blocks. 3. Modernizes UTC timestamp generation to avoid deprecation warnings in Python 3.12+. 4. Removes a redundant and broken cache check in the telemetry module. 5. Standardizes the default AUTOTRAIN_DIR to ~/.local/heidi_engine for consistency across tools.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the security and stability of the system. It addresses potential path traversal vulnerabilities by sanitizing user inputs, preventing unauthorized file access. Furthermore, it improves the reliability of the unit test gate by correcting an indentation bug that affected multi-line code execution. The changes also include modernizing datetime handling and cleaning up faulty cache logic in the telemetry module, contributing to overall code quality and maintainability. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses path traversal vulnerabilities, improves unit test stability, and updates deprecated datetime functions. I've reviewed the changes and have one suggestion in heidi_engine/telemetry.py to improve the efficiency and clarity of the get_run_id function. The security fixes and other updates appear correct.
| global RUN_ID | ||
| if not RUN_ID: | ||
| RUN_ID = os.environ.get("RUN_ID", "") | ||
| if not RUN_ID: | ||
| RUN_ID = str(uuid.uuid4())[:8] | ||
| RUN_ID = f"run_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{RUN_ID}" | ||
|
|
||
| # SECURITY: Sanitize RUN_ID to prevent path traversal | ||
| RUN_ID = Path(RUN_ID).name | ||
| return RUN_ID |
There was a problem hiding this comment.
The current implementation of get_run_id re-evaluates and re-sanitizes the global RUN_ID on every call, which is inefficient. A better approach is to check for the existence of the global RUN_ID at the beginning and return it if it's already set. The generation and sanitization logic should only run once when the RUN_ID is first requested.
| global RUN_ID | |
| if not RUN_ID: | |
| RUN_ID = os.environ.get("RUN_ID", "") | |
| if not RUN_ID: | |
| RUN_ID = str(uuid.uuid4())[:8] | |
| RUN_ID = f"run_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{RUN_ID}" | |
| # SECURITY: Sanitize RUN_ID to prevent path traversal | |
| RUN_ID = Path(RUN_ID).name | |
| return RUN_ID | |
| global RUN_ID | |
| if RUN_ID: | |
| return RUN_ID | |
| run_id_val = os.environ.get("RUN_ID", "") | |
| if not run_id_val: | |
| run_id_val = str(uuid.uuid4())[:8] | |
| run_id_val = f"run_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{run_id_val}" | |
| # Sanitize, set global, and return | |
| RUN_ID = Path(run_id_val).name | |
| return RUN_ID |
Implemented path traversal protection in
heidi_engine/telemetry.pyandheidi_engine/dashboard.pyby usingPath(run_id).nameto sanitize user-provided run identifiers. This prevents malicious actors from reading or writing files outside the intendedruns/directory.Additionally, fixed a critical bug in
scripts/03_unit_test_gate.pywhere multi-line Python code samples would fail to execute due to incorrect indentation within the execution wrapper. The wrapper now usestextwrap.indentto ensure proper nesting.Finally, modernized
heidi_engine/telemetry.pyby replacing deprecateddatetime.utcnow()calls withdatetime.now(timezone.utc)and removed a broken redundant cache check that incorrectly referenced an undefined variable.All changes were verified with targeted scripts and existing project tests (21 passed). Temporary artifacts were removed before submission.
PR created automatically by Jules for task 3327075848239083014 started by @heidi-dang