π‘οΈ Sentinel: [CRITICAL] Fix RCE in PDF compilation#210
π‘οΈ Sentinel: [CRITICAL] Fix RCE in PDF compilation#210
Conversation
Co-authored-by: anchapin <6326294+anchapin@users.noreply.github.com>
|
π Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a π emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Reviewer's GuideHardened LaTeXβPDF compilation in both the cover letter generator and generic PDF converter by disabling shell escapes and adding timeouts to external pdflatex/pandoc calls to prevent RCE and DoS. Sequence diagram for secured LaTeX to PDF compilation with timeoutsequenceDiagram
actor User
participant CoverLetterGenerator
participant PDFConverter
participant Subprocess as subprocess_Popen
User->>CoverLetterGenerator: generate_cover_letter()
CoverLetterGenerator->>CoverLetterGenerator: _compile_pdf(output_path, tex_content)
CoverLetterGenerator->>Subprocess: Popen(["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name])
Note over Subprocess: Start pdflatex process in tex_path.parent
CoverLetterGenerator->>Subprocess: communicate(timeout=30)
alt pdflatex completes in time
Subprocess-->>CoverLetterGenerator: stdout, stderr
alt returncode == 0 or PDF exists
CoverLetterGenerator-->>User: pdf_created = True
else pdflatex fails
CoverLetterGenerator->>Subprocess: Popen(["pandoc", tex_path, "-o", output_path, "--pdf-engine=xelatex", "--pdf-engine-opt=-no-shell-escape"])
Note over Subprocess: Start pandoc process
CoverLetterGenerator->>Subprocess: communicate(timeout=30)
alt pandoc completes in time
Subprocess-->>CoverLetterGenerator: stdout, stderr
alt returncode == 0 or PDF exists
CoverLetterGenerator-->>User: pdf_created = True
else pandoc fails
CoverLetterGenerator-->>User: pdf_created = False
end
else pandoc times out
CoverLetterGenerator->>Subprocess: kill()
Subprocess-->>CoverLetterGenerator: communicate()
CoverLetterGenerator-->>User: pdf_created = False
end
end
else pdflatex times out
CoverLetterGenerator->>Subprocess: kill()
Subprocess-->>CoverLetterGenerator: communicate()
Note over CoverLetterGenerator: Abort pdflatex path
CoverLetterGenerator->>Subprocess: Popen(["pandoc", tex_path, "-o", output_path, "--pdf-engine=xelatex", "--pdf-engine-opt=-no-shell-escape"])
CoverLetterGenerator->>Subprocess: communicate(timeout=30)
alt pandoc completes in time
Subprocess-->>CoverLetterGenerator: stdout, stderr
alt returncode == 0 or PDF exists
CoverLetterGenerator-->>User: pdf_created = True
else pandoc fails
CoverLetterGenerator-->>User: pdf_created = False
end
else pandoc times out
CoverLetterGenerator->>Subprocess: kill()
Subprocess-->>CoverLetterGenerator: communicate()
CoverLetterGenerator-->>User: pdf_created = False
end
end
User->>PDFConverter: convert_to_pdf(tex_path, output_path)
PDFConverter->>PDFConverter: _compile_pdflatex(tex_path, working_dir, output_path)
PDFConverter->>Subprocess: Popen(["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name])
PDFConverter->>Subprocess: communicate(timeout=30)
alt pdflatex completes and succeeds
Subprocess-->>PDFConverter: stdout, stderr
PDFConverter-->>User: True
else pdflatex fails or times out
opt timeout path
PDFConverter->>Subprocess: kill()
Subprocess-->>PDFConverter: communicate()
end
PDFConverter->>PDFConverter: _compile_pandoc(tex_path, working_dir, output_path)
PDFConverter->>Subprocess: Popen(["pandoc", tex_path, "-o", output_path, "--pdf-engine=xelatex", "--pdf-engine-opt=-no-shell-escape"])
PDFConverter->>Subprocess: communicate(timeout=30)
alt pandoc completes and succeeds
Subprocess-->>PDFConverter: stdout, stderr
PDFConverter-->>User: True
else pandoc fails or times out
opt timeout path
PDFConverter->>Subprocess: kill()
Subprocess-->>PDFConverter: communicate()
end
PDFConverter-->>User: False
end
end
Updated class diagram for secure PDF compilation helpersclassDiagram
class CoverLetterGenerator {
+_compile_pdf(output_path Path, tex_content str) bool
}
class PDFConverter {
+_compile_pdflatex(tex_path Path, working_dir Path, output_path Path) bool
+_compile_pandoc(tex_path Path, working_dir Path, output_path Path) bool
}
class SubprocessWrapper {
+run_pdflatex(tex_path Path, working_dir Path, output_path Path) bool
+run_pandoc(tex_path Path, working_dir Path, output_path Path) bool
-NO_SHELL_ESCAPE_FLAG str
-TIMEOUT_SECONDS int
}
CoverLetterGenerator ..> SubprocessWrapper : uses
PDFConverter ..> SubprocessWrapper : uses
class ExternalTool {
<<interface>>
+name str
}
class Pdflatex {
+name str
+flags str[]
}
class Pandoc {
+name str
+flags str[]
}
SubprocessWrapper ..> ExternalTool : invokes
Pdflatex ..|> ExternalTool
Pandoc ..|> ExternalTool
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The timeout handling logic around
process.communicateis duplicated in four places; consider extracting a small helper (e.g.,run_with_timeout(cmd, cwd, timeout=30)) to centralize the pattern ofPopen+communicate+ timeout/kill handling. - When a timeout occurs you currently kill the process and immediately return
Falsewithout surfacing any context; consider at least logging or propagating the fact that the failure was due to a timeout so callers and operators can distinguish it from normal compilation errors.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The timeout handling logic around `process.communicate` is duplicated in four places; consider extracting a small helper (e.g., `run_with_timeout(cmd, cwd, timeout=30)`) to centralize the pattern of `Popen` + `communicate` + timeout/kill handling.
- When a timeout occurs you currently kill the process and immediately return `False` without surfacing any context; consider at least logging or propagating the fact that the failure was due to a timeout so callers and operators can distinguish it from normal compilation errors.
## Individual Comments
### Comment 1
<location path="cli/pdf/converter.py" line_range="87-96" />
<code_context>
try:
# Use Popen with explicit cleanup to avoid double-free issues
process = subprocess.Popen(
- ["pdflatex", "-interaction=nonstopmode", tex_path.name],
+ ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
cwd=tex_path.parent,
)
- stdout, stderr = process.communicate()
+ try:
+ stdout, stderr = process.communicate(timeout=30)
+ except subprocess.TimeoutExpired:
+ process.kill()
+ stdout, stderr = process.communicate()
</code_context>
<issue_to_address>
**issue (bug_risk):** Killing the process on timeout has a small race condition that can raise OSError if the process exits just before kill() is called.
To avoid that exception, wrap the termination calls defensively, e.g.:
```python
try:
stdout, stderr = process.communicate(timeout=30)
except subprocess.TimeoutExpired:
try:
process.terminate()
except OSError:
pass # process already exited
try:
stdout, stderr = process.communicate(timeout=5)
except subprocess.TimeoutExpired:
try:
process.kill()
except OSError:
pass
stdout, stderr = process.communicate()
```
At minimum, wrap `process.kill()` in a `try/except OSError` to prevent spurious failures.
</issue_to_address>Help me be more useful! Please click π or π on each comment and I'll use the feedback to improve your reviews.
| try: | ||
| process = subprocess.Popen( | ||
| ["pdflatex", "-interaction=nonstopmode", tex_path.name], | ||
| ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name], | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.PIPE, | ||
| cwd=working_dir, | ||
| ) | ||
| stdout, stderr = process.communicate() | ||
| try: | ||
| stdout, stderr = process.communicate(timeout=30) | ||
| except subprocess.TimeoutExpired: |
There was a problem hiding this comment.
issue (bug_risk): Killing the process on timeout has a small race condition that can raise OSError if the process exits just before kill() is called.
To avoid that exception, wrap the termination calls defensively, e.g.:
try:
stdout, stderr = process.communicate(timeout=30)
except subprocess.TimeoutExpired:
try:
process.terminate()
except OSError:
pass # process already exited
try:
stdout, stderr = process.communicate(timeout=5)
except subprocess.TimeoutExpired:
try:
process.kill()
except OSError:
pass
stdout, stderr = process.communicate()At minimum, wrap process.kill() in a try/except OSError to prevent spurious failures.
Co-authored-by: anchapin <6326294+anchapin@users.noreply.github.com>
π¨ Severity: CRITICAL
π‘ Vulnerability: The
CoverLetterGeneratorandPDFConverterclasses were usingpdflatexandpandocto compile LaTeX source into PDFs. However, they lacked the-no-shell-escapeflag and a subprocess timeout. This allows an attacker (or hallucinating AI) to inject LaTeX code capable of executing arbitrary shell commands via\write18or perform Local File Inclusion (LFI). Additionally, an untrusted compilation could cause infinite loops resulting in Denial of Service (DoS).π― Impact: A user generating a cover letter from an AI output or converting a manipulated
.texfile could experience a full Remote Code Execution (RCE) on their machine, or a complete process hang.π§ Fix: Added the
-no-shell-escapeflag to bothpdflatexdirectly andpandoc(via--pdf-engine-opt=-no-shell-escape). Also added a 30-second timeout toprocess.communicate(timeout=30), gracefully catching the exception to kill the process to prevent zombie processes.β Verification: Ran
pytest tests/test_pdf_security.pyandpytest tests/test_cover_letter_security.pyto ensure the vulnerability is mitigated. Also executed the fullpytestsuite ensuring 681 tests pass cleanly.PR created automatically by Jules for task 6615092504026383693 started by @anchapin
Summary by Sourcery
Harden LaTeX-based PDF generation against remote code execution and hangs in both cover letter generation and generic PDF conversion flows.
Bug Fixes: