Fix shell injection vulnerability and preserve command substitution syntax#26
Conversation
Replace create_subprocess_shell() with create_subprocess_exec() to prevent arbitrary command execution through malicious job specifications. Arguments are now passed as a list directly to exec instead of being joined into a shell command string. - Changed asyncio.create_subprocess_shell to create_subprocess_exec - Remove string concatenation with " ".join() - Updated tests to verify list-based argument passing - Added test case to ensure special characters are handled safely Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The expandvars library was incorrectly processing command substitution
syntax like $(hostname) and converting it to (hostname), causing bash
syntax errors when jobs were executed.
Added utility functions to preserve command substitution patterns:
- find_command_substitutions(): Parses $(cmd) and \`cmd\` with proper
nested parentheses support using a depth-tracking algorithm
- expand_vars_preserve_commands(): Replaces command substitutions with
unique placeholders before expandvars processing, then restores them
This maintains all expandvars features (${VAR:-default}, nested vars, etc)
while ensuring shell command substitutions are passed through unchanged.
Changes:
- gator/common/utility.py: Added command substitution preservation utilities
- gator/wrapper.py: Updated to use new utility functions for arg expansion
- tests/test_command_substitution.py: Comprehensive test suite (25 tests)
- tests/test_wrapper.py: Added integration test for command substitution
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhanced find_command_substitutions() to properly handle shell quoting: - Tracks single and double quote context while parsing - Respects backslash escaping (only in double quotes and outside quotes) - Correctly handles parentheses inside quoted strings like $(echo ")") - Prevents premature closing on quoted characters Added 28 comprehensive test cases covering: - Parentheses in single and double quotes - Escaped characters and backslashes - Nested quotes in nested command substitutions - Mixed quoting scenarios - ANSI-C quoting ($'...') - Arithmetic expansion $((..)) - Parameter expansion with special chars - Commands with pipes, semicolons, redirects - Glob patterns and brace expansion This ensures commands like $(echo ")") are correctly parsed as a single command substitution rather than failing on the quoted closing paren. All 134 tests pass (106 original + 28 new).
When running non-shell commands (anything other than bash, sh, zsh, etc.), Gator now logs a warning if command substitutions like $(cmd) or `cmd` are detected in the command or arguments. This helps users understand that Gator only evaluates simple environment variables and command substitutions will pass through unevaluated to the target command. Shell commands are excluded from this warning since they handle command substitutions natively. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Intuity
left a comment
There was a problem hiding this comment.
I can't recreate the documented issue with expandvars - this needs more justification to add a custom expansion, as its significant code being introduced.
While I'm happy with the switch to using subprocess exec instead of shell due to lower resources, I'm not convinced in the justification - I don't believe it's in gator's purview to be sanity checking what the user is asking to execute.
gator/common/utility.py
Outdated
| def expand_vars_preserve_commands(text: str, environ: dict) -> str: | ||
| """ | ||
| Expand environment variables using expandvars, but preserve command |
There was a problem hiding this comment.
I don't understand the need for this, from trialling expandvars locally:
>>> import expandvars
>>> expandvars.expand("echo my command with $(pwd) and $USER $((1+2))")
'echo my command with $(pwd) and peterbirch $((1+2))'The $(pwd) and $((1+2)) (examples given) are preserved and not eliminated
Can you provide an example of where this has been an issue?
There was a problem hiding this comment.
I have given an example in the PR. The $ is stripped off meaning it's not preserved. expandvars does not handle $( properly.
There was a problem hiding this comment.
Maybe there's a version mismatch issue on the package??
There was a problem hiding this comment.
I just ran poetry run python within Gator, and this is what I get:
>>> import expandvars
>>> expandvars.expand("echo my command with $(pwd) and $USER $((1+2)) $(hostname)")
'echo my command with (pwd) and ednutting ((1+2)) (hostname)'
>>>
There was a problem hiding this comment.
For reference:
(gator-eda-py3.11) ednutting:Gator$ poetry show expandvars
name : expandvars
version : 0.9.0
description : Expand system variables Unix style
(gator-eda-py3.11) ednutting:Gator$ There was a problem hiding this comment.
Okay, confirmed, newer version of expandvars doesn't have the $( issue.
There was a problem hiding this comment.
Upgraded to newer package version and simplified the code.
The code could be simplified further to make the warning-detection code simpler/shorter by using a simple regex rather than the basic parser, but at the risk of generating slightly "noisier" warnings as any dollar or backtick (after expandvars has done its thing) would cause a warning to be generated.
Which would you prefer, simpler/shorter code with noisier warnings, or the current simple parser with more accurate warnings? Or just eliminate the warning altogether? (Though I think eliminating the warning may end up being a gotcha for developers in the future)
There was a problem hiding this comment.
Simpler approach committed on a separate branch: EdNutting@65237c5
| self.launched_processes[task.ident] = await asyncio.create_subprocess_shell( | ||
| " ".join(self.create_command(task, {"concurrency": granted})), | ||
| self.launched_processes[task.ident] = await asyncio.create_subprocess_exec( | ||
| *self.create_command(task, {"concurrency": granted}), |
There was a problem hiding this comment.
Having re-read the asyncio documentation, I can see why an LLM check may have flagged this as a security vulnerability as it says create_subprocess_shell is at risk of injection vulnerabilities
However...
Given that the whole point of gator is that you can submit any arbitrary command through a job - I'd argue this is not a "security fix".
I'm in agreement with the change as there's still a benefit here in that a sub-shell is not invoked, which makes it slightly lighter on resources.
There was a problem hiding this comment.
I'd mostly agree - I had the same initial view - and then I realised this subprocess invocation isn't supposed to be causing user-provided commands/args to be interpreted at this point on the host that runs this invocation. So, whether you think it's a security issue or not, it's a genuine problem that commands may end up running on a different machine / in a different place than the user may have intended or expected.
| # Check all monitors were fired up | ||
| as_mon.assert_has_calls([call(f"T{x}", y) for x, y in zip(range(10), procs)]) | ||
|
|
||
| async def test_scheduler_argument_safety(self, mocker, tmp_path): |
There was a problem hiding this comment.
I'm not sure I agree with the intent of this test - I don't think it falls to gator to protect the user from submitting jobs that can do damage. Gator is scheduling arbitrary commands, if that happens to contain an rm -rf / then that's on the upstream/encasing system to detect/prevent.
There was a problem hiding this comment.
This is protecting Gator from injecting stuff into itself that it didn't intend to, given that Gator invokes itself in a nested fashion where the nesting does end up including user-provided input which the user might not have expected would end up in an intermediary shell command and thus interpreted in the wrong place.
It happened to me, albeit with a non-destructive command.
Fix shell injection vulnerability and preserve command substitution syntax
This PR addresses a potential security vulnerability, and fixes incorrect handling of shell command substitution syntax in job arguments.
Summary
Three key issues resolved:
create_subprocess_shell()withcreate_subprocess_exec()inLocalSchedulerto prevent arbitrary command execution through malicious job specificationsexpandvarsincorrectly stripping command substitution syntax like$(hostname)and$((3 + 2)), which caused bash syntax errorsBehavior
The fix ensures proper handling of shell syntax based on the command being executed:
Execution results:
With bash/sh: Command substitutions are executed as expected
Without shell: Literal strings are preserved; env vars are substituted
Technical Changes
Test Plan
🤖 Co-authored with Claude Code