Conversation
- Add redirect operator detection and handling (_isRedirectOp, _getRedirectAdvance) - Support command substitutions inside redirect targets (e.g., echo x >) - Handle redirects inside nested command substitutions properly - Fix ambiguous numeric args before redirects to be preserved for analysis
- Add tests for redirects in rm commands (2>/dev/null, > /dev/null) - Test command substitution in redirect targets - Test nested command substitutions with redirects - Verify numeric args before redirects are preserved and analyzed
…edirect target parsing
…ithmetic expansion
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (2)
📝 WalkthroughWalkthroughRewrites shell parsing in Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #34 +/- ##
==========================================
- Coverage 99.53% 99.50% -0.03%
==========================================
Files 50 50
Lines 4540 5085 +545
==========================================
+ Hits 4519 5060 +541
- Misses 21 25 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Greptile SummaryThis PR fixes a false positive (#29) where Key changes and observations:
Confidence Score: 3/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["splitShellCommands(command)"] --> B["_stripAttachedIoNumbers()\nStrip IO-number digits (2>, 1>)\nConvert $((…)) to ARITH_SENTINEL\nConvert backtick to $(…)"]
B --> C["shell-quote parse()\nProduces ParseEntry tokens"]
C --> D{Token classifier}
D -->|Shell operator| E["Flush current segment\nStart new segment"]
D -->|Redirect op| F["_getRedirectTargetInfo()\nExtract redirect target string"]
F -->|Target contains dollar-paren| G["_pushInlineSubstitutionSegments()\nextractInlineCommandSubstitutions()"]
F -->|Target is glob pattern| H["advance:2 — skip glob, no segment"]
D -->|Process substitution| I["extractProcessSubstitution()\nRecurse into body"]
D -->|Dollar-paren cmd sub| J["extractCommandSubstitution()"]
J -->|Starts with ARITH_SENTINEL| K["_extractArithmeticSubstitution()\nCollect expression text\nExtract nested cmd subs"]
J -->|Normal content| L["Depth-track parens\nOperators flush segments\nRedirects call _getRedirectTargetInfo"]
D -->|String ending in dollar| M["_isAttachedCommandSubstitutionStart\nExtract prefix, recurse into sub"]
D -->|Plain string or glob| N["_getCommandTokenText()\n_pushInlineSubstitutionSegments()\ncurrent.push(token)"]
E --> D
G --> D
I --> D
K --> D
L --> D
M --> D
N --> D
Last reviewed commit: "fix(shell): handle g..." |
…hmetic expansion parsing
…arithmetic expansion
…rsing Add five test cases to improve line coverage for src/core/shell.ts from 95.62% to 99.31%: - Process substitution nested inside command substitution (lines 236-245) - Backtick with attached path suffix inside command substitution (line 271) - Attached command substitution inside command substitution (lines 278-292) - Unclosed backtick handling without hanging (lines 485-488, 1009, 1011) - Operator token inside parenthesized redirect target (lines 1060-1061) Remaining uncovered lines (895, 897, 1019, 1060-1061) are defensive/unreachable code where callers pre-check conditions before calling these code paths.
|
@greptileai review the PR |
|
|
||
| if (!inSingle && char === '`') { | ||
| const endIndex = _findBacktickEnd(command, i + 1); | ||
| if (endIndex === -1) { | ||
| result += char; | ||
| atTokenBoundary = false; | ||
| i++; | ||
| continue; | ||
| } | ||
|
|
||
| result += `$(${command.slice(i + 1, endIndex)})`; | ||
| if ( | ||
| atTokenBoundary && | ||
| command[endIndex + 1] && | ||
| _isPathLikeBacktickSuffix(command[endIndex + 1] as string) | ||
| ) { | ||
| result += BACKTICK_ATTACHED_SUFFIX_SENTINEL; | ||
| } | ||
| atTokenBoundary = false; | ||
| i = endIndex + 1; | ||
| continue; |
There was a problem hiding this comment.
Backtick inside double-quoted string converts to
$(...) before shell-quote sees it
_stripAttachedIoNumbers runs before shell-quote and converts every unquoted backtick to $(...) form. The guard is only !inSingle, so backticks inside double-quoted strings are also transformed.
In bash, "foo`cmd`bar" is valid command-substitution-inside-a-double-quoted-string, so the transformation is semantically correct. However, after the transformation the result is "foo$(cmd)bar", still inside the double-quoted span. When shell-quote later tokenises this, the whole "foo$(cmd)bar" token ends up as a single string token (shell-quote doesn't split command substitutions inside double-quoted tokens). That string token is later passed to extractInlineCommandSubstitutions, which does handle $(...) inside strings, so the nested command is analysed correctly.
The subtle issue is the BACKTICK_ATTACHED_SUFFIX_SENTINEL injection that happens at line 497: the sentinel is appended when atTokenBoundary is true and the character after the closing backtick is / or .. At this point we are inside inDouble = true, so atTokenBoundary was set to false when the " was processed (line 477). That means the sentinel is never appended for backtick-in-double-quotes, so this edge case does not actually fire and the sentinel logic is harmless here.
The real concern is a different one: if inDouble is true when the backtick is encountered, the backtick-end search at line 483 (_findBacktickEnd) does not account for quote context — it simply looks for the next unescaped backtick regardless of whether that backtick is inside a nested single-quoted span. A fragment like "'a`b`c'" (backtick inside single-quotes inside double-quotes) would incorrectly be transformed into "'a$(bc'"(splitting at the first backtick and leaving a stray backtick). Consider adding a!inDoubleguard, or making_findBacktickEnd` quote-aware.
| if (typeof token === 'string') { | ||
| _pushInlineSubstitutionSegments(innerSegments, token); | ||
| expression += token; | ||
| i++; | ||
| continue; | ||
| } |
There was a problem hiding this comment.
_extractArithmeticSubstitution: expression retained in output after inline-sub extraction
When a plain string token is encountered inside an arithmetic expression (e.g., the foo in $((foo+$(cmd)))), both things happen:
_pushInlineSubstitutionSegments(innerSegments, token)— scans the token for any embedded$(...)and appends any found inner-command segments toinnerSegments.expression += token— the raw token text (which still contains any$(...)syntax) is appended to the running expression string.
If that token contains an embedded substitution such as foo$(bar)baz (which shell-quote would deliver as the single string foo$(bar)baz), step 1 correctly extracts ['bar'] into innerSegments, but step 2 leaves foo$(bar)baz in expression. When the expression is eventually flushed as [expression] it will contain the raw $(bar) syntax. This segment is then seen by the rule engine as an expression string — not a danger in practice today because no rule would match on $(bar), but it makes the expression segment misleading and inconsistent with how the same situation is handled in extractCommandSubstitution (which clears currentSegment before recursing into the inner substitution).
Consider stripping the already-extracted inline substitution spans from the expression string before appending, or skipping the expression += step for any portion that was captured by _pushInlineSubstitutionSegments.
|
@greptileai review the PR |
| if (tokens[startIndex] === ARITHMETIC_SENTINEL) { | ||
| return _extractArithmeticSubstitution(tokens, startIndex); | ||
| } |
There was a problem hiding this comment.
Sentinel collision causes false negative for crafted commands
extractCommandSubstitution checks tokens[startIndex] === ARITHMETIC_SENTINEL before anything else. If an analyzed command contains the literal sentinel string in a $(...) substitution — e.g. $(__CC_SAFETY_NET_ARITH_SENTINEL__ rm -rf /) — then _extractArithmeticSubstitution is invoked instead of the normal command-parsing path.
Inside _extractArithmeticSubstitution, plain string tokens (like rm, -rf, /) are appended to an expression buffer and flushed as a single joined segment ['rm-rf/']. Because those tokens are concatenated without separators, the resulting segment rm-rf/ does not match the rm rule and the dangerous command is never detected.
Nested $(...) inside the sentinel-triggered path are still analyzed correctly (the _isCommandSubstitutionStart guard at line 321 handles them), so the false negative only applies to direct argument tokens at the top level of the fake arithmetic expression.
While the sentinel is sufficiently opaque to avoid accidental collisions in real shell scripts, the fix is inexpensive: export the sentinel as an unexported module-level symbol and make extractCommandSubstitution verify that the sentinel was actually injected by _stripAttachedIoNumbers (e.g. by checking the preceding token was $ and the one before that was the arithmetic $(( position) rather than trusting that any matching string is legitimate. Alternatively, keep the sentinel approach but additionally verify that tokens[startIndex - 1] is a ( operator (i.e. the sentinel was placed there by the pre-processor).
| if (depth === 1 && _isRedirectOp(token)) { | ||
| const { redirectTarget, advance } = _getRedirectTargetInfo(tokens, i); | ||
| if (redirectTarget !== null) { | ||
| _pushInlineSubstitutionSegments(innerSegments, redirectTarget); | ||
| } | ||
| i += advance; | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Redirect targets inside grouped subshells (depth > 1) are added as command arguments
The _isRedirectOp guard at line 248 is wrapped in depth === 1, so it only fires for redirects at the outermost level of a $(...) substitution. For commands like echo $( (rm -rf /tmp/foo 2>/dev/null) ), the > operator at depth 2 reaches _getCommandTokenText({op: '>'}), which returns null (no pattern property) and is silently skipped with i++. The redirect target token /dev/null that immediately follows it IS a plain string, so it gets pushed onto currentSegment, producing the segment ['rm', '-rf', '/tmp/foo', '/dev/null'].
In this specific case the false positive is harmless (the rm-rule flags /dev/null as an unrecognised target and blocks conservatively), but the same mechanism means:
echo $( (cmd 2>/dev/null; rm -rf /) )at depth 2 produces['cmd', '/dev/null', 'rm', '-rf', '/']— all concatenated into one segment, so the rule engine sees commandcmdwith/dev/null rm -rf /as arguments and may not fire thermrule.- Any "safe" command inside a grouped subshell with a redirect is flagged because its redirect target appears to be an extra argument.
The fix is to move the _isRedirectOp check (and the accompanying advance) outside the depth === 1 guard, or, simpler, to apply it for depth >= 1:
- if (depth === 1 && _isRedirectOp(token)) {
+ if (_isRedirectOp(token)) {The existing depth-tracking already handles the paren balance correctly, so no other changes are needed.
|
@greptileai review the PR |
Summary
rm -rffalse positive where2>/dev/nullredirect was parsed as positional arguments (rm -rffalse positive:2>/dev/nullredirect parsed as positional arguments #29)$((...)), grouped subshells, positional args, and nested command substitutionsshell-quotesplitting these constructs into unexpected token sequencesChanges
src/core/shell.ts— Major parsing improvements:2>) before redirect operators so they aren't misinterpreted as command arguments$((...))instead of treating inner content as executable commands_getRedirectTargetInfo) that extracts embedded command substitutions from redirect targets$(...)substitution extraction (extractInlineCommandSubstitutions) for tokens containing embedded command substitutionsshell-quote{ pattern }tokens and positional arg tokens ($1,$@, etc.) gracefullytests/core/command.test.ts— Added tests for redirect parsing, nested command substitutions inside redirects, and IO-number strippingtests/core/edge-cases.test.ts— Added tests for arithmetic expansion, grouped subshells, and quoted inline substitutionstests/core/parsing-helpers.test.ts— Added tests forsplitShellCommandscovering positional args, globs with pattern tokens, backtick redirect targets, and escaped content in inline substitutionstests/core/rules-rm.test.ts— Added tests forrm -rfwith redirects (2>/dev/null) and attached IO-number disambiguationtests/core/system-info.test.ts— Increased timeout for doctor system-info testsdist/— Rebuilt distribution bundlesTesting
Related Issue
Fixes #29 —
rm -rffalse positive:2>/dev/nullredirect parsed as positional argumentsSupersedes #30
PR Checklist
bun run checkpasses (lint, types, dead code, rules, tests)package.jsonSummary by CodeRabbit
Refactor
Tests