Skip to content

fix(fuzz): return fresh MultiPartForm instance to prevent concurrent map writes#7138

Closed
usernametooshort wants to merge 5 commits intoprojectdiscovery:devfrom
usernametooshort:fix/multipartform-concurrent-map-writes
Closed

fix(fuzz): return fresh MultiPartForm instance to prevent concurrent map writes#7138
usernametooshort wants to merge 5 commits intoprojectdiscovery:devfrom
usernametooshort:fix/multipartform-concurrent-map-writes

Conversation

@usernametooshort
Copy link

@usernametooshort usernametooshort commented Mar 6, 2026

Summary

Fixes #7028 — when fuzzing multipart/form-data bodies across multiple targets concurrently, nuclei crashes with fatal error: concurrent map writes.

Root Cause

MultiPartForm is the only stateful DataFormat implementation — it has:

  • boundary string field (set by ParseBoundary)
  • filesMetadata map[string]FileMetadata (written by Decode)

However, it is registered as a singleton in dataformat.init():

func init() {
    RegisterDataFormat(NewMultiPartForm())  // single instance
}

When processing multiple targets concurrently, each goroutine calls:

executeFuzzingRule → parseBody("multipart/form-data")
  → dataformat.Get("multipart/form-data")  ← returns same singleton
  → singleton.ParseBoundary() → m.boundary = ...
  → singleton.Decode() → m.filesMetadata[key] = ...  ← concurrent writes

This triggers Go runtime fatal error (cannot be recovered).

Fix

Change dataformat.Get() to return a fresh instance for MultiPartFormDataFormat instead of the singleton. All other formats remain singleton (they are stateless).

Testing

After the fix, concurrent fuzzing across multiple targets works without crashes:

dataformat.Get("multipart/form-data") → NewMultiPartForm()  // fresh instance per call

Each goroutine gets its own boundary and filesMetadata, eliminating the race.

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed race conditions in parallel workflow execution and concurrent format processing
    • Improved error handling for GitHub API rate limits during template operations
    • Added early detection of unresolved template variables to prevent encoding errors
    • Enhanced HTTP request caching with proper scheme-based separation

The second goroutine in runWorkflowStep was not cloning ctx.Input
before passing it to runWorkflowStep, unlike the first goroutine
which properly calls ctx.Input.Clone().

The comment on both paths says 'clone the Input so that other parallel
executions won't overwrite the shared variables' but only the first
path was actually doing it.

This causes a race condition where parallel subtemplate executions
can corrupt each other's input data when modifying shared variables.
…ure them

When a template variable like {{contact_id}} is not provided by the user
and gets passed through an encoding function (e.g. base64(rawhash) where
rawhash contains {{contact_id}}), the ContainsUnresolvedVariables check
on the final dumped request fails to detect it — the base64-encoded form
e3tjb250YWN0X2lkfX0= contains no {{...}} markers.

Fix: after EvaluateWithInteractsh builds the variablesMap, check each
variable value for unresolved {{...}} markers before allVars is assembled
and passed to expression evaluation. This catches self-referential
variables (e.g. contact_id: "{{contact_id}}") that were never provided
by the user via -var.

Fixes projectdiscovery#7032
…te install

When nuclei-templates are not installed and the GitHub API returns a
rate limit error (403), the SDK panics because NewNucleiEngineCtx
propagates the error up to the caller.

Fix: detect rate limit errors in FreshInstallIfNotExists and
UpdateIfOutdated, and log a warning instead of returning a fatal error.
This allows the engine to start even without templates (users can retry
later or set GITHUB_TOKEN for higher rate limits).

Fixes projectdiscovery#7118
The project file (-project) uses a hash of the dumped HTTP request as
the cache key. Since http.Request.Dump() produces only the raw HTTP/1.1
request line (e.g. 'GET / HTTP/1.1') without the scheme, requests to
http://example.com and https://example.com produce identical cache keys.

This causes HTTPS responses to be served for HTTP requests (and vice
versa) when using the project cache, leading to false positives.

Fix: prefix the dumped request with the full URL (including scheme)
before hashing, so http:// and https:// produce distinct cache entries.

Fixes projectdiscovery#6866
…map writes

MultiPartForm has stateful fields (boundary string + filesMetadata map).
When registered as a singleton in dataformat.init() and returned by
dataformat.Get(), multiple goroutines processing different targets
concurrently write to the same map, triggering:
  fatal error: concurrent map writes

Fix: check if the requested format is multipart/form-data in Get() and
return NewMultiPartForm() instead of the singleton. All other formats
remain singleton (they are stateless).

Fixes projectdiscovery#7028
@auto-assign auto-assign bot requested a review from dwisiswant0 March 6, 2026 09:35
@neo-by-projectdiscovery-dev
Copy link

neo-by-projectdiscovery-dev bot commented Mar 6, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Changes MultiPartForm from singleton to fresh-instance pattern to prevent concurrent map writes during multi-target fuzzing
  • Ensures proper data isolation by giving each goroutine its own boundary and filesMetadata map
  • Maintains existing input validation including RFC-2046 boundary length enforcement (70 char limit)
Hardening Notes
  • The root cause (concurrent map writes) triggers Go runtime fatal errors — this is a reliability/stability issue rather than an exploitable security vulnerability
  • The fix creates a new MultiPartForm instance per call to dataformat.Get(). These instances are short-lived (created for parsing one request, then garbage collected), so resource exhaustion is not a concern
  • All other DataFormat implementations (JSON, XML, Form, Raw) remain as singletons since they have no mutable state
  • The boundary validation at multipart.go:153-155 correctly enforces RFC-2046's 70-character limit, preventing potential DoS from oversized boundaries
  • The Decode() method at multipart.go:170 uses a fixed 32MB memory limit for parsing, which provides protection against memory exhaustion attacks

Comment @pdneo help for available commands. · Open in Neo

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

Walkthrough

This PR addresses concurrency issues in multipart form-data fuzzing by creating fresh instances per goroutine instead of reusing singletons, adds graceful rate limit error handling for template operations, introduces early validation for unresolved template variables, and improves HTTP request caching by including URL scheme information.

Changes

Cohort / File(s) Summary
Concurrency & Data Isolation
pkg/core/workflow_execute.go, pkg/fuzz/dataformat/dataformat.go
Input cloning for parallel execution contexts and fresh multipart/form-data instantiation prevent concurrent map writes and race conditions when processing multiple targets simultaneously.
Error Handling
pkg/installer/template.go
New isRateLimitError() helper detects GitHub API rate limits; FreshInstallIfNotExists and UpdateIfOutdated now treat rate limit errors as non-fatal, logging warnings and enabling retries instead of failing immediately.
Validation & Caching
pkg/protocols/http/build_request.go, pkg/protocols/http/utils.go
Early detection of unresolved template variables before encoding prevents silent failures; HTTP request dumps now include full URL with scheme for proper cache-key separation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Cloning inputs, fresh forms in flight,
Concurrency crashes? Now held tight!
Rate limits soften with graceful care,
Variables checked before they're laid bare,
Cache keys bloom with scheme-aware might!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR contains multiple changes unrelated to the MultiPartForm singleton issue: workflow execution context cloning, HTTP request cache-key changes, unresolved variable detection, rate limit handling in template installation, and GitHub API rate limit error detection. Move unrelated changes (workflow_execute.go, http/utils.go, http/build_request.go, installer/template.go) to separate focused PRs to keep this fix isolated and reviewable.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the primary fix in the changeset: returning a fresh MultiPartForm instance to prevent concurrent map writes during fuzzing.
Linked Issues check ✅ Passed The PR addresses issue #7028's objectives: identifies MultiPartForm as stateful, implements the fix to return fresh instances per-call, prevents concurrent writes to boundary and filesMetadata, and enables concurrent fuzzing across multiple targets.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/dataformat/dataformat.go`:
- Around line 41-49: Get() was updated to return a fresh MultiPartForm but
Encode() still uses the package-level registry entry
(dataformats[dataformat].Encode), allowing shared mutation; change Encode to
call Get(dataformat) and then call Encode on that returned DataFormat (so
stateful formats like MultiPartForm returned by NewMultiPartForm are per-call
instances), ensuring you reference the registry via Get rather than accessing
the dataformats map directly (update any Encode wrapper or method that currently
uses dataformats[...] to use Get(...).Encode(...)).

In `@pkg/installer/template.go`:
- Around line 28-35: The isRateLimitError helper currently treats any error
containing "403" as a rate-limit event which is too broad; update
isRateLimitError to detect rate limits more precisely by either type-asserting
the error to the API client's rate-limit error type (e.g.,
*github.RateLimitError) if available or by matching a specific rate-limit
message substring such as "rate limit exceeded" or "API rate limit exceeded" (do
not rely on the generic "403" match). Ensure the function still returns false
for ordinary 403/forbidden errors and only returns true for genuine rate-limit
indications.
- Around line 94-98: The early return in installTemplatesAt when
isRateLimitError(err) is true leaves the pre-created TemplatesDirectory on disk,
causing FreshInstallIfNotExists to believe installation succeeded; modify the
isRateLimitError branch inside installTemplatesAt to remove the partially
created TemplatesDirectory (e.g., call os.RemoveAll on the TemplatesDirectory
path or otherwise mark it incomplete) before returning, and log any cleanup
error via gologger.Warning/Error; ensure the same cleanup behavior is applied to
other early-return error paths in installTemplatesAt so FreshInstallIfNotExists
will retry installation later.

In `@pkg/protocols/http/build_request.go`:
- Around line 216-228: The unresolved-variable check runs before constants are
merged, causing preprocessor-backed vars to be falsely flagged; move or repeat
the merge of r.request.options.Constants into the variables map (or build the
full allVars set) prior to running the loop that calls
expressions.ContainsUnresolvedVariables on variablesMap; specifically ensure the
constants from r.request.options.Constants (or r.options.Constants if that name
is used) are applied to variablesMap/allVars before the check in
build_request.go so that variables like {{randstr}} are resolved when inspected,
and keep the existing SkipVariablesCheck guard (r.request.SkipVariablesCheck)
intact.

In `@pkg/protocols/http/utils.go`:
- Around line 23-28: The raw-request code path fails to prefix dumps with the
full URL, causing http/https cache key collisions; update the logic in the
function that builds `bin` so both branches prepend the full URL (as done with
`if fullURL := req.request.String(); fullURL != "" { bin =
append([]byte(fullURL+"\n"), bin...) }`) to the bytes produced by
`rawhttp.DumpRequestRaw(...)` (and any other dump path) before returning; locate
uses of `req.request`, `rawhttp.DumpRequestRaw`, and the `bin` variable and
ensure the same fullURL prefixing is applied for raw requests as for
retryablehttp requests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f570c6c3-2358-4e25-b8d0-517bb241a4c8

📥 Commits

Reviewing files that changed from the base of the PR and between 884d0b5 and 2976b95.

📒 Files selected for processing (5)
  • pkg/core/workflow_execute.go
  • pkg/fuzz/dataformat/dataformat.go
  • pkg/installer/template.go
  • pkg/protocols/http/build_request.go
  • pkg/protocols/http/utils.go

Comment on lines +41 to 49
// Get returns the dataformat by name.
// For stateful formats like multipart/form-data, returns a fresh instance to avoid
// concurrent map writes when multiple goroutines process different targets.
func Get(name string) DataFormat {
// MultiPartForm is stateful (boundary + filesMetadata map) — return new instance
if name == MultiPartFormDataFormat {
return NewMultiPartForm()
}
return dataformats[name]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

multipart/form-data is still shared through the package-level Encode path.

Get() now returns a fresh MultiPartForm, but Encode() still does dataformats[dataformat].Encode(data) against the singleton stored in the registry. Any caller using Encode(data, MultiPartFormDataFormat) can still mutate the same stateful instance across goroutines, so this fix is incomplete.

Suggested fix
 func Encode(data KV, dataformat string) (string, error) {
 	if dataformat == "" {
 		return "", errors.New("dataformat is required")
 	}
-	if encoder, ok := dataformats[dataformat]; ok {
-		return encoder.Encode(data)
+	if encoder := Get(dataformat); encoder != nil {
+		return encoder.Encode(data)
 	}
 	return "", fmt.Errorf("dataformat %s is not supported", dataformat)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/dataformat/dataformat.go` around lines 41 - 49, Get() was updated to
return a fresh MultiPartForm but Encode() still uses the package-level registry
entry (dataformats[dataformat].Encode), allowing shared mutation; change Encode
to call Get(dataformat) and then call Encode on that returned DataFormat (so
stateful formats like MultiPartForm returned by NewMultiPartForm are per-call
instances), ensuring you reference the registry via Get rather than accessing
the dataformats map directly (update any Encode wrapper or method that currently
uses dataformats[...] to use Get(...).Encode(...)).

Comment on lines +28 to +35
// isRateLimitError checks if the error is related to GitHub API rate limiting
func isRateLimitError(err error) bool {
if err == nil {
return false
}
errStr := err.Error()
return strings.Contains(errStr, "rate limit") || strings.Contains(errStr, "403")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

"403" is too broad for rate-limit detection.

This helper now treats any error string containing 403 as a rate-limit event. That will also classify ordinary forbidden/auth failures as rate-limited, and the fresh-install path below will silently convert those real install failures into success.

Suggested fix
 func isRateLimitError(err error) bool {
 	if err == nil {
 		return false
 	}
-	errStr := err.Error()
-	return strings.Contains(errStr, "rate limit") || strings.Contains(errStr, "403")
+	errStr := strings.ToLower(err.Error())
+	return strings.Contains(errStr, "rate limit")
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// isRateLimitError checks if the error is related to GitHub API rate limiting
func isRateLimitError(err error) bool {
if err == nil {
return false
}
errStr := err.Error()
return strings.Contains(errStr, "rate limit") || strings.Contains(errStr, "403")
}
// isRateLimitError checks if the error is related to GitHub API rate limiting
func isRateLimitError(err error) bool {
if err == nil {
return false
}
errStr := strings.ToLower(err.Error())
return strings.Contains(errStr, "rate limit")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/installer/template.go` around lines 28 - 35, The isRateLimitError helper
currently treats any error containing "403" as a rate-limit event which is too
broad; update isRateLimitError to detect rate limits more precisely by either
type-asserting the error to the API client's rate-limit error type (e.g.,
*github.RateLimitError) if available or by matching a specific rate-limit
message substring such as "rate limit exceeded" or "API rate limit exceeded" (do
not rely on the generic "403" match). Ensure the function still returns false
for ordinary 403/forbidden errors and only returns true for genuine rate-limit
indications.

Comment on lines +94 to +98
if isRateLimitError(err) {
gologger.Warning().Msgf("GitHub API rate limit hit while installing templates: %v", err)
gologger.Warning().Msgf("Run again later or set GITHUB_TOKEN env variable for higher rate limits")
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Returning nil here leaves a sticky partial install.

installTemplatesAt() creates TemplatesDirectory before downloading. If the download then hits this branch, the directory remains on disk and Line 89 will make every later FreshInstallIfNotExists() call think installation already succeeded, so the install is never retried automatically.

Suggested fix
 	if err := t.installTemplatesAt(config.DefaultConfig.TemplatesDirectory); err != nil {
 		if isRateLimitError(err) {
+			_ = os.RemoveAll(config.DefaultConfig.TemplatesDirectory)
 			gologger.Warning().Msgf("GitHub API rate limit hit while installing templates: %v", err)
 			gologger.Warning().Msgf("Run again later or set GITHUB_TOKEN env variable for higher rate limits")
-			return nil
+			return errkit.Wrap(err, "template installation deferred by GitHub API rate limit")
 		}
 		return errkit.Wrapf(err, "failed to install templates at %s", config.DefaultConfig.TemplatesDirectory)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if isRateLimitError(err) {
gologger.Warning().Msgf("GitHub API rate limit hit while installing templates: %v", err)
gologger.Warning().Msgf("Run again later or set GITHUB_TOKEN env variable for higher rate limits")
return nil
}
if isRateLimitError(err) {
_ = os.RemoveAll(config.DefaultConfig.TemplatesDirectory)
gologger.Warning().Msgf("GitHub API rate limit hit while installing templates: %v", err)
gologger.Warning().Msgf("Run again later or set GITHUB_TOKEN env variable for higher rate limits")
return errkit.Wrap(err, "template installation deferred by GitHub API rate limit")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/installer/template.go` around lines 94 - 98, The early return in
installTemplatesAt when isRateLimitError(err) is true leaves the pre-created
TemplatesDirectory on disk, causing FreshInstallIfNotExists to believe
installation succeeded; modify the isRateLimitError branch inside
installTemplatesAt to remove the partially created TemplatesDirectory (e.g.,
call os.RemoveAll on the TemplatesDirectory path or otherwise mark it
incomplete) before returning, and log any cleanup error via
gologger.Warning/Error; ensure the same cleanup behavior is applied to other
early-return error paths in installTemplatesAt so FreshInstallIfNotExists will
retry installation later.

Comment on lines +216 to +228
// Check for unresolved template variables BEFORE they get encoded by helper
// functions (e.g. base64). Once encoded, ContainsUnresolvedVariables cannot
// detect them in the final dumped request. See: github.com/projectdiscovery/nuclei/issues/7032
if !r.request.SkipVariablesCheck {
for varName, varValue := range variablesMap {
varStr := types.ToString(varValue)
if varErr := expressions.ContainsUnresolvedVariables(varStr); varErr != nil {
gologger.Warning().Msgf("[%s] Template variable \"%s\" has unresolved references: %v (use -var %s=<value>)\n",
r.request.options.TemplateID, varName, varErr, varName)
return nil, ErrMissingVars
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This validation runs before constants are available.

At Line 230, r.options.Constants are merged only after this loop. That means variables backed by preprocessors/constants — e.g. values containing {{randstr}} — will now be flagged as unresolved here even though they would resolve in the normal allVars evaluation path.

Suggested fix
-	// Check for unresolved template variables BEFORE they get encoded by helper
-	// functions (e.g. base64). Once encoded, ContainsUnresolvedVariables cannot
-	// detect them in the final dumped request. See: github.com/projectdiscovery/nuclei/issues/7032
-	if !r.request.SkipVariablesCheck {
-		for varName, varValue := range variablesMap {
-			varStr := types.ToString(varValue)
-			if varErr := expressions.ContainsUnresolvedVariables(varStr); varErr != nil {
-				gologger.Warning().Msgf("[%s] Template variable \"%s\" has unresolved references: %v (use -var %s=<value>)\n",
-					r.request.options.TemplateID, varName, varErr, varName)
-				return nil, ErrMissingVars
-			}
-		}
-	}
-	// allVars contains all variables from all sources
-	allVars := generators.MergeMaps(dynamicValues, defaultReqVars, optionVars, variablesMap, r.options.Constants)
+	// allVars contains all variables from all sources
+	allVars := generators.MergeMaps(dynamicValues, defaultReqVars, optionVars, variablesMap, r.options.Constants)
+	// Check for unresolved template variables before helper functions encode them,
+	// but after constants/preprocessors are available.
+	if !r.request.SkipVariablesCheck {
+		for varName, varValue := range variablesMap {
+			varStr := types.ToString(varValue)
+			if evaluated, evalErr := expressions.Evaluate(varStr, allVars); evalErr == nil {
+				varStr = evaluated
+			}
+			if varErr := expressions.ContainsUnresolvedVariables(varStr); varErr != nil {
+				gologger.Warning().Msgf("[%s] Template variable \"%s\" has unresolved references: %v (use -var %s=<value>)\n",
+					r.request.options.TemplateID, varName, varErr, varName)
+				return nil, ErrMissingVars
+			}
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/protocols/http/build_request.go` around lines 216 - 228, The
unresolved-variable check runs before constants are merged, causing
preprocessor-backed vars to be falsely flagged; move or repeat the merge of
r.request.options.Constants into the variables map (or build the full allVars
set) prior to running the loop that calls
expressions.ContainsUnresolvedVariables on variablesMap; specifically ensure the
constants from r.request.options.Constants (or r.options.Constants if that name
is used) are applied to variablesMap/allVars before the check in
build_request.go so that variables like {{randstr}} are resolved when inspected,
and keep the existing SkipVariablesCheck guard (r.request.SkipVariablesCheck)
intact.

Comment on lines +23 to +28
// Prefix with the full URL so scheme (http vs https) is part of
// the project-file cache key and responses are not shared across
// different schemes for the same host.
if fullURL := req.request.String(); fullURL != "" {
bin = append([]byte(fullURL+"\n"), bin...)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the utils.go file and read the dump() function
find . -name "utils.go" -path "*/pkg/protocols/http/*" | head -5

Repository: projectdiscovery/nuclei

Length of output: 96


🏁 Script executed:

# Read the dump() function from utils.go to see the full context
if [ -f "pkg/protocols/http/utils.go" ]; then
  wc -l pkg/protocols/http/utils.go
fi

Repository: projectdiscovery/nuclei

Length of output: 97


🏁 Script executed:

# Get the full dump() function to understand the context
if [ -f "pkg/protocols/http/utils.go" ]; then
  grep -n "func dump" pkg/protocols/http/utils.go
fi

Repository: projectdiscovery/nuclei

Length of output: 135


🏁 Script executed:

# Read the entire utils.go file to see the dump() function in full context
cat -n pkg/protocols/http/utils.go

Repository: projectdiscovery/nuclei

Length of output: 1854


🏁 Script executed:

# Now let me verify the claim about rawhttp.DumpRequestRaw behavior
# Search for rawhttp usage and check if there's any documentation or tests
rg "DumpRequestRaw" --type go -A 3 -B 3

Repository: projectdiscovery/nuclei

Length of output: 711


🏁 Script executed:

# Check if there are any tests or examples showing what DumpRequestRaw returns
find . -name "*.go" -type f | xargs grep -l "DumpRequestRaw" | head -10

Repository: projectdiscovery/nuclei

Length of output: 96


🏁 Script executed:

# Let's check the go.mod to see what version of rawhttp is used
if [ -f "go.mod" ]; then
  grep "rawhttp" go.mod
fi

Repository: projectdiscovery/nuclei

Length of output: 111


🏁 Script executed:

# Search for any tests or usage of the dump function to understand expected behavior
rg "func dump" -A 20 pkg/protocols/http/

Repository: projectdiscovery/nuclei

Length of output: 3452


🏁 Script executed:

# Check if there are integration tests or examples that show the behavior
find . -name "*_test.go" -path "*/protocols/http/*" -type f

Repository: projectdiscovery/nuclei

Length of output: 389


🏁 Script executed:

# Check the tests to see if there's verification of caching behavior
grep -n "dump\|cache\|project" pkg/protocols/http/build_request_test.go | head -20

Repository: projectdiscovery/nuclei

Length of output: 639


🏁 Script executed:

# Let's search for where dump() is called to understand how it's used for caching
rg "dump\(" --type go -B 2 -A 2 | grep -v test | head -30

Repository: projectdiscovery/nuclei

Length of output: 1816


🏁 Script executed:

# Search for any cache-related code that uses the dump function
rg "cache" -i --type go pkg/protocols/http/ | head -20

Repository: projectdiscovery/nuclei

Length of output: 2222


🏁 Script executed:

# Let's look at where dumpedRequest is used - specifically for caching
rg "dumpedRequest" --type go -B 3 -A 3

Repository: projectdiscovery/nuclei

Length of output: 7724


🏁 Script executed:

# Search for project-file or caching logic that uses this dump
rg "project\|Project" --type go pkg/protocols/http/ | grep -i cache | head -10

Repository: projectdiscovery/nuclei

Length of output: 49


🏁 Script executed:

# Let me check where dumpedRequest is stored/used for cache keys
grep -n "dumpedRequest" pkg/protocols/http/request.go

Repository: projectdiscovery/nuclei

Length of output: 1771


🏁 Script executed:

# Now let's verify the key claim: does rawhttp.DumpRequestRaw include scheme info?
# Let me check if there are any examples or tests that show what rawhttp.DumpRequestRaw returns
rg "rawhttp\\.DumpRequestRaw" -B 5 -A 5 --type go | head -40

Repository: projectdiscovery/nuclei

Length of output: 892


🏁 Script executed:

# Let me search for the rawhttp package source or documentation
# First check if it's vendored or imported
find . -path "*/rawhttp*" -type d 2>/dev/null | head -10

Repository: projectdiscovery/nuclei

Length of output: 49


🏁 Script executed:

# Let me check the go.mod file more carefully to understand the dependency
cat go.mod | grep -A 2 -B 2 "rawhttp"

Repository: projectdiscovery/nuclei

Length of output: 305


🏁 Script executed:

# Let me check if there are any tests that show the output of dump() for raw requests
rg "rawhttp\|DumpRequestRaw" -B 10 -A 10 --type go pkg/protocols/http/raw/ 2>/dev/null | head -50

Repository: projectdiscovery/nuclei

Length of output: 49


🏁 Script executed:

# Check if there are tests related to project caching or dump functionality
find . -name "*.go" -path "*/http/*" -exec grep -l "ProjectFile\|dump" {} \; | head -10

Repository: projectdiscovery/nuclei

Length of output: 203


🏁 Script executed:

# Let me look at the raw request tests more specifically
if [ -f "pkg/protocols/http/raw/raw_test.go" ]; then
  head -100 pkg/protocols/http/raw/raw_test.go
fi

Repository: projectdiscovery/nuclei

Length of output: 5027


🏁 Script executed:

# Let's verify the key claim about DumpRequestRaw by checking the actual function calls and their behavior
# Look at the request.go to see exactly how dumpedRequest is used for caching
sed -n '815p;965p' pkg/protocols/http/request.go

Repository: projectdiscovery/nuclei

Length of output: 231


🏁 Script executed:

# Let me also verify that the code doesn't prefix the raw request branch by looking at lines 30-36 more carefully
sed -n '30,36p' pkg/protocols/http/utils.go

Repository: projectdiscovery/nuclei

Length of output: 514


🏁 Script executed:

# Let me verify the exact issue by understanding what happens in both branches
# Branch 1: retryablehttp (lines 17-29)
echo "=== Branch 1: retryablehttp ==="
sed -n '17,29p' pkg/protocols/http/utils.go

echo ""
echo "=== Branch 2: rawhttp (lines 31-36) ==="
sed -n '31,36p' pkg/protocols/http/utils.go

Repository: projectdiscovery/nuclei

Length of output: 1144


🏁 Script executed:

# Let's check the comment at the top of the dump function to confirm intent
sed -n '12,15p' pkg/protocols/http/utils.go

Repository: projectdiscovery/nuclei

Length of output: 319


The scheme split is only fixed for the retryablehttp branch.

The function is documented to prefix all dumps with the full URL to ensure http:// and https:// requests produce distinct cache keys when used with -project, but this is only implemented for retryablehttp requests (when req.request != nil). The raw-request path returns rawhttp.DumpRequestRaw(...) without prefixing, allowing http and https requests to collide in the project cache.

Suggested fix
 func dump(req *generatedRequest, reqURL string) ([]byte, error) {
+	prefixURL := func(bin []byte, fullURL string) []byte {
+		if fullURL == "" {
+			return bin
+		}
+		return append([]byte(fullURL+"\n"), bin...)
+	}
+
 	if req.request != nil {
 		// Use a clone to avoid a race condition with the http transport
 		bin, err := req.request.Clone(req.request.Context()).Dump()
 		if err != nil {
 			return nil, errkit.Wrapf(err, "could not dump request: %v", req.request.String())
 		}
-		// Prefix with the full URL so scheme (http vs https) is part of
-		// the project-file cache key and responses are not shared across
-		// different schemes for the same host.
-		if fullURL := req.request.String(); fullURL != "" {
-			bin = append([]byte(fullURL+"\n"), bin...)
-		}
-		return bin, nil
+		return prefixURL(bin, req.request.String()), nil
 	}
 	rawHttpOptions := &rawhttp.Options{CustomHeaders: req.rawRequest.UnsafeHeaders, CustomRawBytes: req.rawRequest.UnsafeRawBytes}
 	bin, err := rawhttp.DumpRequestRaw(req.rawRequest.Method, reqURL, req.rawRequest.Path, generators.ExpandMapValues(req.rawRequest.Headers), io.NopCloser(strings.NewReader(req.rawRequest.Data)), rawHttpOptions)
 	if err != nil {
 		return nil, errkit.Wrapf(err, "could not dump request: %v", reqURL)
 	}
-	return bin, nil
+	return prefixURL(bin, reqURL), nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/protocols/http/utils.go` around lines 23 - 28, The raw-request code path
fails to prefix dumps with the full URL, causing http/https cache key
collisions; update the logic in the function that builds `bin` so both branches
prepend the full URL (as done with `if fullURL := req.request.String(); fullURL
!= "" { bin = append([]byte(fullURL+"\n"), bin...) }`) to the bytes produced by
`rawhttp.DumpRequestRaw(...)` (and any other dump path) before returning; locate
uses of `req.request`, `rawhttp.DumpRequestRaw`, and the `bin` variable and
ensure the same fullURL prefixing is applied for raw requests as for
retryablehttp requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fatal error: concurrent map writes in MultiPartForm.Decode during multipart/form-data fuzzing

2 participants