Conversation
…rsing Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extends step.request_parse to correctly parse application/x-www-form-urlencoded request bodies (common for webhook providers like Twilio) and adds tests for the new parsing behavior.
Changes:
- Add Content-Type dispatch to parse
application/x-www-form-urlencodedbodies viaurl.ParseQuery(with single vs multi-value handling). - Cache raw request body bytes into
pc.Metadata["_raw_body"](currently only on the form path). - Add unit tests covering form parsing, multi-value fields, and
Content-Typeparameters (e.g.charset).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| module/pipeline_step_request_parse.go | Adds form-urlencoded parsing branch and attempts to cache raw body bytes. |
| module/pipeline_step_request_parse_test.go | Adds coverage for form-urlencoded parsing and raw body caching for that path. |
| req, _ := pc.Metadata["_http_request"].(*http.Request) | ||
| if req != nil && req.Body != nil { | ||
| bodyBytes, err := io.ReadAll(req.Body) | ||
| if err == nil && len(bodyBytes) > 0 { | ||
| var bodyData map[string]any | ||
| if json.Unmarshal(bodyBytes, &bodyData) == nil { | ||
| output["body"] = bodyData | ||
| ct := req.Header.Get("Content-Type") |
There was a problem hiding this comment.
RequestParseStep always reads from req.Body when parse_body is enabled, but it never checks for an existing pc.Metadata["_raw_body"]. If another step (e.g. step.webhook_verify) has already read and cached the body, req.Body will be at EOF and this step will fail to populate output["body"]. Prefer using the cached _raw_body when present, and only fall back to reading req.Body when it is not cached.
| if strings.EqualFold(ct, "application/x-www-form-urlencoded") { | ||
| pc.Metadata["_raw_body"] = bodyBytes | ||
| if formValues, parseErr := url.ParseQuery(string(bodyBytes)); parseErr == nil { | ||
| bodyData := make(map[string]any) | ||
| for k, v := range formValues { | ||
| if len(v) == 1 { | ||
| bodyData[k] = v[0] | ||
| } else { | ||
| bodyData[k] = v | ||
| } | ||
| } | ||
| output["body"] = bodyData | ||
| } | ||
| } else { | ||
| var bodyData map[string]any | ||
| if json.Unmarshal(bodyBytes, &bodyData) == nil { | ||
| output["body"] = bodyData | ||
| } |
There was a problem hiding this comment.
The raw body is cached into pc.Metadata["_raw_body"] only for application/x-www-form-urlencoded. For JSON (and any other Content-Type), this step still consumes req.Body and leaves _raw_body unset, which can break downstream steps that rely on the raw body cache (including step.webhook_verify / gitlab_parse_webhook patterns). Cache bodyBytes into _raw_body whenever you read it (before Content-Type dispatch), regardless of parsing success.
| // Raw body should be cached in metadata | ||
| rawBody, ok := pc.Metadata["_raw_body"].([]byte) | ||
| if !ok { | ||
| t.Fatal("expected _raw_body in metadata") | ||
| } | ||
| if string(rawBody) != `Body=Hello&From=%2B15551234567&To=%2B15559876543&MessageSid=SM1234` { | ||
| t.Errorf("unexpected _raw_body: %s", rawBody) | ||
| } | ||
| } |
There was a problem hiding this comment.
There’s no test covering the common “body was already read and cached” flow: when pc.Metadata["_raw_body"] is set (e.g. by step.webhook_verify) and req.Body is empty/consumed, request_parse should still parse from the cached bytes. Adding a test for that scenario would prevent regressions and validate the intended raw-body-cache interoperability.
|
@copilot apply changes based on the comments in this thread |
Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Applied all three review items in commit
|
step.request_parsewithparse_body: trueonly handled JSON bodies — form-encoded payloads (e.g. Twilio webhooks) were silently dropped and the body stream was consumed, breaking downstream steps likestep.webhook_verify.Changes
;parameters fromContent-Typeand branch onapplication/x-www-form-urlencodedvs. the existing JSON pathurl.ParseQuery()to parse the body; single-value fields are exposed asstring, multi-value fields as[]stringpc.Metadata["_raw_body"]whenever the body is freshly read fromreq.Body, regardless of Content-Type — ensuring downstream steps (e.g. HMAC-SHA1 verification) can always access the original payloadpc.Metadata["_raw_body"]is already populated by a prior step (e.g.step.webhook_verify), the cached bytes are used directly instead of reading the (possibly exhausted)req.BodyExample
Given a Twilio POST body
Body=Hello&From=%2B15551234567&To=%2B15559876543:Produces:
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.