Fix double-encoding of PostgreSQL json/jsonb columns; add step.json_parse #297
Merged
5 commits
73cede6 Initial plan (Copilot)
75e898a Fix json/jsonb double-encoding and add step.json_parse (Copilot)
50bca53 Address reviewer feedback: shared helper, pre-check, nil-check on source (Copilot)
bbaef0b Merge branch 'main' into copilot/fix-double-encoding-jsonb-types (intel352)
af59a0d Merge branch 'main' into copilot/fix-double-encoding-jsonb-types (intel352)
@@ -0,0 +1,41 @@

```go
package module

import (
	"bytes"
	"encoding/json"
)

// parseJSONBytesOrString attempts to unmarshal b as JSON. If successful the
// parsed Go value is returned (map[string]any, []any, string, float64, bool,
// or nil). This transparently handles PostgreSQL json/jsonb columns, which the
// pgx driver delivers as raw JSON bytes rather than pre-typed Go values.
//
// A cheap leading-byte pre-check is applied first so that binary blobs (e.g.
// PostgreSQL bytea) skip the full JSON parser entirely and fall back to
// string conversion without incurring unnecessary CPU overhead.
//
// If b is not valid JSON (e.g. PostgreSQL bytea binary data), string(b) is
// returned so that the existing string-fallback behaviour is preserved.
func parseJSONBytesOrString(b []byte) any {
	if len(b) == 0 {
		return string(b)
	}
	// Quick check: JSON must start with one of these characters (after optional
	// whitespace). Anything else is definitely not JSON and we avoid calling the
	// full decoder on large binary blobs.
	trimmed := bytes.TrimLeft(b, " \t\r\n")
	if len(trimmed) == 0 {
		return string(b)
	}
	first := trimmed[0]
	if first != '{' && first != '[' && first != '"' &&
		first != 't' && first != 'f' && first != 'n' &&
		first != '-' && (first < '0' || first > '9') {
		return string(b)
	}
	var v any
	if err := json.Unmarshal(b, &v); err == nil {
		return v
	}
	return string(b)
}
```
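To make the double-encoding problem in the PR title concrete, here is a minimal sketch, not code from this PR. `encodeRow` and `decodeColumn` are hypothetical names; `decodeColumn` is a simplified stand-in for `parseJSONBytesOrString` that omits the leading-byte pre-check. It compares what a downstream JSON encoder emits when a jsonb column is kept as a string with what it emits when the column is parsed first.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRow (hypothetical) marshals a one-column row the way a later
// pipeline step might when building a JSON response.
func encodeRow(col any) string {
	out, err := json.Marshal(map[string]any{"meta": col})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

// decodeColumn (hypothetical) is a simplified stand-in for
// parseJSONBytesOrString: parse as JSON, else fall back to string(b).
// The leading-byte pre-check from the PR is intentionally left out.
func decodeColumn(b []byte) any {
	var v any
	if err := json.Unmarshal(b, &v); err == nil {
		return v
	}
	return string(b)
}

func main() {
	raw := []byte(`{"a":1}`) // bytes as a driver might deliver a jsonb column

	// Old behaviour: the column becomes a Go string and is escaped again.
	fmt.Println(encodeRow(string(raw))) // {"meta":"{\"a\":1}"}

	// New behaviour: the column is parsed first and nests as an object.
	fmt.Println(encodeRow(decodeColumn(raw))) // {"meta":{"a":1}}
}
```

Non-JSON input such as bytea data still comes back as a string in this sketch, which mirrors the fallback described in the doc comment above.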
@@ -0,0 +1,82 @@

```go
package module

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/GoCodeAlone/modular"
)

// JSONParseStep parses a JSON string value from the pipeline context into a
// structured Go value (map, slice, etc.) and stores the result as step output.
//
// This is useful when a pipeline step (e.g. step.db_query against a legacy
// driver, or step.http_call) returns a JSON column/field as a raw string rather
// than as a pre-parsed Go type. It is the explicit counterpart to the automatic
// json/jsonb detection that step.db_query performs for the pgx driver.
//
// Configuration:
//
//	source: "steps.fetch.row.json_column" # dot-path to the JSON string value (required)
//	target: "parsed_data"                 # output key name (optional, defaults to "value")
type JSONParseStep struct {
	name   string
	source string
	target string
}

// NewJSONParseStepFactory returns a StepFactory that creates JSONParseStep instances.
func NewJSONParseStepFactory() StepFactory {
	return func(name string, config map[string]any, _ modular.Application) (PipelineStep, error) {
		source, _ := config["source"].(string)
		if source == "" {
			return nil, fmt.Errorf("json_parse step %q: 'source' is required", name)
		}

		target, _ := config["target"].(string)
		if target == "" {
			target = "value"
		}

		return &JSONParseStep{
			name:   name,
			source: source,
			target: target,
		}, nil
	}
}

// Name returns the step name.
func (s *JSONParseStep) Name() string { return s.name }

// Execute resolves the source path, parses the value as JSON if it is a string,
// and stores the result under the configured target key.
func (s *JSONParseStep) Execute(_ context.Context, pc *PipelineContext) (*StepResult, error) {
	raw := resolveBodyFrom(s.source, pc)
	if raw == nil {
		return nil, fmt.Errorf("json_parse step %q: source %q not found or resolved to nil", s.name, s.source)
	}

	var parsed any
	switch v := raw.(type) {
	case string:
		if err := json.Unmarshal([]byte(v), &parsed); err != nil {
			return nil, fmt.Errorf("json_parse step %q: failed to parse JSON from %q: %w", s.name, s.source, err)
		}
	case []byte:
		if err := json.Unmarshal(v, &parsed); err != nil {
			return nil, fmt.Errorf("json_parse step %q: failed to parse JSON bytes from %q: %w", s.name, s.source, err)
		}
	default:
		// Value is already a structured type (map, slice, number, bool, nil).
		// Pass it through unchanged so that pipelines are idempotent when the
		// upstream step already returns a parsed value (e.g. after the db_query
		// fix lands, json_parse is a no-op for json/jsonb columns).
		parsed = raw
	}

	return &StepResult{Output: map[string]any{
		s.target: parsed,
	}}, nil
}
```
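The type switch in `Execute` can be tried without the pipeline types. The sketch below is not part of the PR: `parseValue` is a hypothetical free function that mirrors the same three branches (string, `[]byte`, everything else) plus the nil check, so the pass-through behaviour for already-parsed values is easy to see.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseValue (hypothetical) mirrors the branches of JSONParseStep.Execute
// without PipelineContext or StepResult.
func parseValue(raw any) (any, error) {
	if raw == nil {
		return nil, errors.New("source not found or resolved to nil")
	}
	var parsed any
	switch v := raw.(type) {
	case string:
		if err := json.Unmarshal([]byte(v), &parsed); err != nil {
			return nil, fmt.Errorf("failed to parse JSON: %w", err)
		}
	case []byte:
		if err := json.Unmarshal(v, &parsed); err != nil {
			return nil, fmt.Errorf("failed to parse JSON bytes: %w", err)
		}
	default:
		// Already structured: pass through unchanged.
		parsed = raw
	}
	return parsed, nil
}

func main() {
	first, _ := parseValue(`{"tags":["a","b"]}`)
	second, _ := parseValue(first) // second pass is a no-op
	fmt.Println(first, second)

	_, err := parseValue("not json")
	fmt.Println(err != nil) // true: invalid JSON strings are rejected
}
```

Feeding the output back in returns the same value, which is the idempotence the comment in the `default` branch relies on. A string that is not valid JSON is an error rather than a silent fallback, unlike `parseJSONBytesOrString`.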
`json_parse` resolves the source via `resolveBodyFrom`, which returns `nil` for missing/unresolvable paths. That means a typo in `source` will silently produce `{target: nil}` and the step will appear to succeed. Consider using a strict resolver (e.g. build a data map like `JQStep.resolveInput` and call `resolveDottedPath`) or otherwise detecting “path not found” and returning an error so misconfigurations fail fast.
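One possible shape for the "path not found" detection suggested here, as a rough sketch only. `lookupPath` is a hypothetical helper, not the project's `resolveDottedPath`, and it assumes the pipeline data is nested `map[string]any` values. It returns a second `ok` result so a missing key can be told apart from a key that is present with a `nil` value.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupPath (hypothetical) walks a dot-separated path through nested
// map[string]any values. ok is false when any segment is missing or when
// an intermediate value is not a map.
func lookupPath(data map[string]any, path string) (value any, ok bool) {
	var cur any = data
	for _, key := range strings.Split(path, ".") {
		m, isMap := cur.(map[string]any)
		if !isMap {
			return nil, false
		}
		next, found := m[key]
		if !found {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

func main() {
	data := map[string]any{
		"steps": map[string]any{
			"fetch": map[string]any{
				"row": map[string]any{"json_column": `{"a":1}`, "empty": nil},
			},
		},
	}

	_, ok := lookupPath(data, "steps.fetch.row.json_column")
	fmt.Println(ok) // true

	_, ok = lookupPath(data, "steps.fetch.row.json_colum") // typo in the path
	fmt.Println(ok) // false

	v, ok := lookupPath(data, "steps.fetch.row.empty")
	fmt.Println(v == nil, ok) // true true: present, but nil
}
```

With a resolver of this shape, a step could return a "not found" error when `ok` is false and decide separately how to treat a value that is present but `nil`.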