feat(openapi): upgrade OpenAPI FDW to v0.2.0 with modular architecture #573
codybrom wants to merge 9 commits into supabase:main
Replace v0.1.4 monolithic codebase with v0.2.0 refactored modules: config, request, response, pagination, column_matching, spec, schema. New features: POST-for-read endpoints, spec_json inline specs, LIMIT-to-page_size pushdown, api_key_location (query/cookie), debug mode, max_pages/max_response_bytes safety limits, OpenAPI 3.1 support. Includes 518 unit tests, benchmarks, 5 real-world examples (NWS, CarAPI, PokeAPI, GitHub, Threads), Docker-based integration test infrastructure with 113 assertions, and performance analysis docs.
Rewrite README with clearer features list, honest performance section comparing FDW vs pg_http (DX tradeoff with SQL examples and end-to-end benchmarks), and move limitations up for visibility. Consolidate PERFORMANCE.md into README. Update benchmark script to measure full read-to-write lifecycle (INSERT INTO) instead of PERFORM. Fix tabbed content indentation in catalog docs for pymdownx.tabbed rendering.
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough (Summary by CodeRabbit)
This PR releases OpenAPI FDW v0.2.0. It adds POST-for-Read, inline spec import via …
Sequence Diagram(s)
sequenceDiagram
participant Postgres as Postgres (FDW caller)
participant Wasm as WASM_FDW
participant SpecSrc as Spec_Source (spec_url / spec_json)
participant API as External_API
Postgres->>Wasm: begin_scan(query, quals, LIMIT, options)
alt spec not loaded
Wasm->>SpecSrc: fetch_spec()
SpecSrc-->>Wasm: OpenAPI spec (JSON/YAML)
end
Wasm->>Wasm: build_request(endpoint, method, substitute_path_params, build_query_params, inject_auth)
Wasm->>API: HTTP request(url, headers, body)
API-->>Wasm: HTTP response (status, headers, body)
Wasm->>Wasm: extract_data(response, response_path, wrapper_keys)
Wasm->>Wasm: handle_pagination(detect next token, update PaginationState)
alt more pages & within limits
loop fetch next pages
Wasm->>API: HTTP request(next_url or cursor)
API-->>Wasm: HTTP response
Wasm->>Wasm: extract_data(...)
Wasm->>Wasm: detect_loop / check max_pages / max_response_bytes
end
end
Wasm->>Postgres: return rows (typed, attrs)
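The guard step in the diagram (detect_loop / check max_pages / max_response_bytes) could look roughly like the following. This is a minimal sketch, not the PR's code: `PaginationGuard`, `Stop`, and `record_page` are hypothetical names, and it assumes `max_response_bytes` caps each response individually rather than the running total.

```rust
use std::collections::HashSet;

/// Why pagination stopped (hypothetical enum, for illustration only).
#[derive(Debug, PartialEq)]
enum Stop {
    NoMorePages,
    MaxPagesReached,
    ResponseTooLarge,
    LoopDetected,
}

/// Tracks the safety limits while pages are being fetched.
struct PaginationGuard {
    max_pages: usize,
    max_response_bytes: usize, // assumption: applied per response
    pages_fetched: usize,
    seen_tokens: HashSet<String>,
}

impl PaginationGuard {
    fn new(max_pages: usize, max_response_bytes: usize) -> Self {
        Self {
            max_pages,
            max_response_bytes,
            pages_fetched: 0,
            seen_tokens: HashSet::new(),
        }
    }

    /// Record one fetched page. Returns the token to request next,
    /// or the reason pagination must stop.
    fn record_page(&mut self, body_len: usize, next_token: Option<&str>) -> Result<String, Stop> {
        if body_len > self.max_response_bytes {
            return Err(Stop::ResponseTooLarge);
        }
        self.pages_fetched += 1;
        // A missing or empty token means the API has no further pages.
        let token = match next_token {
            Some(t) if !t.is_empty() => t,
            _ => return Err(Stop::NoMorePages),
        };
        if self.pages_fetched >= self.max_pages {
            return Err(Stop::MaxPagesReached);
        }
        // A token seen before means the API is cycling.
        if !self.seen_tokens.insert(token.to_string()) {
            return Err(Stop::LoopDetected);
        }
        Ok(token.to_string())
    }
}
```

With `max_pages = 2`, the first page returns its cursor and the second page stops with `Stop::MaxPagesReached` even if the API offers another cursor.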
Remove [profile.release] (strip, lto) from the shared wasm-wrappers workspace Cargo.toml — these affect all wasm FDWs, not just openapi. Revert Cargo.lock to match main. Minor README updates.
Remove benches/fdw_benchmarks.rs and the criterion dev-dependency. These benchmarks tested re-implemented copies of FDW logic rather than actual code, added ~38 transitive dependencies, and caused build errors on wasm targets. The SQL-level benchmark script (test/benchmark.sh) provides meaningful end-to-end performance analysis.
Unit tests and clippy can't run on wasm32-unknown-unknown since there's no runtime to execute the binary. Auto-detect the host target via rustc so make test and make clippy work out of the box on any platform.
…vements
Add YAML spec parsing via serde_yaml_ng so spec_url accepts both JSON and YAML OpenAPI specs. Many APIs only publish YAML, so this makes the FDW work out of the box with more APIs.
Also addresses PR review items:
- Replace deprecated serde_yaml with serde_yaml_ng
- debug_assert! -> assert! in this_mut() for release safety
- Header deduplication prevents duplicate content-type/authorization
- Empty/whitespace credentials filtered with warning
- Retry on 502/503 in addition to 429, with status-specific hints
- RowsOut stats now count rows consumed by PG, not just fetched
- Validate max_pages >= 1
- base_url validation for spec-derived server URLs
- Improved error messages (show both JSON and YAML parse errors)
Example updates:
- All 5 examples get IMPORT FOREIGN SCHEMA as section 1
- New import servers with spec_url (or spec_json for Threads)
- Threads example shows CREATE SERVER with inline spec_json
- PokeAPI highlights YAML spec support
Pull request overview
Upgrades the OpenAPI FDW to v0.2.0, restructuring it into focused modules and expanding capabilities (YAML/inline specs, POST-for-read, improved pagination/limit handling, retries, and richer examples/docs) to make the wrapper easier to maintain and more robust against real-world OpenAPI APIs.
Changes:
- Bumps package/versioning to `0.2.0` and removes the unused JWT WIT import.
- Adds modular Rust implementation for config/request/response/pagination/schema/column-matching, plus extensive unit + docker-based test assets.
- Adds/updates catalog docs and several real-world API examples (NWS, GitHub, Threads, CarAPI, PokéAPI).
Reviewed changes
Copilot reviewed 40 out of 42 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| wasm-wrappers/fdw/openapi_fdw/wit/world.wit | Version bump to 0.2.0; removes JWT import. |
| wasm-wrappers/fdw/openapi_fdw/test/mock-spec.json | Adds comprehensive mock OpenAPI spec for integration/edge testing. |
| wasm-wrappers/fdw/openapi_fdw/test/docker-compose.yml | Adds Docker setup for integration tests (MockServer + Postgres). |
| wasm-wrappers/fdw/openapi_fdw/test/benchmark.sh | Adds benchmark script comparing FDW vs pg_http/pg_net. |
| wasm-wrappers/fdw/openapi_fdw/test/.env.example | Adds env template for authenticated examples/tests. |
| wasm-wrappers/fdw/openapi_fdw/src/schema.rs | Updates type mapping, column extraction, name sanitization, DDL generation, and test module wiring. |
| wasm-wrappers/fdw/openapi_fdw/src/response.rs | Adds response parsing + wrapper-key detection + pagination token extraction. |
| wasm-wrappers/fdw/openapi_fdw/src/request_tests.rs | Adds extensive tests for URL building, pagination URL resolution, security, retries, and spec_json loading. |
| wasm-wrappers/fdw/openapi_fdw/src/request.rs | Adds spec fetch (JSON/YAML), URL building, param injection, retries, debug logging, and safety guards. |
| wasm-wrappers/fdw/openapi_fdw/src/pagination_tests.rs | Adds thorough unit tests for pagination state/loop detection/limits. |
| wasm-wrappers/fdw/openapi_fdw/src/pagination.rs | Adds pagination state machine (cursor/url tokens, loop detection, limits). |
| wasm-wrappers/fdw/openapi_fdw/src/lib_tests.rs | Adds cross-cutting tests for defaults + helpers (validate_url, option parsing, etc.). |
| wasm-wrappers/fdw/openapi_fdw/src/config.rs | Adds server config parsing, auth/header setup, debug-safe config formatting, pagination defaults save/restore. |
| wasm-wrappers/fdw/openapi_fdw/src/column_matching.rs | Adds cached column metadata, key matching strategies, coercions, and injection behavior. |
| wasm-wrappers/fdw/openapi_fdw/examples/threads/init.sql | Adds Threads end-to-end setup incl. inline spec usage and multiple tables. |
| wasm-wrappers/fdw/openapi_fdw/examples/threads/README.md | Documents Threads example, inline spec, pagination, and features. |
| wasm-wrappers/fdw/openapi_fdw/examples/pokeapi/init.sql | Adds PokéAPI example (YAML spec, pagination, path params). |
| wasm-wrappers/fdw/openapi_fdw/examples/pokeapi/README.md | Documents PokéAPI example and expected usage patterns. |
| wasm-wrappers/fdw/openapi_fdw/examples/nws/init.sql | Adds NWS example (GeoJSON extraction, cursor pagination, path params). |
| wasm-wrappers/fdw/openapi_fdw/examples/nws/README.md | Documents NWS example and advanced FDW options. |
| wasm-wrappers/fdw/openapi_fdw/examples/github/init.sql | Adds GitHub example (auth headers, import, pagination, search). |
| wasm-wrappers/fdw/openapi_fdw/examples/github/README.md | Documents GitHub example and recommended configuration. |
| wasm-wrappers/fdw/openapi_fdw/examples/carapi/init.sql | Adds CarAPI example (page-based pagination and query pushdown). |
| wasm-wrappers/fdw/openapi_fdw/examples/carapi/README.md | Documents CarAPI example and feature coverage. |
| wasm-wrappers/fdw/openapi_fdw/examples/README.md | Adds index of examples by auth requirement and key features. |
| wasm-wrappers/fdw/openapi_fdw/README.md | Major README refresh (features, perf notes, tests, examples, changelog). |
| wasm-wrappers/fdw/openapi_fdw/Makefile | Adds local developer targets (fmt/clippy/test/build/check). |
| wasm-wrappers/fdw/openapi_fdw/Cargo.toml | Version bump to 0.2.0; adds YAML parsing dependency. |
| wasm-wrappers/fdw/openapi_fdw/.gitignore | Ignores test/.env. |
| docs/catalog/openapi.md | Updates catalog docs for new options/features and new version entry. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@wasm-wrappers/fdw/openapi_fdw/src/request.rs`:
- Around line 182-188: The branch that handles absolute pagination paths
concatenates self.config.base_url with next_url, causing duplicate path prefixes
when base_url contains a path; update the logic in request.rs (the code path
that builds pagination URLs where next_url.starts_with('/') is checked) to parse
self.config.base_url, extract the origin (scheme + host + optional port) and
join that origin with next_url instead of appending to the full base_url; use
the Url parsing utilities you already use elsewhere (e.g., Url::parse) to build
the new URL and fall back to the current behavior only on parse errors, and add
a unit test that sets base_url to include a path (like
https://api.example.com/v1) and verifies a next_url of /v1/items resolves to
https://api.example.com/v1/items (i.e., no duplicated /v1).
…d CodeRabbit
- Fix pagination URL resolution for parameterized endpoints by storing
resolved_endpoint after path param substitution
- Fix absolute-path pagination to use origin-only base to avoid
duplicating path prefixes (e.g. /v1/v1/items)
- Map time and byte/binary formats to text (WIT TypeOid has no
time/bytea variants)
- Fix .env.example copy instructions and README test counts
- Replace {checksum} placeholder with descriptive text in catalog docs
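The origin-only resolution from the second fix above can be sketched as follows. This is a std-only illustration under stated assumptions, not the code in request.rs: `origin_of` and `resolve_next_url` are hypothetical helpers, and the origin is extracted by hand here instead of through a URL-parsing library.

```rust
/// Return the origin (scheme + host + optional port) of an absolute URL,
/// or None if the input has no "://" separator or no authority part.
fn origin_of(base_url: &str) -> Option<&str> {
    let scheme_end = base_url.find("://")? + 3;
    let rest = &base_url[scheme_end..];
    if rest.is_empty() {
        return None;
    }
    // The authority ends at the first '/', '?' or '#', or at end of input.
    let authority_len = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    Some(&base_url[..scheme_end + authority_len])
}

/// Resolve a pagination link against the configured base URL.
fn resolve_next_url(base_url: &str, next_url: &str) -> String {
    // Fully qualified links are used as-is.
    if next_url.starts_with("http://") || next_url.starts_with("https://") {
        return next_url.to_string();
    }
    // Absolute paths are joined to the origin only, so a path in
    // base_url (e.g. /v1) is not duplicated. If the origin cannot be
    // extracted, fall back to plain concatenation.
    if next_url.starts_with('/') {
        let prefix = origin_of(base_url).unwrap_or(base_url.trim_end_matches('/'));
        return format!("{}{}", prefix, next_url);
    }
    // Relative links are appended to the full base URL.
    format!("{}/{}", base_url.trim_end_matches('/'), next_url)
}
```

With `base_url` set to `https://api.example.com/v1`, a `next_url` of `/v1/items` resolves to `https://api.example.com/v1/items`, which is the case the review asked a unit test to cover.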
Hi @burmecia, this is ready for review. All CI checks are passing (including wasm integration tests) and I've addressed the Copilot and CodeRabbit feedback in the latest commit. The main changes from v0.1.x (which you already approved in #566) are the modular refactor for maintainability and a few new capabilities: YAML/inline specs, POST-for-read, LIMIT pushdown, and OpenAPI 3.1 support. The core FDW interface and table options are backwards compatible. Happy to walk through anything or make changes. Let me know!
…er blocks Move per-example feature tables into a single Feature Coverage comparison on the main examples README. Add the missing CREATE SERVER blocks for spec_url/spec_json import servers to carapi, pokeapi, github, and nws examples. Rename NWS references to Weather.gov.
The attrs column returns the full JSON response object, not just unmapped fields. Updated all example READMEs to accurately describe this, consistent with the catalog docs and every other wasm FDW.
This is the v0.2.0 upgrade to the OpenAPI FDW, following the initial v0.1.x that landed in #566.
The biggest change is structural. The single ~1100-line `lib.rs` has been broken into focused modules (`config`, `request`, `response`, `pagination`, `column_matching`, `spec`, and `schema`), each with co-located tests. The original was doing too many things in one file (parsing specs, making HTTP requests, matching columns, handling pagination, etc.), which made it hard to follow and harder to maintain. Splitting it up makes each piece easier to test and understand on its own.
What's new
YAML spec support
OpenAPI specs published as YAML are now parsed automatically. `spec_url` accepts both JSON and YAML specs: the FDW tries JSON first, then falls back to YAML. This matters because some APIs only publish YAML specs.
Real-world examples
Five working examples show how to use the FDW against real APIs: NWS Weather, CarAPI, PokeAPI, GitHub, and Threads. Each has a README walkthrough and an `init.sql` you can run directly, including `IMPORT FOREIGN SCHEMA` quick-start sections.
POST endpoints now work for reads
v0.1 only supported GET endpoints. v0.2 adds POST-for-read, so APIs that use POST for search or filter operations now work too (set `method 'POST'` and `request_body` on the table).
Inline spec support
You can now embed an OpenAPI spec directly in the server options with `spec_json` instead of pointing to a URL. This is useful for APIs without a public OpenAPI spec (like Threads) or for customizing a subset of endpoints. The Threads example demonstrates this with a hand-written inline spec.
Smarter LIMIT handling and safety guards
On the query side, `LIMIT` is now forwarded to the API's page-size parameter, so `SELECT * FROM users LIMIT 10` won't fetch 100 rows just to throw away 90. There are also safety limits (`max_pages`, `max_response_bytes`) to prevent runaway pagination from burning through API quotas.
Automatic retries for transient errors
The FDW now retries on HTTP 429 (rate limit), 502 (bad gateway), and 503 (service unavailable) with exponential backoff, up to 3 retries. The 429 hint suggests reducing page size; 502/503 retries are silent.
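As a rough illustration of that retry policy, here is a minimal sketch. The retryable statuses and the limit of 3 retries come from the description above; the 1-second base delay and the names `is_retryable` and `retry_delay_ms` are assumptions made for the example, not the FDW's actual values or API.

```rust
/// Up to 3 retries, as stated in the PR description.
const MAX_RETRIES: u32 = 3;
/// Assumed base delay; the real value is not given in the description.
const BASE_DELAY_MS: u64 = 1_000;

/// Statuses treated as transient: rate limit, bad gateway, service unavailable.
fn is_retryable(status: u16) -> bool {
    matches!(status, 429 | 502 | 503)
}

/// Delay before the next attempt, or None if the request should not be
/// retried. `retries_done` is the number of retries already performed.
fn retry_delay_ms(status: u16, retries_done: u32) -> Option<u64> {
    if !is_retryable(status) || retries_done >= MAX_RETRIES {
        return None;
    }
    // Exponential backoff: the delay doubles with every retry.
    Some(BASE_DELAY_MS << retries_done)
}
```

Under these assumptions the three retries wait 1 s, 2 s, and 4 s, after which the error is surfaced to the caller.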
Everything else
- `api_key_location` server option (query/cookie)
- `debug 'true'` server option that logs request/response details as PostgreSQL INFO messages
- OpenAPI 3.1 support (`[type, "null"]` nullable syntax, etc.)
- Header deduplication prevents duplicate `content-type` or `authorization` headers when custom headers overlap defaults
- `RowsOut` stats now accurately count rows consumed by PostgreSQL (not just rows fetched)
- Catalog docs (`docs/catalog/openapi.md`) are updated for the new options but the structure follows the existing pattern
Testing
531 unit tests across all modules, plus a Docker-based integration test setup (`test/run.sh`) with a mock API server and 113 assertions covering pagination, path parameters, error handling, and type coercion.
Five new real-world examples are included (Weather.gov, Threads, GitHub, CarAPI and PokeAPI). These double as documentation and a way to validate the FDW against actual APIs.
Test plan
- `cargo test` passes (531 unit tests, native target)
- `cargo component build --release --target wasm32-unknown-unknown` produces valid wasm
- `cargo fmt --check` and `cargo clippy` clean
- Docker-based integration tests (`test/run.sh`)