fix: cache ensured tables in BigQuery streaming v2 to stop 403 storm by anaselmhamdi · Pull Request #68 · bizon-data/bizon-core

anaselmhamdi · 2026-04-08T14:23:39Z

Summary

BigQueryStreamingV2Destination.load_to_bigquery_via_streaming called self.bq_client.create_table(...) on every flush, only catching Conflict (409). In the helpdesk CDC pipeline, one bizon.stream.iteration routes records to ~15 destination_ids, so every iteration issued ~15 sequential POST /bigquery/v2/projects/{proj}/datasets/cdc/tables requests — blowing past BigQuery's per-table metadata quota (5 ops / 10s) and returning 403 rateLimitExceeded on every one.
The pipeline kept producing data because google-cloud-bigquery's DEFAULT_RETRY silently retries rateLimitExceeded with exponential backoff up to 10 minutes — so data flowed, but every iteration burned wall-clock in retries and every Datadog trace for bizon-streaming was red.
Fix: cache (temp_table_id, schema_fingerprint) pairs on the destination instance. First call ensures (create or schema-reconcile), subsequent calls with the same schema skip the whole block. A genuine schema change (new column in record_schemas) produces a new fingerprint → runs the reconcile path exactly once.
finalize() drops matching cache entries after delete_table in FULL_REFRESH and INCREMENTAL so a same-process re-run recreates the temp table.

No config changes, no migration, no new dependencies. Existing Conflict handling for schema drift is preserved. The SDK's built-in rateLimitExceeded retry remains as defense-in-depth.

Files

bizon/connectors/destinations/bigquery_streaming_v2/src/destination.py

Test plan

uv run ruff format + uv run ruff check clean
uv run pytest tests/connectors/destinations/bigquery_streaming_v2/test_bigquery_streaming_v2.py — 4/4 pass
Deploy to staging helpdesk CDC pipeline; confirm via Datadog: service:bizon-streaming resource_name:"POST /bigquery/v2/projects/gorgias-pipeline-production/datasets/cdc/tables" @http.status_code:403 drops from ~15/iteration to 0 after warmup
Confirm bizon.stream.iteration p50/p95 latency drops (SDK backoff no longer burning wall-clock on 403 retries)
Sanity-check that schema drift (adding a column to record_schemas) still triggers the Conflict → update_table path

🤖 Generated with Claude Code

load_to_bigquery_via_streaming unconditionally called create_table on every flush, so each bizon.stream.iteration issued one POST /tables per destination_id. With ~15 CDC tables routed through one destination, this blew past BigQuery's per-table metadata quota (5 ops / 10s) and returned HTTP 403 rateLimitExceeded. The google-cloud-bigquery SDK silently retried those 403s via DEFAULT_RETRY, so data kept flowing but every iteration burned wall-clock in backoff and flooded Datadog with red spans. Cache (temp_table_id, schema_fingerprint) pairs on the destination instance and short-circuit the create_table/Conflict/schema-reconcile block when we've already ensured a table with that exact schema this process lifetime. Drop matching cache entries in finalize() after delete_table in FULL_REFRESH and INCREMENTAL modes so a same-process re-run recreates the temp table. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

anaselmhamdi requested a review from aballiet April 8, 2026 14:23

anaselmhamdi merged commit aef3739 into main Apr 8, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: cache ensured tables in BigQuery streaming v2 to stop 403 storm#68

fix: cache ensured tables in BigQuery streaming v2 to stop 403 storm#68
anaselmhamdi merged 1 commit into
mainfrom
fix/bq-streaming-v2-create-table-cache

anaselmhamdi commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

anaselmhamdi commented Apr 8, 2026

Summary

Files

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant