Switch batch inference to Feature Store offline store #28
Merged
Read from canonical feature table (iris_features) with server-side source='batch_input' filter instead of raw BQ tables. Remove the conditional column rename hack — canonical table has consistent names regardless of data source. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Append -training and -inference to PIPELINE_NAME in each pipeline file so they show as distinct pipelines in Vertex AI (e.g. pipeline-iris-staging-training, pipeline-iris-staging-inference). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
list_models returned models in creation order and [0] grabbed the first (oldest) version — trained with CamelCase columns before the feature store migration. Sort by create_time descending so [0] is the most recently registered model. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drop CamelCase aliases and ConfigDict — field names match the feature platform directly. No backward compat needed since the model is retrained on canonical names. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Also fix sepal_width_cm type from integer to number to match the other feature fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Loads the latest registered model from GCS and checks that feature_names_in_ matches the canonical names from the feature store. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
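A minimal sketch of the feature-name check described in this commit, assuming the model is a scikit-learn estimator fitted on a DataFrame (so it exposes `feature_names_in_`). Loading the model from GCS is omitted. The helper name `check_feature_names` is hypothetical, and only `sepal_length_cm` and `sepal_width_cm` appear in this PR; the two `petal_*` names are assumed by analogy.

```python
from types import SimpleNamespace

# Assumption: only the sepal_* names are confirmed by this PR;
# the petal_* names are guessed by analogy.
CANONICAL_FEATURES = [
    "sepal_length_cm",
    "sepal_width_cm",
    "petal_length_cm",
    "petal_width_cm",
]


def check_feature_names(model, expected=CANONICAL_FEATURES):
    """Raise ValueError unless the model was fitted on exactly the expected columns."""
    actual = getattr(model, "feature_names_in_", None)
    if actual is None:
        raise ValueError("model has no feature_names_in_ (was it fitted on a DataFrame?)")
    # feature_names_in_ is usually a NumPy array of strings; normalise to a list.
    actual = [str(name) for name in actual]
    if actual != list(expected):
        raise ValueError(f"feature mismatch: model has {actual}, expected {list(expected)}")
    return actual


# Stand-in for a loaded estimator; a real run would load the registered model instead.
fresh_model = SimpleNamespace(feature_names_in_=list(CANONICAL_FEATURES))
checked = check_feature_names(fresh_model)
```

A model trained before the migration (CamelCase columns) would fail this check instead of silently scoring with mismatched names.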
Prevents stale bytecache or non-editable installs from causing KFP to serialize old component code into pipeline YAML. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The upper bound <3.11 excluded the local Python 3.11.0, blocking editable installs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
list_models returns parent model entries, not versions. Use list_model_versions to get all versions of the model, then sort by create_time to pick the latest one. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
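The selection step in this commit can be sketched without the Vertex AI SDK. `VersionStub` and `latest_version` are hypothetical stand-ins: the stub carries only the `create_time` field that the commit sorts on, plus an id for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class VersionStub:
    """Hypothetical stand-in for a model version entry; only the fields used below."""
    version_id: str
    create_time: datetime


def latest_version(versions):
    """Sort by create_time descending and return the first (newest) entry."""
    if not versions:
        raise ValueError("no model versions found")
    return sorted(versions, key=lambda v: v.create_time, reverse=True)[0]


versions = [
    VersionStub("1", datetime(2026, 1, 1, tzinfo=timezone.utc)),
    VersionStub("3", datetime(2026, 3, 1, tzinfo=timezone.utc)),
    VersionStub("2", datetime(2026, 2, 1, tzinfo=timezone.utc)),
]
newest = latest_version(versions)
```

Sorting explicitly avoids relying on the order in which the API happens to return entries, which is what caused the oldest version to be picked earlier.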
register.py already sets version_aliases=['blessed'] on each uploaded model. Use get_model(name + '@blessed') to directly fetch the blessed version instead of listing all versions and sorting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
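A rough sketch of the alias lookup, assuming the low-level `google.cloud.aiplatform_v1.ModelServiceClient` is what `get_model` refers to here (the PR does not show the call site). The helper names, and the project, location, and model id arguments, are hypothetical. The SDK import is deferred so the name helper can be used on its own.

```python
def blessed_resource_name(project, location, model_id, alias="blessed"):
    """Build a Vertex AI model resource name that points at a version alias."""
    return f"projects/{project}/locations/{location}/models/{model_id}@{alias}"


def get_blessed_model(project, location, model_id):
    """Fetch the version currently carrying the 'blessed' alias (needs GCP credentials)."""
    # Imported lazily so blessed_resource_name works without the SDK installed.
    from google.cloud import aiplatform_v1

    client = aiplatform_v1.ModelServiceClient(
        client_options={"api_endpoint": f"{location}-aiplatform.googleapis.com"}
    )
    return client.get_model(name=blessed_resource_name(project, location, model_id))


# Placeholder values for illustration only.
example_name = blessed_resource_name("my-project", "us-central1", "1234567890")
```

Resolving the alias server-side removes the listing and sorting step entirely, at the cost of depending on `register.py` always setting the alias.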
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Successful run - Data loaded (50 rows) - /gcs/sb-vertex/staging/pipeline_root/57434141298/pipeline-iris-staging-inference-20260616115218/get-model_4592835321065897984/latest_model
Summary
- Read from the canonical feature table (`iris_features`) instead of raw BQ tables (`iris` / `iris_pubsub_data`)
- Filter `source = 'batch_input'` server-side in SQL, so only unlabeled inference data is scored
- Canonical column names (`sepal_length_cm`, etc.) matching the retrained model
- `BigQueryClient(project=project_id)` for query job permissions (same pattern as Step 6)

What was removed
The old inference component had a brittle `if bq_table == "iris_pubsub_data"` branch that renamed snake_case Pub/Sub columns to CamelCase. With the feature store, all data flows through `ingest.py` into `iris_features` with canonical names, so no conditional renaming is needed.

Prerequisites
- `ingest.py` must be run to populate `iris_features` with both training and batch_input rows
- `bq_dataloader.py --generate-random N` must be run first to create batch_input data

Test plan
- `bq_dataloader.py --generate-random 20`, then `ingest.py`, to populate the feature table
- [...] `batch_input` rows
- [...] `iris_predictions` table

🤖 Generated with Claude Code
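
The server-side filter from the summary might look roughly like the sketch below. It assumes the standard `google-cloud-bigquery` client; the function names and the fully qualified table name are hypothetical. It also passes the source value as a query parameter rather than the inline `'batch_input'` literal the PR describes, which is a substitution made here, not the PR's code.

```python
def build_inference_query(table_fqn):
    """Return SQL that keeps only rows whose source equals the @source parameter."""
    return f"SELECT * FROM `{table_fqn}` WHERE source = @source"


def load_batch_input(project_id, table_fqn, source="batch_input"):
    """Run the filtered query and return a DataFrame (needs GCP credentials)."""
    # Imported lazily so build_inference_query works without the client installed.
    from google.cloud import bigquery

    # Passing project explicitly makes query jobs run (and get billed) in that project.
    client = bigquery.Client(project=project_id)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("source", "STRING", source)]
    )
    query = build_inference_query(table_fqn)
    return client.query(query, job_config=job_config).to_dataframe()


# Hypothetical table path; the real dataset name is not given in this PR.
example_query = build_inference_query("my-project.my_dataset.iris_features")
```

Because the filter runs in BigQuery, training rows never leave the warehouse and the component only receives the rows it needs to score.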