Fix ModuleNotFoundError for dataflow package on Dataflow workers #32

Open
shlbatra wants to merge 1 commit into main from fix/dataflow-module-packaging

Conversation

@shlbatra

Owner

Summary

  • Add __init__.py to src/dataflow/, src/dataflow/models/, and src/dataflow/utils/ so they are regular Python packages
  • Add src/dataflow and src/feature_store to the wheel packages list in pyproject.toml
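
The __init__.py change can be checked mechanically. The following is a minimal sketch, assuming the src/ layout named in the bullets above; the helper name find_missing_init is hypothetical and not part of this PR. It flags only directories that hold .py files, which is a heuristic rather than the build backend's actual discovery rule.

```python
from pathlib import Path


def find_missing_init(package_root: Path) -> list[Path]:
    """Return directories at or below package_root that contain .py files
    but no __init__.py. Hypothetical helper, not part of this PR."""
    candidates = [package_root]
    candidates += sorted(p for p in package_root.rglob("*") if p.is_dir())

    missing = []
    for directory in candidates:
        if directory.name == "__pycache__":
            continue
        has_python = any(directory.glob("*.py"))
        if has_python and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing


if __name__ == "__main__":
    # Paths taken from the PR summary; run from the repository root.
    for root in (Path("src/dataflow"), Path("src/feature_store")):
        if root.is_dir():
            for directory in find_missing_init(root):
                print(f"missing __init__.py: {directory}")
```

Run from the repository root, this should print nothing once the new __init__.py files are in place.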

Root cause

Dataflow workers install from the built wheel. Only src/ml_pipelines_kfp was included in that wheel, so the import from dataflow.models.iris_schema import PubSubIrisMessage failed on workers with ModuleNotFoundError: No module named 'dataflow'. It worked locally because pip install -e . adds src/ to the Python path.
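
Because the failure comes from what the wheel contains, one way to confirm the fix before redeploying is to inspect the built wheel directly. This is a rough sketch, not part of the PR: the function name top_level_packages and the dist/ output directory are assumptions. It relies only on the fact that a wheel is a zip archive.

```python
import zipfile
from pathlib import Path


def top_level_packages(wheel_path: Path) -> set[str]:
    """Return names of top-level directories in the wheel that contain an
    __init__.py, i.e. the regular packages a worker could import."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return {
            name.split("/")[0]
            for name in wheel.namelist()
            if name.count("/") == 1 and name.endswith("/__init__.py")
        }


if __name__ == "__main__":
    # Assumes wheels are built into dist/ (hypothetical location).
    for wheel_file in sorted(Path("dist").glob("*.whl")):
        print(wheel_file.name, sorted(top_level_packages(wheel_file)))
```

With the fix applied, the printed set would be expected to include dataflow and feature_store alongside ml_pipelines_kfp.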

Test plan

  • Redeploy iris_feature_pipeline.py to Dataflow and confirm the workers no longer raise ModuleNotFoundError
  • Verify pip install -e . still works locally
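
The local check can be scripted as a small import smoke test. The sketch below is an illustration under assumptions: is_importable is a hypothetical helper, and the module names are the ones mentioned in this PR. It uses importlib.util.find_spec, which imports parent packages as a side effect and raises ModuleNotFoundError when a parent is missing.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if module_name can be located in the current environment.

    For dotted names, find_spec imports the parent package first, so a
    missing parent such as 'dataflow' surfaces as ModuleNotFoundError.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    # Module names taken from the PR description.
    for name in ("ml_pipelines_kfp", "dataflow.models.iris_schema", "feature_store"):
        status = "ok" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

Running it once in the editable install and once in a clean virtual environment with only the built wheel installed covers both the local case and the packaging that workers see.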

🤖 Generated with Claude Code

Dataflow workers install from the built wheel, which only included
ml_pipelines_kfp. The dataflow package was missing __init__.py files
and was not listed in pyproject.toml, causing ModuleNotFoundError
on workers for imports like dataflow.models.iris_schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>