DBT'ify pluto staging models#2241

Draft
alexrichey wants to merge 26 commits into main from ar-dbtify-pluto-staging-models

@alexrichey
Contributor

No description provided.

@codecov
codecov bot commented Feb 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.69%. Comparing base (394764f) to head (026e50b).
⚠️ Report is 6 commits behind head on main.

see 1 file with indirect coverage changes

@alexrichey force-pushed the ar-dbtify-pluto-staging-models branch 3 times, most recently from e6cb9d5 to eb97be2 on February 24, 2026 18:19
all our scripts are expecting the command with the prefix, so...
I keep mine outside the data engineering dir
See comment for rationale
- Created 4 additional staging models:
  * stg__pluto_input_research.sql
  * stg__pluto_pts.sql
  * stg__dcp_mappluto.sql
  * stg__previous_pluto.sql
- Fixed remaining non-stg__ references in 13 SQL files
- All source tables now consistently use staging models
- Total staging models: 40 (up from 36)
- Created 01a_dbt_staging.sh to run dbt staging models
- Script runs between data load and legacy SQL build
- Materializes 40 staging models before 02_build.sh runs
- Added pluto_build/README.md documenting build sequence
- Legacy SQL can now reference stg__ tables
- Moved 9 CSV files from pluto_build/data/ to seeds/
- Configured seeds in dbt_project.yml (+quote_columns, +schema: public)
- Documented all seeds in seeds/_seeds.yml
- Updated 01a_dbt_staging.sh to run 'dbt seed' before staging models
- Deleted 01_load_local_csvs.sh (replaced by dbt seed)
- Deleted sql/_create.sql (replaced by dbt seed)
- Updated README.md with seed documentation
- No SQL changes needed - seeds create same table names
- Update GitHub workflow to call 01a_dbt_staging.sh instead of removed 01_load_local_csvs.sh
- Remove duplicate dbt seed call from 07_custom_qaqc.sh to avoid reloading seeds
- Seeds are now loaded exactly once via 01a_dbt_staging.sh

Closes data-engineering-n58.3
- Add --profiles-dir . to all dbt commands in 01a_dbt_staging.sh and 07_custom_qaqc.sh
- Move 'cd ..' before dbt deps/debug in 01a_dbt_staging.sh
- Fix schema config deprecation in dbt_project.yml (add + prefix to tests.schema)
- Ensures dbt uses local profiles.yml in GHA workflows
- Removed duplicate pluto_pts entry (was in recipe_sources twice)
- Removed duplicate dcp_zoningdistricts entry (was in recipe_sources and build_sources)

Fixes dbt compilation error about duplicate source names
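
Duplicates like these can be caught before dbt compiles. A minimal Python sketch, assuming the source names from each recipe section have already been read into lists; the section names and the two duplicated names come from the commit message above, while the helper name and the exact list contents are hypothetical:

```python
from collections import Counter


def find_duplicate_sources(sections):
    """Return source names that appear more than once across all sections."""
    counts = Counter(name for names in sections.values() for name in names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical recipe contents; only the duplicated names come from the commit message.
sections = {
    "recipe_sources": ["pluto_pts", "dcp_zoningdistricts", "pluto_pts"],
    "build_sources": ["dcp_zoningdistricts", "dcp_mappluto"],
}

duplicates = find_duplicate_sources(sections)
print(duplicates)
```

A name counts as a duplicate whether it repeats within one section or across two, which covers both cases described above.
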
- Add column_types config for ignored_bbls_for_unit_count_test (bbl, pluto_version as text)
- Add column_types config for pluto_input_research (bbl as text)
- Add column_types config for pluto_input_condolot_descriptiveattributes (condno, parid as text)
- Remove incorrect column_types from zoning_district_class_descriptions
- Fixes 'integer out of range' errors when loading BBL values
- Change condno -> CondNO and parid -> PARID to match CSV header
- Fixes integer out of range error
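
For context on the 'integer out of range' errors: a BBL is a 10-digit identifier (borough, block, lot), and many BBLs exceed the maximum of a 32-bit signed integer. The sketch below assumes the seed column was being inferred as a 32-bit integer (as with PostgreSQL's `integer` type); the sample BBL values are made up for illustration:

```python
INT32_MAX = 2**31 - 1  # upper bound of a 32-bit signed integer (2147483647)


def fits_int32(value: str) -> bool:
    """True if the numeric string can be stored in a 32-bit signed integer."""
    return -(2**31) <= int(value) <= INT32_MAX


# Illustrative BBLs: 1-digit borough + 5-digit block + 4-digit lot.
sample_bbls = ["1000010001", "3000010001", "5000010001"]

for bbl in sample_bbls:
    print(bbl, "fits" if fits_int32(bbl) else "overflows")
```

Loading such identifiers as text sidesteps the range limit and also preserves any leading zeros.
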
- Change seeds schema from 'public' to BUILD_ENGINE_SCHEMA to match build scripts
- Update stg__pluto_input_research to reference seed with ref() instead of source()
- Ensures build scripts can find seed tables in the correct schema
- Seeds were loading to doubled schema (target_schema + custom_schema)
- dbt automatically uses BUILD_ENGINE_SCHEMA from profiles.yml as target
- Removing +schema config fixes: ar_dbtify_pluto_staging_models_ar_dbtify_pluto_staging_models -> ar_dbtify_pluto_staging_models
- Matches green_fast_track pattern

Closes data-engineering-n58.5
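
The doubled schema name follows from dbt's default schema naming, which appends a custom `+schema` value to the target schema rather than replacing it. A small Python sketch of that rule; the function is an illustration of the behavior, not dbt code:

```python
from typing import Optional


def resolve_schema(target_schema: str, custom_schema: Optional[str]) -> str:
    """Mimic dbt's default rule: target schema, plus '_<custom>' when a custom schema is set."""
    if custom_schema is None:
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


target = "ar_dbtify_pluto_staging_models"

# With +schema set to the same value as the target, the name is doubled.
print(resolve_schema(target, target))
# Without +schema, the target schema is used as-is.
print(resolve_schema(target, None))
```
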
@alexrichey force-pushed the ar-dbtify-pluto-staging-models branch from 11b8e89 to 26e2b59 on February 26, 2026 16:02
@alexrichey force-pushed the ar-dbtify-pluto-staging-models branch from dc48cde to 27253fb on March 2, 2026 21:47
- Use SCRIPT_DIR to find bash/config.sh relative to script location
- Remove 'cd ..' and 'cd pluto_build' navigation
- Fix column name case in _seeds.yml (CondNO -> condno, PARID -> parid)
- Script now runs successfully from products/pluto directory