Add maildat columns to default.vw_pin_address #980

Open
wrridgeway wants to merge 11 commits into master from 941-owndat-vs-maildat

@wrridgeway wrridgeway commented Jan 27, 2026

This PR adds iasworld.maildat to default.vw_pin_address as the new source of mail_ columns. We keep columns from iasworld.owndat, but we rename these columns with an owner_ prefix. This allows us to avoid committing entirely to one table or the other and provide as much information to users as possible while being clear about the limitations of both sources of address data.

The only real nuance here is that, unlike most iasworld tables, maildat is not unique by pin and year when cur = 'Y' and deactivat is null. We also need to depend on mailseq. To make maildat unique by pin and year, we take the max value of mailseq; choosing which value to privilege boiled down to trying to match what's displayed on the treasurer's website. Two example pins are below:

| parid | taxyr | mailseq | maddr1 |
| --- | --- | --- | --- |
| 01124000010000 | 2024 | 0 | 665 SW 8TH STREET |
| 01124000010000 | 2024 | 1 | 1475 S BARRINGTON RD |
| 02221170250000 | 2024 | 0 | 935 W. KENILWORTH |
| 02221170250000 | 2024 | 1 | 493 W HELEN RD |
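
To illustrate the deduplication described above, here is a minimal sketch that uses Python's built-in sqlite3 module and the four example rows. The in-memory table and the query are assumptions for illustration only, not the view's actual SQL (which runs on a different engine), and the cur and deactivat filters are left out for brevity. It only shows the idea of keeping the row with the highest mailseq for each parid and taxyr.

```python
import sqlite3

# Hypothetical in-memory stand-in for iasworld.maildat, loaded with the
# four example rows from the table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maildat (parid TEXT, taxyr TEXT, mailseq INTEGER, maddr1 TEXT)"
)
conn.executemany(
    "INSERT INTO maildat VALUES (?, ?, ?, ?)",
    [
        ("01124000010000", "2024", 0, "665 SW 8TH STREET"),
        ("01124000010000", "2024", 1, "1475 S BARRINGTON RD"),
        ("02221170250000", "2024", 0, "935 W. KENILWORTH"),
        ("02221170250000", "2024", 1, "493 W HELEN RD"),
    ],
)

# Rank rows within each parid/taxyr by mailseq (highest first) and keep
# only the top-ranked row, so the result is unique by parid and taxyr.
dedup_sql = """
SELECT parid, taxyr, mailseq, maddr1
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY parid, taxyr
            ORDER BY mailseq DESC
        ) AS rn
    FROM maildat
)
WHERE rn = 1
ORDER BY parid
"""
deduped = conn.execute(dedup_sql).fetchall()
for row in deduped:
    print(row)
```

With these rows, the sketch keeps the mailseq = 1 address for both pins.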

Updates have not changed row count, as expected:

with old as (
	select year,
		count(*) as old
	from default.vw_pin_address
	group by year
),
new as (
	select year,
		count(*) as new
	from z_ci_941_owndat_vs_maildat_default.vw_pin_address
	group by year
)
select old.year,
	old.old,
	new.new,
	old.old - new.new as diff
from old
	left join new on old.year = new.year
where old.old != new.new

This returns nothing, and mail_address_full is largely non-null:

select year,
	count(*) as new
from z_ci_941_owndat_vs_maildat_default.vw_pin_address
where mail_address_full is not null
group by year

| year | new |
| --- | --- |
| 2026 | 1749105 |
| 2025 | 1749120 |
| 2024 | 1750955 |
| 2023 | 1766525 |
| 2022 | 1767234 |
| 2021 | 1766333 |
| 2020 | 1768219 |
| 2019 | 1767675 |
| 2018 | 1768100 |
| 2017 | 1767909 |
| 2016 | 1767301 |
| 2015 | 1765763 |
| 2014 | 1765389 |
| 2013 | 1765993 |
| 2012 | 1766926 |
| 2011 | 1766397 |
| 2010 | 1764226 |
| 2009 | 1756264 |
| 2008 | 1739854 |
| 2007 | 1711776 |
| 2006 | 1678523 |
| 2005 | 1648889 |
| 2004 | 1625030 |
| 2003 | 1603501 |
| 2002 | 1586382 |
| 2001 | 1571199 |
| 2000 | 1555196 |
| 1999 | 1540553 |

@wrridgeway wrridgeway self-assigned this Jan 27, 2026
@wrridgeway wrridgeway linked an issue Jan 27, 2026 that may be closed by this pull request
wrridgeway commented Feb 4, 2026

@ccao-jardine Are you okay with the taxpayer/owner dichotomy I've gone with here?

@wrridgeway wrridgeway marked this pull request as ready for review February 4, 2026 15:31
@wrridgeway wrridgeway requested a review from a team as a code owner February 4, 2026 15:31
@ccao-jardine
Member

@wrridgeway, yes! This is much clearer, provides frequently requested data, and provides more information while also remaining transparent about ongoing data quality issues.

@jeancochrane jeancochrane left a comment

Looking great! A few small suggestions, plus one request to clean up the docs. I'll give this a final look once that's done.


data_tests:
  - unique_combination_of_columns:
      name: iasworld_maildat_unique_by_parid_taxyr

[Nitpick, optional] For extra clarity:

Suggested change
- name: iasworld_maildat_unique_by_parid_taxyr
+ name: iasworld_maildat_unique_by_parid_taxyr_mailseq

Comment on lines +289 to +290
- Taxpayer mailing addresses are not necessarily the same as property owner
mailing addresses.

[Nitpick, optional] A pointer might help future readers:

Suggested change
- - Taxpayer mailing addresses are not necessarily the same as property owner
-   mailing addresses.
+ - Taxpayer mailing addresses are not necessarily the same as property owner
+   mailing addresses. For property owner addresses, see `iasworld.owndat`.

Comment on lines +312 to +313
- Property owner mailing addresses are not necessarily the same as taxpayer
mailing addresses.

[Nitpick, optional] Same here:

Suggested change
- - Property owner mailing addresses are not necessarily the same as taxpayer
-   mailing addresses.
+ - Property owner mailing addresses are not necessarily the same as taxpayer
+   mailing addresses. For taxpayer addresses, see `iasworld.maildat`.

Comment on lines +202 to +222
data_tests:
  - unique_combination_of_columns:
      name: iasworld_maildat_unique_by_parid_taxyr
      combination_of_columns:
        - parid
        - taxyr
        - mailseq
      additional_select_columns:
        - column: who
          alias: who
          agg_func: array_agg
        - column: wen
          alias: wen
          agg_func: array_agg
      config: &unique-conditions
        where: |
          CAST(taxyr AS int) BETWEEN {{ var('data_test_iasworld_year_start') }} AND {{ var('data_test_iasworld_year_end') }}
          AND cur = 'Y'
          AND deactivat IS NULL
      meta:
        description: maildat should be unique by parid and taxyr

[Thought, non-blocking] Just a flag that this test won't run as part of our weekly data integrity test suite, since tests on iasWorld models get filtered into a manual process that follows the town open/close cycle. I doubt the stakeholders that review our iasWorld tests will pay attention to this error, unfortunately. I don't mind leaving the test in as a no-op for the sake of explaining the table's uniqueness condition to readers, but if we want to be proactively notified of failures, we should think of a way to move it to a downstream view so that it will get picked up by our weekly data integrity tests.
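
For illustration, the kind of check a uniqueness test on a downstream view could run might look like the sketch below. It uses Python's built-in sqlite3 module; the in-memory table is a hypothetical stand-in for default.vw_pin_address, the `pin` column name is an assumption, and the rows deliberately include one duplicate so the check has something to report. This is not how the dbt test is implemented, only the underlying idea.

```python
import sqlite3

# Hypothetical stand-in for the downstream view. The first pin appears
# twice for 2024 on purpose, to simulate a failed deduplication.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vw_pin_address (pin TEXT, year TEXT, mail_address_full TEXT)"
)
conn.executemany(
    "INSERT INTO vw_pin_address VALUES (?, ?, ?)",
    [
        ("01124000010000", "2024", "665 SW 8TH STREET"),
        ("01124000010000", "2024", "1475 S BARRINGTON RD"),
        ("02221170250000", "2024", "493 W HELEN RD"),
    ],
)

# Any pin/year combination with more than one row violates uniqueness.
dupes_sql = """
SELECT pin, year, COUNT(*) AS n
FROM vw_pin_address
GROUP BY pin, year
HAVING COUNT(*) > 1
ORDER BY pin
"""
duplicates = conn.execute(dupes_sql).fetchall()
print(duplicates)
```

An empty result would mean the view is unique by pin and year; any returned row identifies a combination that needs attention.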


columns:
  - name : addrtype
    description: Address type code (N, S, F, R, C)

[Suggestion, required] It seems like many of these columns are shared with owndat, and they already have docs definitions in dbt/models/iasworld/columns.md. Do you mind taking a minute to replace these descriptions with {{ doc() }} calls wherever documentation already exists in columns.md, or wherever a column exists on owndat such that its description can be moved to columns.md?

Successfully merging this pull request may close these issues.

OWNDAT vs MAILDAT?
