Add 2024 AVs, exemptions, equalizer, and CPI by jeancochrane · Pull Request #63 · ccao-data/ptaxsim

jeancochrane · 2026-01-02T20:51:43Z

This PR tweaks a few data-raw scripts to add 2024 data to the pin, cpi, and eq_factor tables. I have already used this code to load the corresponding files into the testing bucket on S3.

The most complicated of these changes relates to the pin table, whose data source needs to change in 2024 following the Clerk's migration from the AS400 to iasWorld as their source-of-truth database. Rather than pull AV and exemption data from a SQL server mirror of the AS400, as we used to do, we now pull this data from a flat file stored in S3. In future years, we may pull this data from iasWorld directly, so I did some QC work to check the flat file against iasWorld. They mostly match, though a few thousand rows have discrepancies that I couldn't track down. (See EI issue 395, which will investigate these discrepancies in more detail.)

Note that this PR doesn't yet include changes to import PIN geometry, because I haven't been able to get that to run successfully yet.

Connects #59.

jeancochrane · 2026-01-20T23:27:23Z

data-raw/cpi/cpi.R

+  # Remove footer lines that do not contain any data
+  filter(
+    !str_detect(
+      vals,
+      regex("printed by the authority|ptax-115", ignore_case = TRUE)
+    )
+  ) %>%


This footer appears to be new as of 2025. See here to examine it: https://tax.illinois.gov/content/dam/soi/en/web/tax/localgovernments/property/documents/cpihistory.pdf
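As a rough illustration of what the filter above does, here is a minimal base R sketch. It assumes the extracted PDF text lives in a data frame with one line per row in a `vals` column, as in the diff; the sample rows are made up, and `grepl()` stands in for the `str_detect()`/`regex()` pair used in the script.

```r
# Made-up stand-in for lines extracted from the CPI history PDF.
# The real rows contain CPI figures; these strings are placeholders.
cpi_lines <- data.frame(
  vals = c(
    "data row A",
    "data row B",
    "Printed by the Authority of the State of Illinois",
    "PTAX-115 footer text"
  ),
  stringsAsFactors = FALSE
)

# Same pattern as in the diff, matched case-insensitively
footer_pattern <- "printed by the authority|ptax-115"
is_footer <- grepl(footer_pattern, cpi_lines$vals, ignore.case = TRUE)

# Keep only the rows that are not footer lines
cpi_data <- cpi_lines[!is_footer, , drop = FALSE]
print(cpi_data)
```

Matching case-insensitively should keep the filter working if the capitalization of the footer changes in a later revision of the PDF.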

jeancochrane · 2026-01-20T23:28:53Z

data-raw/pin/pin.R

+# Start and end years of data to query, inclusive.
+# Set these to the same value if you want to update only one year of data
+start_year <- 2006
+end_year <- 2024


Adding these in so that we have an easy way of skipping prior years of data whenever we do an update. This is particularly useful right now because I don't have access to the AS400 mirror, which is required in order to reproduce pre-2024 data. At some point we should get that set up, but I don't want it to block us right now.

jeancochrane · 2026-01-20T23:30:07Z

data-raw/pin/pin.R

+# 2023. These values come from the legacy CCAO database, which mirrors the
+# county mainframe.
+# Only query this data if we are pulling data for years up to 2023
+if (start_year <= 2023) {


There are a few conditional branches in this file that split depending on whether we're ingesting data for years through 2023 or for 2024 onward. I considered creating a new file dedicated exclusively to data from 2024 onward, since I expect this file will get messy fast if the data model changes substantially again in the future (requiring further conditional branches based on year). For now, however, modifying this file feels like the simpler path, and I expect it's also easier to review that way.

kyrasturgill · 2026-02-26T21:19:26Z

Got it, so if I understand correctly, to run this for just the 2024 update we'd set both start_year and end_year to 2024?
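To make the year-range behavior discussed in this thread concrete, here is a small base R sketch. The helper `split_years()` and the `legacy_cutoff` name are hypothetical and do not appear in pin.R; the sketch only assumes that years through 2023 come from the legacy source and later years from the flat file, as the diff describes.

```r
# Hypothetical helper, not part of pin.R: split an inclusive year range into
# the years served by the legacy database and the years served by the flat file
legacy_cutoff <- 2023

split_years <- function(start_year, end_year, cutoff = legacy_cutoff) {
  stopifnot(start_year <= end_year)
  years <- seq(start_year, end_year)
  list(
    legacy = years[years <= cutoff],
    flat_file = years[years > cutoff]
  )
}

# Updating only 2024: no legacy years remain, so a guard like
# `if (start_year <= 2023)` would skip the legacy query entirely
only_2024 <- split_years(2024, 2024)

# Full rebuild: 2006 through 2023 from the legacy source, 2024 from the flat file
full_range <- split_years(2006, 2024)

print(only_2024)
print(range(full_range$legacy))
```

Under these assumptions, setting both values to 2024 leaves the legacy set empty, which is consistent with the reading in the question above.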

jeancochrane · 2026-01-20T23:31:07Z

data-raw/pin/pin.R

+      # This exemption is new in 2024 and does not exist in the legacy data
+      exe_vet_dis_100 = 0L


This is the second of two changes I made to the legacy query (tax years through 2023).
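One plausible reason for adding a constant column like this is to keep the legacy rows and the 2024 rows on the same set of columns so they can be stacked. That motivation is an assumption, not something stated in the PR. A minimal base R sketch with made-up PINs and amounts:

```r
# Made-up legacy row (through 2023): the 100% vetdis exemption does not exist
legacy <- data.frame(
  year = 2023L,
  pin = "00000000000001",
  exe_vet_dis_ge70 = 5000L,
  stringsAsFactors = FALSE
)

# Made-up 2024 row: includes the new exemption column
new_data <- data.frame(
  year = 2024L,
  pin = "00000000000002",
  exe_vet_dis_ge70 = 0L,
  exe_vet_dis_100 = 7000L,
  stringsAsFactors = FALSE
)

# Fill the new exemption with an integer zero for legacy years, so that both
# frames have identical columns and column types before stacking
legacy$exe_vet_dis_100 <- 0L

combined <- rbind(legacy, new_data)
print(combined)
```

Without the added column, `rbind()` on these two data frames would fail because their columns do not match.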

jeancochrane · 2026-02-18T17:49:26Z

data-raw/pin/pin.R

+          ON C.PIN = BILLS.PIN
+          AND C.TAX_YEAR = BILLS.TAX_YEAR
+      WHERE C.TAX_YEAR >= {start_year}
+        AND C.TAX_YEAR <= 2023


This is one of two changes I made to the legacy query (tax years through 2023).

jeancochrane · 2026-02-18T17:53:46Z

data-raw/pin/pin.R

+  mutate(
+    exe_vet_dis_lt50 = ifelse(
+      # If the vetdis total from the tax bill export matches the sum of all
+      # individual vetdis exemptions from Athena (ias), then we can be confident
+      # filling the individual vetdis exemptions directly from Athena
+      exe_vet_dis == exe_vet_dis_tot_athena,
+      exe_vet_dis_lt50_athena,
+      ifelse(
+        # If the total from the tax bill export does _not_ match the sum from
+        # Athena, then one of two cases is true, according to our investigation:
+        #
+        #  1. The tax bill export has a vetdis total that is >0 but different
+        #     from the Athena sum. If Athena has a value >0 for this particular
+        #     vetdis exemption, then we use the total from the tax bill export
+        #     for this exemption, because we assume the Athena data just has the
+        #     wrong amount (in theory, it is not possible for multiple vetdis
+        #     exemption types to be >0 for the same PIN). If instead there are
+        #     no vetdis exemptions with a value >0, then we fill the value into
+        #     the >70% vetdis exemption type, which is the most common vetdis
+        #     exemption type in the Athena data
+        #
+        #  2. The tax bill export has a vetdis total of 0, but Athena has
+        #     a sum >0 for vetdis exemptions. In this case, we assume the tax
+        #     bill export is correct, and we fill 0 for all individual
+        #     exemption types.
+        exe_vet_dis > 0 & exe_vet_dis_lt50_athena > 0,
+        exe_vet_dis,
+        0L
+      )
+    ),
+    exe_vet_dis_50_69 = ifelse(
+      exe_vet_dis == exe_vet_dis_tot_athena,
+      exe_vet_dis_50_69_athena,
+      ifelse(
+        exe_vet_dis > 0 & exe_vet_dis_50_69_athena > 0,
+        exe_vet_dis,
+        0L
+      )
+    ),
+    exe_vet_dis_ge70 = ifelse(
+      exe_vet_dis == exe_vet_dis_tot_athena,
+      exe_vet_dis_ge70_athena,
+      case_when(
+        exe_vet_dis > 0 & exe_vet_dis_ge70_athena > 0 ~ exe_vet_dis,
+        # This is the most common type of vetdis exemption, so fill it with
+        # the total from the tax bill export if no vetdis exemption types
+        # have a value >0 in the Athena data
+        exe_vet_dis > 0 & exe_vet_dis_tot_athena == 0 ~ exe_vet_dis,
+        TRUE ~ 0L
+      )
+    ),
+    exe_vet_dis_100 = ifelse(
+      exe_vet_dis == exe_vet_dis_tot_athena,
+      exe_vet_dis_100_athena,
+      ifelse(
+        exe_vet_dis > 0 & exe_vet_dis_100_athena > 0,
+        exe_vet_dis,
+        0L
+      )
+    )
+  ) %>%


The logic here is pretty complicated, but I think it's important that we get it right, since we're making some interpretive decisions here rather than pulling information directly from the tax bill export. Let me know if it would help to walk through it together.

kyrasturgill · 2026-02-26T22:53:06Z

I'm pretty sure I've got it! Did you have a chance to test the mismatched PINs to confirm that the provided exemption amounts, EAV, and tax rate produce the tax bill total provided?
It makes sense to me to prioritize the accuracy of the tax bill export, especially if using the Athena exemption amounts led to a calculated tax bill amount that does not match tax_bill_total.
And keeping the exemption amount in the same exemption tier seems the most straightforward. I wondered if we could create a rule based on the exemption amount, but those seem to be pretty inconsistent.
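For readers following the allocation rules in this thread, here is a compact base R sketch of the same decision logic applied to four made-up PINs. The function `fill_vetdis()` and the short column names `lt50`, `x50_69`, `ge70`, and `x100` are hypothetical; the diff itself uses `mutate()` with nested `ifelse()` and `case_when()`, and all amounts below are invented for illustration.

```r
# Hypothetical re-statement of the vetdis fill rules, not the code in pin.R.
# bill_total: vetdis total from the tax bill export (one value per PIN)
# athena:     data frame of per-tier vetdis amounts from Athena
fill_vetdis <- function(bill_total, athena) {
  tot_athena <- unname(rowSums(athena))

  fill_one <- function(athena_col, fallback = FALSE) {
    # On a mismatch, a tier receives the bill total if Athena shows that tier
    # as >0, or if it is the fallback tier and Athena shows no vetdis at all
    use_bill <- bill_total > 0 &
      (athena_col > 0 | (fallback & tot_athena == 0))
    ifelse(
      bill_total == tot_athena,
      athena_col, # totals agree: trust the Athena breakdown
      ifelse(use_bill, bill_total, 0) # totals differ: trust the bill total
    )
  }

  data.frame(
    exe_vet_dis_lt50 = fill_one(athena$lt50),
    exe_vet_dis_50_69 = fill_one(athena$x50_69),
    exe_vet_dis_ge70 = fill_one(athena$ge70, fallback = TRUE),
    exe_vet_dis_100 = fill_one(athena$x100)
  )
}

# Four made-up PINs, one per scenario:
#  1. totals match                      -> use the Athena breakdown
#  2. totals differ, Athena tier is >0  -> bill total goes into that tier
#  3. bill >0, Athena has no vetdis     -> bill total goes into the >=70% tier
#  4. bill is 0, Athena has vetdis      -> all tiers are 0
bill_total <- c(5000, 4000, 6000, 0)
athena <- data.frame(
  lt50 = c(0, 0, 0, 0),
  x50_69 = c(0, 3500, 0, 0),
  ge70 = c(5000, 0, 0, 0),
  x100 = c(0, 0, 0, 7000)
)

filled <- fill_vetdis(bill_total, athena)
print(filled)
```

In this toy data the filled tiers sum back to the tax bill total for every PIN. That property would not hold if Athena reported more than one tier as >0 for a mismatched PIN, a case the code comment in the diff treats as impossible in theory.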

kyrasturgill

This looks great! Thanks for thinking through the veteran's disability logic - I had a couple questions/thoughts but don't have any concrete changes.

jeancochrane added 6 commits December 10, 2025 17:10

Add CPI and equalizer for 2024

7308f3d

WIP parse temp tax roll export for pin

30ea571

WIP compare temp tax roll export to Athena data

190108a

Small tweaks to temp tax roll export comparison

390de82

WIP finalize post-2024 PIN table ELT

a4b4484

Clean up pin.R to prep for review

3435d17

jeancochrane changed the base branch from master to 2024-data-update January 2, 2026 20:51

jeancochrane added 18 commits January 2, 2026 16:55

Appease lintr

ec75f24

Update data sources in README

2da2527

Update pre-commit version

34602cf

Add pipe_consistency_linter to lintr ignores

a3e291d

Update pipe_consistency_linter to match current practice for the repo

1247c26

Fix quadruple whitespace that is no longer allowed by style-files

d4c30ae

Tweak failing tests to print output on failure

50ec63f

Change print statements to cat CSV during test debugging

5ed5e56

Temporarily set up tmate session in test-coverage workflow for debugging

c9bfa62

Upload lookup_agency output to GitHub workflow artifacts for further debugging

6c9ceb8

Instead of uploading lookup_agency output, print debug info

128edb9

Revert "Instead of uploading lookup_agency output, print debug info"

d77e95d

This reverts commit 128edb9.

Write agency summary lookup to CSV for debugging

4e46a3a

Save RDS instead of CSV for debugging

64f97ba

Remove debugging from test-coverage.yaml

27605c8

Change snapshot tests to use expect_snapshot_value

1aed6fe

Add comment for json2 serialization to expect_snapshot_value test

3aff22b

Merge branch jeancochrane/fix-pre-commit into jeancochrane/2024-cpi-equalizer-av-and-exemptions

1413aea

jeancochrane changed the base branch from 2024-data-update to jeancochrane/fix-pre-commit January 12, 2026 16:35

jeancochrane added 2 commits January 13, 2026 16:31

Remove AV/exe analysis between Athena and tax roll export

463e9bf

Appease lintr

6163404

Base automatically changed from jeancochrane/fix-pre-commit to 2024-data-update January 14, 2026 16:17

Add disability levels for 2024 veterans with disabilities in pin.R

c6ecadc

jeancochrane added 3 commits January 20, 2026 16:08

Load geoparquet files individually instead of as a Dataset to resolve issues with geoarrow

8c3c59a

Merge branch '2024-data-update' into jeancochrane/2024-cpi-equalizer-av-and-exemptions

5cae65f

Make sure to include years when pulling PIN geometries from S3

c983235

jeancochrane commented Jan 20, 2026

jeancochrane added 2 commits February 18, 2026 11:47

Tweak pin_exe_vetdis_athena definition to pull from `default.vw_pin_exe_long`

da6d96c

Change geometry logic back to main branch

8cfa25a

jeancochrane commented Feb 18, 2026

jeancochrane marked this pull request as ready for review February 18, 2026 19:20

jeancochrane requested a review from kyrasturgill as a code owner February 18, 2026 19:20

kyrasturgill reviewed Feb 26, 2026

jeancochrane wants to merge 32 commits into 2024-data-update from jeancochrane/2024-cpi-equalizer-av-and-exemptions
