Phase 2.3: Download Wait Logic ✅ Added exponential backoff to app/states/texas/texas_downloader.py: by jreakin · Pull Request #8 · Abstract-Data/campaignfinance-2023

jreakin · 2026-01-28T01:12:59Z

Exponential Backoff for Texas Downloader and Core Module Refactoring

Added exponential backoff to app/states/texas/texas_downloader.py:

Created exponential_backoff() helper function with jitter to prevent synchronized retries
Added 1-hour timeout protection for downloads
Replaced fixed time.sleep() calls with adaptive delays:
- Error recovery: 1s base, 30s max
- Download polling: 2s base, 60s max
Added detailed logging for delay times and elapsed duration

Phase 3: Architecture Refactor ✅

Moved unified modules to app/core/:

Moved 6 unified_*.py files from app/states/ to app/core/
Created app/core/init.py with proper exports
Updated imports in 38 Python files and 6 documentation files
Fixed relative imports in remaining app/states/ files

Verification

All 5 unit tests pass
Core module imports work correctly
No remaining old import paths in the codebase
New code passes linting (pre-existing lint issues in moved files are not related to my changes)

Summary by CodeRabbit

New Features
- Consolidated public API for Campaign Finance Core module, centralizing access to field definitions, unified data models, SQL models, database management, and data loaders.
Improvements
- Enhanced data downloader robustness with exponential backoff retry logic, timeout protection, and improved progress tracking and logging.
Chores
- Internal module reorganization for consistency.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Added exponential backoff to app/states/texas/texas_downloader.py: Created exponential_backoff() helper function with jitter to prevent synchronized retries Added 1-hour timeout protection for downloads Replaced fixed time.sleep() calls with adaptive delays: Error recovery: 1s base, 30s max Download polling: 2s base, 60s max Added detailed logging for delay times and elapsed duration Phase 3: Architecture Refactor ✅ Moved unified modules to app/core/: Moved 6 unified_*.py files from app/states/ to app/core/ Created app/core/__init__.py with proper exports Updated imports in 38 Python files and 6 documentation files Fixed relative imports in remaining app/states/ files Verification All 5 unit tests pass Core module imports work correctly No remaining old import paths in the codebase New code passes linting (pre-existing lint issues in moved files are not related to my changes)

coderabbitai · 2026-01-28T01:13:05Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✅ Full review completed - (🔄 Check again to review again)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

jreakin · 2026-01-28T01:13:18Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

Phase 4: Testing Improvements - Complete ✅ 4.1 Integration Tests for Production Loader Created tests/test_production_loader_integration.py with 27 tests covering: #9
Phase 2.3: Download Wait Logic ✅ Added exponential backoff to app/states/texas/texas_downloader.py: #8 👈 (View in Graphite)
Summary I've completed the implementation of Phase 1 (Quick Wins) from the code review plan. Here's what was accomplished: #7
Added additional data type models #6
feat(cleanup): cleaned up root, setup agent documentation #5
jan25-polars

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copilot

Pull request overview

Adds more resilient download retry/polling behavior for the Texas TEC downloader and refactors the “unified_*” infrastructure into a new app/core/ module with updated imports across code, tests, scripts, and docs.

Changes:

Added exponential backoff + jitter and a 1-hour timeout to Texas download polling/error recovery.
Introduced app/core/ unified modules (models, field library, integration helpers) and exported them via app/core/__init__.py.
Updated imports throughout tests, scripts, and documentation from app.states.unified_* to app.core.unified_*.

Reviewed changes

Copilot reviewed 50 out of 53 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
tests/test_unified_processor.py	Updates unified SQLModels import to `app.core`.
tests/test_unified_models.py	Updates unified (dataclass) models import to `app.core`.
tests/test_unified_loader.py	Updates unified SQLModels import to `app.core`.
tests/test_state_loader.py	Updates unified state loader imports (including docs snippet).
tests/test_sqlmodels.py	Updates unified SQLModels + db_manager imports to `app.core`.
tests/test_single_record.py	Updates unified state loader import to `app.core`.
tests/test_simple_transaction.py	Updates unified SQLModels imports to `app.core` (including inner import).
tests/test_simple_address_dedup.py	Updates unified SQLModels imports to `app.core` (including inner import).
tests/test_session_address_dedup.py	Updates unified SQLModels imports to `app.core` (including inner import).
tests/test_officer_financial_activities.py	Updates unified SQLModels + db_manager imports to `app.core`.
tests/test_limited_load.py	Updates unified state loader import to `app.core`.
tests/test_committee_persons.py	Updates unified SQLModels + db_manager imports to `app.core` (including inner import).
tests/test_committee_creation.py	Updates unified state loader import to `app.core`.
tests/test_automatic_linking.py	Updates unified SQLModels + db_manager imports to `app.core`.
tests/test_address_dedup_fix.py	Updates unified SQLModels imports to `app.core`.
tests/test_address_dedup.py	Updates unified SQLModels import to `app.core`.
tests/simple_address_test.py	Updates unified SQLModels imports to `app.core` (including inner import).
tests/conftest.py	Updates field library fixture import to `app.core`.
scripts/loaders/unified_postgres_loader.py	Updates unified SQLModels import to `app.core`.
scripts/loaders/simple_loader.py	Updates unified SQLModels imports to `app.core` (including inner import).
scripts/loaders/rebuild_and_load_texas.py	Updates unified SQLModels import block to `app.core`.
scripts/loaders/production_loader.py	Updates field library + unified SQLModels imports to `app.core`.
scripts/loaders/optimized_loader.py	Updates unified SQLModels imports to `app.core` (including inner import).
scripts/loaders/load_working_files.py	Updates unified state loader import to `app.core`.
scripts/loaders/load_single_file_test.py	Updates unified state loader import to `app.core`.
scripts/loaders/load_oklahoma_test.py	Updates unified state loader import to `app.core`.
scripts/loaders/load_data_batched.py	Updates unified state loader import to `app.core`.
scripts/loaders/load_all_data_simple.py	Updates unified state loader import to `app.core`.
scripts/loaders/load_actual_data_to_postgres.py	Updates unified SQLModels import to `app.core`.
scripts/loaders/high_performance_loader.py	Updates unified SQLModels imports to `app.core` (including inner import).
scripts/loaders/fixed_loader.py	Updates unified SQLModels imports to `app.core` (including inner import).
scripts/debug/transaction_save.py	Updates unified SQLModels import to `app.core`.
scripts/debug/committee.py	Updates unified SQLModels import to `app.core`.
scripts/debug/address_dedup.py	Updates unified SQLModels imports to `app.core` (including inner import).
scripts/analysis/analyze_fields.py	Updates field library import to `app.core`.
docs/TESTING.md	Updates field library import example to `app.core`.
docs/RUNBOOK.md	Updates field library import example to `app.core`.
docs/DATA_DICTIONARY.md	Updates field library import examples to `app.core`.
app/states/texas/texas_downloader.py	Adds exponential backoff + jitter, timeout, and improved logging during download wait loops.
app/states/postgres_state_loader.py	Updates loader base class import to `app.core.unified_state_loader`.
app/states/postgres_config.py	Updates DB manager import to `app.core.unified_database`.
app/states/field_analyzer.py	Updates unified field library imports to `app.core`.
app/ingest/file_reader.py	Updates unified field library import to `app.core`.
app/core/unified_state_loader.py	Updates internal SQLModels import to `app.core`.
app/core/unified_sqlmodels.py	Updates internal db_manager imports to `app.core`.
app/core/unified_models.py	Adds dataclass-based unified models + builder/processor.
app/core/unified_integration.py	Adds integration utilities/examples for unified models usage.
app/core/unified_field_library.py	Adds unified field definitions and state mapping registry.
app/core/unified_database.py	Updates PostgresConfig import source to `app.states`.
app/core/init.py	Adds central exports for core unified modules.
CONTRIBUTING.md	Updates field library import example to `app.core`.
CLAUDE.md	Updates unified SQLModels import example to `app.core`.
AGENTS.md	Updates db_manager / field_library / unified SQLModels import examples to `app.core`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-28T01:18:04Z

app/states/field_analyzer.py

 from icecream import ic

-from .unified_field_library import UnifiedFieldLibrary, FieldCategory, FieldType, FieldDefinition, StateFieldMapping
+from app.core.unified_field_library import UnifiedFieldLibrary, FieldCategory, FieldType, FieldDefinition, StateFieldMapping


Import of 'FieldCategory' is not used.
Import of 'FieldType' is not used.
Import of 'StateFieldMapping' is not used.

Suggested change

from app.core.unified_field_library import UnifiedFieldLibrary, FieldCategory, FieldType, FieldDefinition, StateFieldMapping

from app.core.unified_field_library import UnifiedFieldLibrary, FieldDefinition

Copilot · 2026-01-28T01:18:04Z

app/ingest/file_reader.py

 import polars as pl

-from app.states.unified_field_library import FieldDefinition, FieldType, StateFieldMapping, field_library
+from app.core.unified_field_library import FieldDefinition, FieldType, StateFieldMapping, field_library


Import of 'FieldDefinition' is not used.

Suggested change

from app.core.unified_field_library import FieldDefinition, FieldType, StateFieldMapping, field_library

from app.core.unified_field_library import FieldType, StateFieldMapping, field_library

Copilot · 2026-01-28T01:18:05Z

tests/test_automatic_linking.py

+from app.core.unified_sqlmodels import (
    unified_sql_processor, UnifiedTransaction, PersonRole, TransactionType,
    CommitteeRole, UnifiedPerson, UnifiedCommittee, UnifiedTransactionPerson
 )


Import of 'unified_sql_processor' is not used.

Copilot · 2026-01-28T01:18:05Z

tests/test_committee_persons.py

+from app.core.unified_sqlmodels import (
    unified_sql_processor, UnifiedTransaction, PersonRole, TransactionType,
    CommitteeRole
 )


Import of 'UnifiedTransaction' is not used.
Import of 'TransactionType' is not used.
Import of 'unified_sql_processor' is not used.
Import of 'PersonRole' is not used.

Copilot · 2026-01-28T01:18:05Z

tests/test_officer_financial_activities.py

+from app.core.unified_sqlmodels import (
    unified_sql_processor, UnifiedTransaction, PersonRole, TransactionType,
    CommitteeRole, UnifiedPerson, UnifiedCommittee, UnifiedTransactionPerson
 )


Import of 'unified_sql_processor' is not used.

Copilot · 2026-01-28T01:18:05Z

tests/test_sqlmodels.py

+from app.core.unified_sqlmodels import (
    unified_sql_processor, UnifiedTransaction, PersonRole, TransactionType
 )


Import of 'UnifiedTransaction' is not used.
Import of 'PersonRole' is not used.

Copilot · 2026-01-28T01:18:06Z

tests/test_unified_models.py

 from pathlib import Path
 from icecream import ic
-from app.states.unified_models import unified_processor, TransactionType, PersonType
+from app.core.unified_models import unified_processor, TransactionType, PersonType


Import of 'TransactionType' is not used.
Import of 'PersonType' is not used.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)

tests/test_simple_transaction.py (1)
25-25: Multiple calls to UnifiedSQLModelBuilder missing required state_id argument.

The UnifiedSQLModelBuilder constructor signature requires state_id as a positional argument: __init__(self, state: str, state_id: Optional[int], state_code: Optional[str] = None). This call only passes state, which will raise TypeError: __init__() missing 1 required positional argument: 'state_id'.

🐛 Affected instances

tests/test_simple_transaction.py:25

tests/test_session_address_dedup.py

tests/test_address_dedup.py

tests/simple_address_test.py

scripts/debug/transaction_save.py

scripts/debug/committee.py

All use: UnifiedSQLModelBuilder("oklahoma")

Pass state_id=None as a keyword argument or provide the state_id value:
-    builder = UnifiedSQLModelBuilder("oklahoma")
+    builder = UnifiedSQLModelBuilder("oklahoma", state_id=None)
scripts/loaders/high_performance_loader.py (1)
106-111: NameError at runtime: UnifiedAddress and select are not in scope.

UnifiedAddress and select are imported inside process_batch_optimized (lines 125-126), but this code block in high_performance_loader executes before that function and references both symbols without importing them at module level.
🐛 Proposed fix: Add missing imports at module level
 from app.core.unified_sqlmodels import unified_sql_processor
+from app.core.unified_sqlmodels import UnifiedAddress
+from sqlalchemy import select
 from app.states.postgres_config import create_postgres_database_manager
scripts/debug/transaction_save.py (1)

24-25: Pass state_id to UnifiedSQLModelBuilder constructor.

The UnifiedSQLModelBuilder.__init__ signature requires state_id: Optional[int] as a mandatory second parameter. Calling it with only "oklahoma" will raise a TypeError at runtime. Provide a value for state_id (e.g., state_id=1) or modify the constructor to make state_id optional with a default value.
scripts/loaders/load_oklahoma_test.py (1)
37-38: Remove setting max_records_per_file and max_files attributes; these do not exist on UnifiedStateLoader.

The UnifiedStateLoader class does not define max_records_per_file or max_files as instance attributes. The load_state_data() method accepts a max_records parameter instead. Pass it directly to the method:
result = loader.load_state_data(
    max_records=1000,
    auto_link_officers=True,
    create_relationships=True,
    progress_callback=...
)
Note: If you need separate per-file and total-file limits, the current API only supports a single max_records limit per file. Consider whether max_records=1000 is sufficient, or if the API needs to be extended.
CLAUDE.md (2)
78-85: Important Files table references stale file paths.

The import example on line 69 correctly uses app.core.unified_sqlmodels, but the Important Files table still references the old locations (app/states/unified_sqlmodels.py and app/states/unified_field_library.py). According to the PR objectives, these files were moved to app/core/.
📝 Suggested fix
 | File | Purpose |
 |------|---------|
-| `app/states/unified_sqlmodels.py` | Unified SQLModel implementations |
-| `app/states/unified_field_library.py` | Cross-state field mapping |
+| `app/core/unified_sqlmodels.py` | Unified SQLModel implementations |
+| `app/core/unified_field_library.py` | Cross-state field mapping |
 | `app/ingest/file_reader.py` | Schema-driven CSV/file parsing |
 | `app/abcs/abc_category.py` | Core data processing ABC |
 | `scripts/loaders/production_loader.py` | Production data loader |
34-48: Architecture overview may need updating.

The Architecture Overview section still shows unified_*.py files under app/states/. If the unified modules have been moved to app/core/ as indicated by the PR objectives, consider updating this section to reflect the new structure by adding an app/core/ entry.
scripts/debug/committee.py (1)

22-27: Constructor call will fail due to missing required state_id argument.

The UnifiedSQLModelBuilder constructor requires both state and state_id as positional arguments: __init__(state: str, state_id: Optional[int], state_code: Optional[str] = None). The call on line 23 passes only the state name, which will raise TypeError: __init__() missing 1 required positional argument: 'state_id' at runtime. Pass the state ID as the second argument to fix this.
tests/test_session_address_dedup.py (1)
50-51: Missing required state_id argument.

UnifiedSQLModelBuilder.__init__ requires state_id as the second positional argument (signature: state: str, state_id: Optional[int], state_code: Optional[str] = None). This call will raise TypeError at runtime.
🐛 Proposed fix
     # Create builder
-    builder = UnifiedSQLModelBuilder("oklahoma")
+    builder = UnifiedSQLModelBuilder("oklahoma", state_id=None)
Or if a specific state ID is needed for Oklahoma:
     # Create builder
-    builder = UnifiedSQLModelBuilder("oklahoma")
+    builder = UnifiedSQLModelBuilder("oklahoma", state_id=40, state_code="OK")

🤖 Fix all issues with AI agents

In `@app/core/__init__.py`:
- Around line 60-100: The __all__ list is not alphabetically sorted (Ruff
RUF022); reorder the entries alphabetically to satisfy static analysis while
preserving the existing logical groupings/comments (e.g., Field Library, Base
Models, SQLModels, Database, Loaders). Update the items such as "field_library",
"FieldCategory", "FieldDefinition", "FieldType", "StateFieldMapping",
"UnifiedFieldLibrary", and the other group members (e.g., "unified_processor",
"BaseUnifiedTransaction", "UnifiedTransaction", "UnifiedPerson", "db_manager",
"UnifiedStateLoader", etc.) so each group is alphabetized; keep the group
headers and formatting intact and ensure no duplicates are introduced.
- Around line 17-25: Add a short clarifying comment above the grouped imports in
app/core/__init__.py explaining that the aliases like UnifiedTransaction as
BaseUnifiedTransaction, TransactionType as BaseTransactionType, PersonType as
BasePersonType, UnifiedPerson as BaseUnifiedPerson, UnifiedAddress as
BaseUnifiedAddress refer to the original dataclass-based/base models (used for
core logic/serialization), while separately defined SQLModel/ORM-backed types
named UnifiedTransaction, UnifiedPerson, etc. exist for DB mapping; mention
unified_processor to indicate which module provides processing logic and note
that the `Base*` prefix denotes the non-ORM/base variants to avoid confusion for
consumers.

In `@app/core/unified_state_loader.py`:
- Around line 348-349: Remove the redundant inline import of UnifiedAddress
inside unified_state_loader.py (the local inline "from
app.core.unified_sqlmodels import UnifiedAddress") since UnifiedAddress is
already imported at module level; keep the sqlalchemy select import if needed
and only remove the duplicated UnifiedAddress import to avoid redundancy.

In `@app/states/texas/texas_downloader.py`:
- Around line 77-78: The initial polling logic that does
time.sleep(exponential_backoff(...)) then breaks if no .crdownload exists should
be changed to actively wait (with exponential_backoff) until either a new zip
file appears or a .crdownload (in-progress) file is observed, while still
enforcing the existing timeout; update the code around exponential_backoff(),
the loop that checks download_dir / target zip (e.g., any function or block
referencing download_dir, target_zip_name or find_latest_zip()), and the
start_time/timeout_seconds checks so the loop only exits on timeout or when the
zip is actually ready, and apply the same fix to the analogous polling block
referenced later (lines 93-118).

In `@scripts/loaders/high_performance_loader.py`:
- Around line 125-126: Remove the redundant in-function imports that duplicate
module-level imports: delete the local imports of UnifiedAddress and
UnifiedCommittee from app.core.unified_sqlmodels and the select import from
sqlalchemy (the ones inside the function) and rely on the existing top-level
imports instead; ensure no other references rely on those local import lines and
run tests/style checks to confirm cleanup.

In `@scripts/loaders/optimized_loader.py`:
- Around line 126-127: The imports for UnifiedAddress and select are inside the
inner loop of process_batch causing repeated import lookup; move "from
app.core.unified_sqlmodels import UnifiedAddress" and "from sqlalchemy import
select" out of the inner loop to the top of the module or top of process_batch,
then remove the in-loop import statements so the loop directly references
UnifiedAddress and select.

In `@tests/test_session_address_dedup.py`:
- Around line 68-69: Move the two imports "from app.core.unified_sqlmodels
import UnifiedAddress" and "from sqlalchemy import select" out of the
loop/function body and add them at the module top with the other imports; then
remove the in-function duplicates (the imports currently inside the loop and the
later duplicated import in that test function) so the test uses the top-level
UnifiedAddress and select references. Ensure there are no circular-import issues
when relocating them.

coderabbitai · 2026-01-28T01:20:34Z

app/core/__init__.py

+from .unified_models import (
+    unified_processor,
+    UnifiedTransaction as BaseUnifiedTransaction,
+    TransactionType as BaseTransactionType,
+    PersonType as BasePersonType,
+    UnifiedPerson as BaseUnifiedPerson,
+    UnifiedAddress as BaseUnifiedAddress,
+    UnifiedDataProcessor,
+)


🧹 Nitpick | 🔵 Trivial

Consider documenting the distinction between Base* and SQLModel types.

The aliasing pattern (UnifiedTransaction as BaseUnifiedTransaction, etc.) distinguishes dataclass-based models from SQLModel-based models. This is a reasonable approach, but consumers may find it confusing that BaseUnifiedTransaction and UnifiedTransaction are different classes. Consider adding a brief comment explaining this distinction.

📝 Proposed documentation

+# Base models are dataclass-based (from unified_models), used for in-memory processing +# SQLModel-based equivalents (from unified_sqlmodels) are for database persistence from .unified_models import ( unified_processor, UnifiedTransaction as BaseUnifiedTransaction, TransactionType as BaseTransactionType, PersonType as BasePersonType, UnifiedPerson as BaseUnifiedPerson, UnifiedAddress as BaseUnifiedAddress, UnifiedDataProcessor, )

🤖 Prompt for AI Agents

In `@app/core/__init__.py` around lines 17 - 25, Add a short clarifying comment above the grouped imports in app/core/__init__.py explaining that the aliases like UnifiedTransaction as BaseUnifiedTransaction, TransactionType as BaseTransactionType, PersonType as BasePersonType, UnifiedPerson as BaseUnifiedPerson, UnifiedAddress as BaseUnifiedAddress refer to the original dataclass-based/base models (used for core logic/serialization), while separately defined SQLModel/ORM-backed types named UnifiedTransaction, UnifiedPerson, etc. exist for DB mapping; mention unified_processor to indicate which module provides processing logic and note that the `Base*` prefix denotes the non-ORM/base variants to avoid confusion for consumers.

coderabbitai · 2026-01-28T01:20:34Z

app/core/__init__.py

+__all__ = [
+    # Field Library
+    "field_library",
+    "FieldCategory",
+    "FieldType",
+    "FieldDefinition",
+    "StateFieldMapping",
+    "UnifiedFieldLibrary",
+    # Base Models (dataclass-based)
+    "unified_processor",
+    "BaseUnifiedTransaction",
+    "BaseTransactionType",
+    "BasePersonType",
+    "BaseUnifiedPerson",
+    "BaseUnifiedAddress",
+    "UnifiedDataProcessor",
+    # SQLModels
+    "unified_sql_processor",
+    "UnifiedTransaction",
+    "UnifiedPerson",
+    "UnifiedAddress",
+    "UnifiedCommittee",
+    "UnifiedTransactionPerson",
+    "UnifiedCommitteePerson",
+    "UnifiedTransactionVersion",
+    "UnifiedPersonVersion",
+    "UnifiedAddressVersion",
+    "UnifiedCommitteeVersion",
+    "UnifiedCommitteePersonVersion",
+    "TransactionType",
+    "PersonType",
+    "PersonRole",
+    "CommitteeRole",
+    "State",
+    # Database
+    "db_manager",
+    "UnifiedDatabaseManager",
+    # Loaders
+    "UnifiedStateLoader",
+    "UnifiedStateProcessor",
+]


🧹 Nitpick | 🔵 Trivial

Consider sorting __all__ for consistency.

Static analysis (Ruff RUF022) flags that __all__ is not sorted. While the current grouping by category is logical and readable, sorting alphabetically within each category or applying isort-style sorting can improve maintainability and reduce merge conflicts.

🧰 Tools

🪛 Ruff (0.14.14)

60-100: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

🤖 Prompt for AI Agents

In `@app/core/__init__.py` around lines 60 - 100, The __all__ list is not alphabetically sorted (Ruff RUF022); reorder the entries alphabetically to satisfy static analysis while preserving the existing logical groupings/comments (e.g., Field Library, Base Models, SQLModels, Database, Loaders). Update the items such as "field_library", "FieldCategory", "FieldDefinition", "FieldType", "StateFieldMapping", "UnifiedFieldLibrary", and the other group members (e.g., "unified_processor", "BaseUnifiedTransaction", "UnifiedTransaction", "UnifiedPerson", "db_manager", "UnifiedStateLoader", etc.) so each group is alphabetized; keep the group headers and formatting intact and ensure no duplicates are introduced.

coderabbitai · 2026-01-28T01:20:34Z

app/core/unified_state_loader.py

+                        from app.core.unified_sqlmodels import UnifiedAddress
                        from sqlalchemy import select


🧹 Nitpick | 🔵 Trivial

Redundant import: UnifiedAddress is already imported at module level.

UnifiedAddress is already imported at the top of the file (line 17) from .unified_sqlmodels. This inline import is unnecessary and can be removed.

♻️ Suggested fix

# Check if address already exists in session - from app.core.unified_sqlmodels import UnifiedAddress from sqlalchemy import select

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from app.core.unified_sqlmodels import UnifiedAddress

from sqlalchemy import select

from sqlalchemy import select

🤖 Prompt for AI Agents

In `@app/core/unified_state_loader.py` around lines 348 - 349, Remove the redundant inline import of UnifiedAddress inside unified_state_loader.py (the local inline "from app.core.unified_sqlmodels import UnifiedAddress") since UnifiedAddress is already imported at module level; keep the sqlalchemy select import if needed and only remove the duplicated UnifiedAddress import to avoid redundancy.

coderabbitai · 2026-01-28T01:20:34Z

app/states/texas/texas_downloader.py

+        # Initial wait for download to start
+        time.sleep(exponential_backoff(0, base=2.0, max_delay=10.0))


⚠️ Potential issue | 🟠 Major

Avoid exiting the polling loop before the download actually starts.

If no .crdownload exists after the initial sleep, the loop breaks immediately, bypassing the timeout and potentially using a stale zip or raising FileNotFoundError under slow starts. Consider waiting until either a new zip appears or a .crdownload is observed, with backoff and the existing timeout.

🛠️ Proposed fix (wait for download to start or finish)

- # Initial wait for download to start - time.sleep(exponential_backoff(0, base=2.0, max_delay=10.0)) + download_start_time = time.time() + # Initial wait for download to start + time.sleep(exponential_backoff(0, base=2.0, max_delay=10.0)) @@ - poll_attempt = 0 - download_start_time = time.time() + poll_attempt = 0 while True: # Check for timeout elapsed = time.time() - download_start_time if elapsed > DOWNLOAD_TIMEOUT_SECONDS: raise TimeoutError(f"Download timed out after {DOWNLOAD_TIMEOUT_SECONDS // 60} minutes") dl_files = list(tmp.glob("*.crdownload")) + zip_files = list(tmp.glob("*.zip")) + latest_zip = max(zip_files, key=lambda f: f.stat().st_mtime) if zip_files else None if dl_files: if not in_progress: progress.update_task(_task2, "Still in progress") in_progress = True # Use exponential backoff for polling (2s to 60s) poll_delay = exponential_backoff(poll_attempt, base=2.0, max_delay=60.0) poll_attempt = min(poll_attempt + 1, 10) # Cap attempts to limit max delay logger.debug("Download in progress, waiting", extra={"delay_seconds": round(poll_delay, 2), "elapsed_minutes": round(elapsed / 60, 1)}) time.sleep(poll_delay) - else: - if in_progress: - progress.update_task(_task2, "Download Complete") - logger.info("Download completed", extra={"elapsed_minutes": round(elapsed / 60, 1)}) - _safe_to_delete = True - break + elif latest_zip and latest_zip.stat().st_mtime >= download_start_time: + progress.update_task(_task2, "Download Complete") + logger.info("Download completed", extra={"elapsed_minutes": round(elapsed / 60, 1)}) + _safe_to_delete = True + break + else: + poll_delay = exponential_backoff(poll_attempt, base=2.0, max_delay=10.0) + poll_attempt = min(poll_attempt + 1, 10) + logger.debug( + "Waiting for download to start", + extra={"delay_seconds": round(poll_delay, 2), "elapsed_minutes": round(elapsed / 60, 1)}, + ) + time.sleep(poll_delay)

Also applies to: 93-118

🤖 Prompt for AI Agents

In `@app/states/texas/texas_downloader.py` around lines 77 - 78, The initial polling logic that does time.sleep(exponential_backoff(...)) then breaks if no .crdownload exists should be changed to actively wait (with exponential_backoff) until either a new zip file appears or a .crdownload (in-progress) file is observed, while still enforcing the existing timeout; update the code around exponential_backoff(), the loop that checks download_dir / target zip (e.g., any function or block referencing download_dir, target_zip_name or find_latest_zip()), and the start_time/timeout_seconds checks so the loop only exits on timeout or when the zip is actually ready, and apply the same fix to the analogous polling block referenced later (lines 93-118).

coderabbitai · 2026-01-28T01:20:34Z

scripts/loaders/high_performance_loader.py

+        from app.core.unified_sqlmodels import UnifiedAddress, UnifiedCommittee
        from sqlalchemy import select


🧹 Nitpick | 🔵 Trivial

Import path update is correct, but imports are duplicated.

The import path change to app.core.unified_sqlmodels is consistent with the refactor. However, once the module-level imports are added (as suggested above), these in-function imports become redundant and can be removed for cleaner code.

🤖 Prompt for AI Agents

In `@scripts/loaders/high_performance_loader.py` around lines 125 - 126, Remove the redundant in-function imports that duplicate module-level imports: delete the local imports of UnifiedAddress and UnifiedCommittee from app.core.unified_sqlmodels and the select import from sqlalchemy (the ones inside the function) and rely on the existing top-level imports instead; ensure no other references rely on those local import lines and run tests/style checks to confirm cleanup.

coderabbitai · 2026-01-28T01:20:34Z

scripts/loaders/optimized_loader.py

+                            from app.core.unified_sqlmodels import UnifiedAddress
                            from sqlalchemy import select


🧹 Nitpick | 🔵 Trivial

Move imports outside the loop to avoid repeated import overhead.

The imports for UnifiedAddress and select are inside the inner loop of process_batch. While Python caches imports, placing them at the function or module level improves clarity and avoids unnecessary lookup overhead on each iteration.

♻️ Proposed fix

Move imports to the top of the function:

def process_batch(records, db_manager, progress, record_task): """Process a batch of records""" + from app.core.unified_sqlmodels import UnifiedAddress + from sqlalchemy import select + batch_success = 0 batch_errors = 0 with db_manager.get_session() as session: for i, record in enumerate(records): try: # Use unified processor transaction = unified_sql_processor.process_record(record, "oklahoma") if transaction: # Save committee first if it exists if transaction.committee: session.merge(transaction.committee) session.flush() # Save addresses for all persons with deduplication for tx_person in transaction.persons: if tx_person.person and tx_person.person.address: # Check if address already exists in database - from app.core.unified_sqlmodels import UnifiedAddress - from sqlalchemy import select existing_address = session.exec(

🤖 Prompt for AI Agents

In `@scripts/loaders/optimized_loader.py` around lines 126 - 127, The imports for UnifiedAddress and select are inside the inner loop of process_batch causing repeated import lookup; move "from app.core.unified_sqlmodels import UnifiedAddress" and "from sqlalchemy import select" out of the inner loop to the top of the module or top of process_batch, then remove the in-loop import statements so the loop directly references UnifiedAddress and select.

coderabbitai · 2026-01-28T01:20:35Z

tests/test_session_address_dedup.py

+                    from app.core.unified_sqlmodels import UnifiedAddress
                    from sqlalchemy import select


🧹 Nitpick | 🔵 Trivial

Move imports to module level.

The import path update is correct, but placing imports inside the loop body is a code smell. Move these to the top of the file alongside other imports for better readability and conventional structure.

♻️ Suggested refactor

At the top of the file:

from app.core.unified_sqlmodels import UnifiedSQLModelBuilder +from app.core.unified_sqlmodels import UnifiedAddress from app.states.postgres_config import create_postgres_database_manager +from sqlalchemy import select, text

Then remove lines 68-69 and 96 from inside the function.

🤖 Prompt for AI Agents

In `@tests/test_session_address_dedup.py` around lines 68 - 69, Move the two imports "from app.core.unified_sqlmodels import UnifiedAddress" and "from sqlalchemy import select" out of the loop/function body and add them at the module top with the other imports; then remove the in-function duplicates (the imports currently inside the loop and the later duplicated import in that test function) so the test uses the top-level UnifiedAddress and select references. Ensure there are no circular-import issues when relocating them.

sentry · 2026-01-28T01:24:14Z

✅ All tests passed.

jreakin mentioned this pull request Jan 28, 2026

feat(cleanup): cleaned up root, setup agent documentation #5

Open

jreakin marked this pull request as ready for review January 28, 2026 01:13

jreakin mentioned this pull request Jan 28, 2026

Added additional data type models #6

Open

Copilot AI review requested due to automatic review settings January 28, 2026 01:13

jreakin mentioned this pull request Jan 28, 2026

Summary I've completed the implementation of Phase 1 (Quick Wins) from the code review plan. Here's what was accomplished: #7

Open

Copilot started reviewing on behalf of jreakin January 28, 2026 01:13 View session

Copilot AI reviewed Jan 28, 2026

View reviewed changes

jreakin mentioned this pull request Jan 28, 2026

Phase 4: Testing Improvements - Complete ✅ 4.1 Integration Tests for Production Loader Created tests/test_production_loader_integration.py with 27 tests covering: #9

Open

coderabbitai bot reviewed Jan 28, 2026

View reviewed changes

	from app.core.unified_field_library import UnifiedFieldLibrary, FieldCategory, FieldType, FieldDefinition, StateFieldMapping
	from app.core.unified_field_library import UnifiedFieldLibrary, FieldDefinition

		from app.core.unified_sqlmodels import UnifiedAddress
		from sqlalchemy import select

		# Initial wait for download to start
		time.sleep(exponential_backoff(0, base=2.0, max_delay=10.0))

Conversation

jreakin commented Jan 28, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Exponential Backoff for Texas Downloader and Core Module Refactoring

Added exponential backoff to app/states/texas/texas_downloader.py:

Phase 3: Architecture Refactor ✅

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

jreakin commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

sentry bot commented Jan 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jreakin commented Jan 28, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 28, 2026 •

edited

Loading

jreakin commented Jan 28, 2026 •

edited

Loading