Add snowflake-cortex-code-nhtsa-safety-intelligence connector #561
Agent-Driven Discovery pattern built with Snowflake Cortex Code: syncs NHTSA recalls, complaints, and vehicle specs, then uses Snowflake Cortex COMPLETE to analyze safety patterns and autonomously discover related vehicles during Fivetran ingestion.

- 5 tables: `recalls`, `complaints`, `vehicle_specs`, `discovery_insights`, `safety_analysis`
- Two Cortex COMPLETE calls per sync: discovery analysis + cross-vehicle synthesis
- PAT token authentication for Snowflake Cortex REST API
- Resilient per-vehicle error handling for agent-recommended fetches
- SDK v2+ compliant (no `yield`, required comments, template docstrings)
Pull request overview
Adds a new AI tutorial connector under all_things_ai/tutorials/snowflake-cortex-code-nhtsa-safety-intelligence/ demonstrating “Agent-Driven Discovery” by syncing NHTSA vehicle safety data and optionally using Snowflake Cortex COMPLETE to recommend and synthesize additional vehicle investigations.
Changes:
- Added new tutorial connector (Python) that syncs NHTSA recalls/complaints/specs and optionally calls Cortex COMPLETE for discovery + synthesis.
- Added tutorial documentation (README) and a `configuration.json` for the new connector.
- Updated the repo root `README.md` to include the new tutorial link.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| README.md | Adds the new tutorial connector link to the AI tutorials list. |
| all_things_ai/tutorials/snowflake-cortex-code-nhtsa-safety-intelligence/README.md | Documents setup, configuration, behavior, and tables for the new connector. |
| all_things_ai/tutorials/snowflake-cortex-code-nhtsa-safety-intelligence/connector.py | Implements the 3-phase NHTSA ingestion + optional Cortex-driven discovery and synthesis pipeline. |
| all_things_ai/tutorials/snowflake-cortex-code-nhtsa-safety-intelligence/configuration.json | Provides example configuration for running the connector. |
```json
"seed_make": "Toyota",
"seed_model": "Tundra",
"seed_year": "2022",
"discovery_depth": "2",
"max_discoveries": "3",
"enable_cortex": "true",
"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
"snowflake_pat_token": "<your-snowflake-pat-token>",
"cortex_model": "llama3.3-70b",
"max_enrichments": "10"
```
configuration.json values should all be angle-bracket placeholders (per repo configuration guidelines). Right now seed_make, seed_model, and seed_year are real example values, which violates the required placeholder format and encourages committing non-placeholder configs.
Suggested change:

```diff
-"seed_make": "Toyota",
-"seed_model": "Tundra",
-"seed_year": "2022",
-"discovery_depth": "2",
-"max_discoveries": "3",
-"enable_cortex": "true",
-"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
-"snowflake_pat_token": "<your-snowflake-pat-token>",
-"cortex_model": "llama3.3-70b",
-"max_enrichments": "10"
+"seed_make": "<YOUR_VEHICLE_MAKE>",
+"seed_model": "<YOUR_VEHICLE_MODEL>",
+"seed_year": "<YOUR_VEHICLE_YEAR>",
+"discovery_depth": "<DISCOVERY_DEPTH>",
+"max_discoveries": "<MAX_DISCOVERIES>",
+"enable_cortex": "<ENABLE_CORTEX_TRUE_OR_FALSE>",
+"snowflake_account": "<YOUR_SNOWFLAKE_ACCOUNT_HOSTNAME>",
+"snowflake_pat_token": "<YOUR_SNOWFLAKE_PAT_TOKEN>",
+"cortex_model": "<YOUR_CORTEX_MODEL_NAME>",
+"max_enrichments": "<MAX_ENRICHMENTS>"
```
Fixed in commit cffbcd94 — all configuration values now use angle-bracket placeholders including seed_make, seed_model, seed_year, and all numeric/boolean defaults.
```json
"seed_make": "Toyota",
"seed_model": "Tundra",
"seed_year": "2022",
"discovery_depth": "2",
"max_discoveries": "3",
"enable_cortex": "true",
"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
"snowflake_pat_token": "<your-snowflake-pat-token>",
"cortex_model": "llama3.3-70b",
"max_enrichments": "10"
```
configuration.json values like discovery_depth, max_discoveries, enable_cortex, cortex_model, and max_enrichments should also be expressed as descriptive angle-bracket placeholders (not concrete defaults like "2"/"true"/"llama3.3-70b"). This file should be safe to commit without implying recommended production settings.
Suggested change:

```diff
-"seed_make": "Toyota",
-"seed_model": "Tundra",
-"seed_year": "2022",
-"discovery_depth": "2",
-"max_discoveries": "3",
-"enable_cortex": "true",
-"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
-"snowflake_pat_token": "<your-snowflake-pat-token>",
-"cortex_model": "llama3.3-70b",
-"max_enrichments": "10"
+"seed_make": "<SEED_VEHICLE_MAKE>",
+"seed_model": "<SEED_VEHICLE_MODEL>",
+"seed_year": "<SEED_VEHICLE_YEAR>",
+"discovery_depth": "<DISCOVERY_DEPTH>",
+"max_discoveries": "<MAX_DISCOVERIES>",
+"enable_cortex": "<ENABLE_CORTEX_TRUE_OR_FALSE>",
+"snowflake_account": "<YOUR_SNOWFLAKE_ACCOUNT>.snowflakecomputing.com",
+"snowflake_pat_token": "<YOUR_SNOWFLAKE_PAT_TOKEN>",
+"cortex_model": "<CORTEX_MODEL_NAME>",
+"max_enrichments": "<MAX_ENRICHMENTS>"
```
Fixed in commit cffbcd94 — discovery_depth, max_discoveries, enable_cortex, cortex_model, and max_enrichments all use descriptive angle-bracket placeholders now.
```json
"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
"snowflake_pat_token": "<your-snowflake-pat-token>",
```
snowflake_account placeholder should be a single fully-enclosed angle-bracket value (e.g., <YOUR_SNOWFLAKE_ACCOUNT_HOSTNAME>), not a mix of placeholder + literal suffix (<your-snowflake-account>.snowflakecomputing.com). Also prefer consistent, descriptive uppercase placeholders (and similarly for snowflake_pat_token).
Suggested change:

```diff
-"snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
-"snowflake_pat_token": "<your-snowflake-pat-token>",
+"snowflake_account": "<YOUR_SNOWFLAKE_ACCOUNT_HOSTNAME>",
+"snowflake_pat_token": "<YOUR_SNOWFLAKE_PAT_TOKEN>",
```
Fixed in commit cffbcd94 — now uses single enclosed placeholder <YOUR_SNOWFLAKE_ACCOUNT_HOSTNAME>.
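The placeholder rule is easy to lint for mechanically. A minimal sketch, assuming the repo convention is "a value is a placeholder only if it is a single, fully-enclosed, uppercase angle-bracket token" (the helper name and sample config are illustrative, not part of the connector):

```python
import re

# Assumed convention: one fully-enclosed uppercase token like <YOUR_SNOWFLAKE_PAT_TOKEN>.
PLACEHOLDER_RE = re.compile(r"^<[A-Z0-9_]+>$")

def non_placeholder_keys(config):
    """Return the keys whose values do not follow the placeholder convention."""
    return [key for key, value in config.items() if not PLACEHOLDER_RE.match(str(value))]

sample = {
    "snowflake_account": "<your-snowflake-account>.snowflakecomputing.com",
    "snowflake_pat_token": "<YOUR_SNOWFLAKE_PAT_TOKEN>",
}
# The mixed placeholder + literal suffix fails; the fully enclosed token passes.
print(non_placeholder_keys(sample))  # → ['snowflake_account']
```

A check like this could run in CI to keep committed `configuration.json` files free of concrete values.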
```python
        configuration: a dictionary that holds the configuration settings for the connector.
    """
    return [
        {"table": "recalls", "primary_key": ["nhtsa_campaign_number", "make", "model"]},
```
The recalls table primary key omits model_year. Since the connector can fetch the same make/model across multiple years (seed + discovered vehicles), the same nhtsa_campaign_number can apply to multiple model years and would overwrite prior rows. Consider including model_year (or another per-vehicle identifier) in the primary key to preserve per-year recall applicability.
Suggested change:

```diff
-        {"table": "recalls", "primary_key": ["nhtsa_campaign_number", "make", "model"]},
+        {"table": "recalls", "primary_key": ["nhtsa_campaign_number", "make", "model", "model_year"]},
```
Fixed in commit cffbcd94 — primary key is now ["nhtsa_campaign_number", "make", "model", "model_year"] to prevent cross-year overwrites when the same campaign spans multiple model years.
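The effect of widening the key can be shown with a toy upsert model (plain dicts stand in for the destination table; field names mirror the connector's schema, the campaign number is made up):

```python
def last_write_wins(rows, key_fields):
    """Simulate upsert semantics: keep only the last row seen per primary-key tuple."""
    table = {}
    for row in rows:
        table[tuple(row[f] for f in key_fields)] = row
    return list(table.values())

rows = [
    {"nhtsa_campaign_number": "22V001", "make": "TOYOTA", "model": "TUNDRA", "model_year": "2021"},
    {"nhtsa_campaign_number": "22V001", "make": "TOYOTA", "model": "TUNDRA", "model_year": "2022"},
]

# Without model_year the 2022 row silently overwrites the 2021 row.
print(len(last_write_wins(rows, ["nhtsa_campaign_number", "make", "model"])))                # → 1
# With model_year both per-year rows survive.
print(len(last_write_wins(rows, ["nhtsa_campaign_number", "make", "model", "model_year"])))  # → 2
```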
````python
    try:
        response = requests.post(url, headers=headers, json=payload, timeout=timeout, stream=True)
        response.raise_for_status()

        agent_response = ""
        for line in response.iter_lines():
            if line:
                line_text = line.decode("utf-8")
                if line_text.startswith("data: "):
                    try:
                        data = json.loads(line_text[6:])
                        if not isinstance(data, dict):
                            continue

                        choices = data.get("choices", [])
                        for choice in choices:
                            delta = choice.get("delta", {})
                            content = delta.get("content", "")
                            if content:
                                agent_response += content

                    except json.JSONDecodeError:
                        continue

                elif line_text.startswith("event: done"):
                    break

        if not agent_response:
            log.warning("Empty response from Cortex Agent")
            return None

        # Parse the JSON response, handling potential markdown fences
        cleaned = agent_response.strip()
        if cleaned.startswith("```"):
            # Remove markdown code fences
            lines = cleaned.split("\n")
            lines = [ln for ln in lines if not ln.strip().startswith("```")]
            cleaned = "\n".join(lines).strip()

        return json.loads(cleaned)

    except requests.exceptions.Timeout:
        log.warning(f"Cortex Agent timeout after {timeout}s")
        return None
    except json.JSONDecodeError as e:
        log.warning(f"Failed to parse Cortex Agent response as JSON: {e}")
        log.warning(f"Raw response: {agent_response[:500]}")
        return None
    except Exception as e:
        log.warning(f"Cortex Agent error: {e}")
        return None
````
call_cortex_agent() makes a single requests.post(...) call with no retry/backoff handling for transient failures (429/5xx, connection resets, etc.). For resiliency (and to match the retry logic used for NHTSA calls), add bounded exponential backoff retries for retryable status codes and requests transient exceptions, and fail fast on auth errors (401/403).
Suggested change (the removed lines match the single-attempt `try`/`except` block quoted above; replacement below):

````python
    agent_response = ""
    for attempt in range(__MAX_RETRIES):
        try:
            response = requests.post(url, headers=headers, json=payload, timeout=timeout, stream=True)
            if response.status_code in (401, 403):
                log.warning(f"Cortex Agent authentication failed with status {response.status_code}")
                return None
            if response.status_code in __RETRYABLE_STATUS_CODES:
                if attempt == __MAX_RETRIES - 1:
                    log.warning(
                        f"Cortex Agent request failed with retryable status {response.status_code} "
                        f"after {__MAX_RETRIES} attempts"
                    )
                    return None
                delay_seconds = min(60, __BASE_DELAY_SECONDS * (2 ** attempt))
                log.warning(
                    f"Cortex Agent request returned retryable status {response.status_code}; "
                    f"retrying in {delay_seconds}s (attempt {attempt + 1}/{__MAX_RETRIES})"
                )
                time.sleep(delay_seconds)
                continue
            response.raise_for_status()
            agent_response = ""
            for line in response.iter_lines():
                if line:
                    line_text = line.decode("utf-8")
                    if line_text.startswith("data: "):
                        try:
                            data = json.loads(line_text[6:])
                            if not isinstance(data, dict):
                                continue
                            choices = data.get("choices", [])
                            for choice in choices:
                                delta = choice.get("delta", {})
                                content = delta.get("content", "")
                                if content:
                                    agent_response += content
                        except json.JSONDecodeError:
                            continue
                    elif line_text.startswith("event: done"):
                        break
            if not agent_response:
                log.warning("Empty response from Cortex Agent")
                return None
            # Parse the JSON response, handling potential markdown fences
            cleaned = agent_response.strip()
            if cleaned.startswith("```"):
                # Remove markdown code fences
                lines = cleaned.split("\n")
                lines = [ln for ln in lines if not ln.strip().startswith("```")]
                cleaned = "\n".join(lines).strip()
            return json.loads(cleaned)
        except requests.exceptions.Timeout:
            if attempt == __MAX_RETRIES - 1:
                log.warning(f"Cortex Agent timeout after {timeout}s on final attempt")
                return None
            delay_seconds = min(60, __BASE_DELAY_SECONDS * (2 ** attempt))
            log.warning(
                f"Cortex Agent timeout after {timeout}s; retrying in {delay_seconds}s "
                f"(attempt {attempt + 1}/{__MAX_RETRIES})"
            )
            time.sleep(delay_seconds)
        except requests.exceptions.ConnectionError as e:
            if attempt == __MAX_RETRIES - 1:
                log.warning(f"Cortex Agent connection error after {__MAX_RETRIES} attempts: {e}")
                return None
            delay_seconds = min(60, __BASE_DELAY_SECONDS * (2 ** attempt))
            log.warning(
                f"Cortex Agent connection error: {e}; retrying in {delay_seconds}s "
                f"(attempt {attempt + 1}/{__MAX_RETRIES})"
            )
            time.sleep(delay_seconds)
        except requests.exceptions.HTTPError as e:
            status_code = e.response.status_code if e.response is not None else None
            log.warning(f"Cortex Agent HTTP error {status_code}: {e}")
            return None
        except requests.exceptions.RequestException as e:
            if attempt == __MAX_RETRIES - 1:
                log.warning(f"Cortex Agent request failed after {__MAX_RETRIES} attempts: {e}")
                return None
            delay_seconds = min(60, __BASE_DELAY_SECONDS * (2 ** attempt))
            log.warning(
                f"Cortex Agent transient request error: {e}; retrying in {delay_seconds}s "
                f"(attempt {attempt + 1}/{__MAX_RETRIES})"
            )
            time.sleep(delay_seconds)
        except json.JSONDecodeError as e:
            log.warning(f"Failed to parse Cortex Agent response as JSON: {e}")
            log.warning(f"Raw response: {agent_response[:500]}")
            return None
    return None
````
Fixed in commit cffbcd94 — call_cortex_agent() now has a bounded retry loop (3 attempts) with exponential backoff for retryable status codes (429, 500, 502, 503, 504), connection errors, and timeouts. Matches the retry pattern used for NHTSA API calls.
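The schedule used by the fix (base delay doubled per attempt, capped at 60s) factors into a one-line helper. The 2-second base below is an assumption for illustration, since the connector's `__BASE_DELAY_SECONDS` value isn't shown in this excerpt:

```python
def backoff_delay(attempt, base_delay_seconds=2, cap_seconds=60):
    """Exponential backoff delay for a 0-indexed attempt, capped at cap_seconds."""
    return min(cap_seconds, base_delay_seconds * (2 ** attempt))

# With a 2s base the schedule is 2, 4, 8, 16, 32, then capped at 60.
print([backoff_delay(a) for a in range(6)])  # → [2, 4, 8, 16, 32, 60]
```

Capping keeps a long retry run from sleeping for minutes per attempt; adding random jitter on top is a common further refinement when many syncs hit the same endpoint.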
```python
    except Exception as e:
        log.warning(f"Cortex Agent error: {e}")
```
The final except Exception as e: in call_cortex_agent() will also swallow non-network programming errors (e.g., unexpected TypeError), making failures hard to diagnose. Prefer catching requests.exceptions.RequestException (and ValueError/json.JSONDecodeError as needed) and letting unexpected exceptions propagate or be logged as log.severe and re-raised.
Suggested change:

```diff
-    except Exception as e:
-        log.warning(f"Cortex Agent error: {e}")
+    except requests.exceptions.RequestException as e:
+        log.warning(f"Cortex Agent request failed: {e}")
```
Fixed in commit cffbcd94 — removed generic except Exception. Now catches requests.exceptions.HTTPError, ConnectionError, Timeout, RequestException, and json.JSONDecodeError separately with appropriate handling for each. Unexpected errors will propagate.
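The ordering of those `except` clauses matters because `requests` groups its errors into one hierarchy; a quick check (requires the `requests` package):

```python
import requests

# Timeout, ConnectionError, and HTTPError all derive from RequestException,
# so the broad RequestException handler must come after the specific ones or
# it will shadow them.
for exc in (requests.exceptions.Timeout,
            requests.exceptions.ConnectionError,
            requests.exceptions.HTTPError):
    print(exc.__name__, issubclass(exc, requests.exceptions.RequestException))
# → Timeout True
# → ConnectionError True
# → HTTPError True
```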
```python
    # Fetch data for agent-recommended vehicles
    recommended = discovery_result.get("recommended_vehicles", [])
    vehicles_to_fetch = recommended[:max_discoveries]

    log.info(f"Agent recommended {len(recommended)} vehicles, fetching {len(vehicles_to_fetch)}")

    for vehicle in vehicles_to_fetch:
        v_make = vehicle.get("make", "")
```
recommended_vehicles is assumed to be a list (sliced and iterated) but comes from an LLM response, so it can easily be the wrong type (string/dict/null) and crash the sync. Add a type check (ensure list of dicts) before slicing/looping; otherwise treat as empty and skip discovery fetches.
Fixed in commit cffbcd94 — recommended_vehicles is now validated as a list before iteration, and each element is validated as a dict before accessing keys. Returns early with a warning if the LLM returns an unexpected type.
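A guard of the kind the fix describes can be sketched as follows (the function name is illustrative, not the connector's actual helper):

```python
def safe_recommended_vehicles(discovery_result, max_discoveries):
    """Return a bounded list of dict recommendations, tolerating malformed LLM output."""
    if not isinstance(discovery_result, dict):
        return []
    recommended = discovery_result.get("recommended_vehicles")
    if not isinstance(recommended, list):
        return []
    # Drop non-dict elements before slicing to the discovery budget.
    return [v for v in recommended if isinstance(v, dict)][:max_discoveries]

print(safe_recommended_vehicles({"recommended_vehicles": "Ford F-150"}, 3))     # → []
print(safe_recommended_vehicles(None, 3))                                       # → []
print(safe_recommended_vehicles(
    {"recommended_vehicles": [{"make": "Ford"}, "junk", {"make": "Ram"}]}, 1))  # → [{'make': 'Ford'}]
```

Treating unexpected shapes as "no recommendations" keeps a malformed LLM response from failing the whole sync while still logging nothing misleading.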
```python
    all_recalls = list(seed_recalls)
    all_complaints = list(seed_complaints)
    vehicles_investigated = [(seed_make, seed_model, seed_year)]
```
run_discovery_phase() materializes all_recalls/all_complaints as full in-memory lists and keeps extending them for discovered vehicles, but the synthesis prompt only needs aggregate counts/component frequencies. This can grow large (complaints especially) and is avoidable—consider tracking aggregates instead of storing every raw record in memory.
Fixed in commit cffbcd94 — replaced all_recalls/all_complaints in-memory lists with recall_components/complaint_components frequency dicts and aggregate counters (total_recalls, total_complaints, crash_count, fire_count). The synthesis prompt now uses pre-computed aggregates via _build_aggregates() instead of iterating full record lists.
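Aggregate tracking along these lines can be sketched with `collections.Counter`; the record shape below is illustrative (real NHTSA complaint records differ in detail):

```python
from collections import Counter

def fold_complaints(records, component_counts, totals):
    """Fold a batch of complaint records into frequency counts instead of keeping raw rows."""
    for rec in records:
        component_counts[rec.get("component", "UNKNOWN")] += 1
        totals["complaints"] += 1
        if rec.get("crash") == "Yes":
            totals["crash"] += 1
        if rec.get("fire") == "Yes":
            totals["fire"] += 1

counts = Counter()
totals = {"complaints": 0, "crash": 0, "fire": 0}
fold_complaints(
    [{"component": "FUEL SYSTEM", "fire": "Yes"},
     {"component": "FUEL SYSTEM"},
     {"component": "SERVICE BRAKES", "crash": "Yes"}],
    counts, totals,
)
print(counts.most_common(1))  # → [('FUEL SYSTEM', 2)]
print(totals)                 # → {'complaints': 3, 'crash': 1, 'fire': 1}
```

Memory stays proportional to the number of distinct components rather than the number of complaints, which is exactly what the synthesis prompt needs.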
```python
    except Exception as e:
        log.warning(
            f"Failed to fetch data for discovered vehicle "
            f"{v_make} {v_model} {v_year}: {e}. Skipping."
        )
```
The per-vehicle fetch block uses except Exception which can hide programming errors and silently skip vehicles. Prefer catching the specific expected failures (e.g., RuntimeError from fetch_data_with_retry / requests.exceptions.RequestException) and either re-raise unexpected exceptions or log them as log.severe with enough context for debugging.
Suggested change:

```diff
-    except Exception as e:
-        log.warning(
-            f"Failed to fetch data for discovered vehicle "
-            f"{v_make} {v_model} {v_year}: {e}. Skipping."
-        )
+    except (RuntimeError, requests.exceptions.RequestException) as e:
+        log.warning(
+            f"Failed to fetch data for discovered vehicle "
+            f"{v_make} {v_model} {v_year}: {e}. Skipping."
+        )
+    except Exception as e:
+        log.severe(
+            f"Unexpected error while processing discovered vehicle "
+            f"{v_make} {v_model} {v_year}: {e}"
+        )
+        raise
```
Fixed in commit cffbcd94 — now catches (RuntimeError, requests.exceptions.RequestException) specifically instead of generic Exception. Unexpected programming errors will propagate rather than being silently swallowed.
- All `configuration.json` values now use angle-bracket placeholders
- Enforce `max_enrichments` budget with counter before each Cortex call
- Add `model_year` to recalls primary key to prevent cross-year overwrites
- Add retry logic with exponential backoff to `call_cortex_agent()`
- Replace generic `except Exception` with specific exception types
- Type-check `recommended_vehicles` from LLM output (validate list of dicts)
- Replace unbounded in-memory lists with aggregate component tracking
- Per-vehicle fetch catches RuntimeError/RequestException, not Exception
- Switch default Cortex model from llama3.3-70b to claude-sonnet-4-6
- Update README to match all code changes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All 10 Copilot review comments addressed in commit cffbcd94.
SDK Version: 2.25.1230.001
…README updates

Inspired by @fivetran-anushkaparashar's review feedback on PR fivetran#562 (NVD/CVE), applying the same improvements proactively to the NHTSA connector:

- Add `_create_cortex_session()` for TCP connection pooling across discovery and synthesis Cortex calls (separate from NHTSA session, different auth)
- `call_cortex_agent()` accepts optional `cortex_session` parameter
- Streaming responses explicitly closed via `response.close()` in `finally` block
- Cortex session created in `update()`, shared across both phases, closed after
- README error handling section updated to document connection pooling and response cleanup
- Full test harness passed (5 scenarios)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Proactive improvements in commit
Pre-Submission Test Harness Results
fivetran-sahilkhirwal left a comment
looks good
Nit: Thanks a lot for contributing multiple cortex examples to the repository. We can create a directory in the tutorials titled
I love that idea!
Summary
Adds a new AI-enriched connector in `all_things_ai/tutorials/snowflake-cortex-code-nhtsa-safety-intelligence/` that demonstrates Agent-Driven Discovery -- a pattern where Snowflake Cortex COMPLETE autonomously guides data collection during Fivetran ingestion. Built entirely with Snowflake Cortex Code.
What it does
Key details
- 5 tables: `recalls`, `complaints`, `vehicle_specs`, `discovery_insights`, `safety_analysis`
- Snowflake Cortex REST API endpoint (`/api/v2/cortex/inference:complete`)
- No external dependencies beyond `requests` (provided by Fivetran runtime)
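The REST call behind that endpoint can be assembled like this; the payload shape (model name plus chat-style messages) is a sketch based on this PR's description, not authoritative API documentation, and the account hostname is a made-up example:

```python
def build_cortex_request(account_host, pat_token, model, prompt):
    """Assemble URL, headers, and JSON body for a Cortex COMPLETE inference call."""
    return {
        "url": f"https://{account_host}/api/v2/cortex/inference:complete",
        "headers": {
            "Authorization": f"Bearer {pat_token}",  # PAT token auth, per the PR
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    }

req = build_cortex_request(
    "myorg-myaccount.snowflakecomputing.com",
    "<YOUR_SNOWFLAKE_PAT_TOKEN>",
    "llama3.3-70b",
    "Summarize recall patterns for this vehicle.",
)
print(req["url"])  # → https://myorg-myaccount.snowflakecomputing.com/api/v2/cortex/inference:complete
```

The dict keys match `requests.post(**req)` keyword arguments, so the builder can be unit-tested without any network access.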
Files
- `connector.py`
- `configuration.json`
- `README.md`

Testing
Deployed and synced successfully via Fivetran production (connection `arguing_dearly`, package `sulfate_implant`). `fivetran debug` output: 935 upserts, SYNC SUCCEEDED.
Checklist
- SDK v2+ compliant (no `yield`)
- `op.upsert()` and `op.checkpoint()` calls
- Template docstrings on `schema()` and `update()`
- `log.warning("Example: ...")` as first statement in `update()`
- `validate_configuration()` validates all required fields
- `black` passes, `flake8` clean (E501 only on template text, matching livestock connector)
- No `requirements.txt` (only uses `requests` from runtime)