Conversation

@mbani01 mbani01 commented Jan 21, 2026

This pull request updates several SQL pipeline files to replace usage of the segmentRepositories table with the newer repositories table. The changes standardize repository lookups, simplify queries, and ensure consistent filtering of repositories based on their excluded and archived status. These updates affect activity relation, monitoring, insights, and pull request filtering pipelines.

Repository Table Migration and Query Simplification:

  • Replaced all references to segmentRepositories with repositories in activityRelations_bucket_clean_enrich_copy_pipe_0.pipe through activityRelations_bucket_clean_enrich_copy_pipe_9.pipe to standardize repository filtering by excluded status. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

  • Simplified repository filtering in activityRelations_deduplicated_cleaned_copy_pipe.pipe and activityRelations_enrich_clean_initial_snapshot.pipe by directly querying url from repositories where deletedAt is null, removing the need for segment-based joins. [1] [2]

Monitoring and Insights Pipeline Updates:

  • Updated monitoring_entities.pipe to monitor the repositories table instead of segmentRepositories, including renaming nodes and updating metrics queries. [1] [2]

  • Modified insights_projects_populated_copy.pipe to use url from repositories for grouping archived and excluded repositories, instead of the old repository field.

Pull Request and Security Filtering:

  • Updated pull_requests_filtered.pipe to filter by url from repositories for non-excluded repositories, replacing the previous segment-based filtering.

  • Changed security_deduplicated_merged_copy_pipe.pipe to filter by insightsProjectId from repositories instead of segmentRepositories, ensuring only non-excluded repositories are considered.


Note

Standardizes repository sourcing by migrating pipes from segmentRepositories to repositories, aligning filters to excluded=false and using url where applicable.

  • Updated activityRelations_* copy/enrich pipes (0–9, dedup, initial snapshot) to filter by segmentId present in repositories rather than segmentRepositories
  • Switched PR filtering default scope to SELECT url FROM repositories WHERE excluded = false in pull_requests_filtered.pipe
  • Replaced archived/excluded repository aggregation to use repositories (url) in insights_projects_populated_copy.pipe; also filters out deleted collectionsInsightsProjects rows
  • Monitoring now tracks repositories (renamed node and query) in monitoring_entities.pipe
  • Security evaluations now limit insightsProjectId via non-excluded repositories in security_deduplicated_merged_copy_pipe.pipe

Written by Cursor Bugbot for commit 36d057c. This will update automatically on new commits.

@mbani01 mbani01 requested review from gaspergrom and ulemons January 21, 2026 16:47
@mbani01 mbani01 self-assigned this Jan 21, 2026
@github-actions

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

@mbani01 mbani01 changed the title from "chore(tinybird): update tinybird pipes to use repositories" to "chore(tinybird): update tinybird pipes to use repositories [CM-902]" Jan 21, 2026
cursor[bot]

This comment was marked as outdated.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

  IN {{ Array(repos, 'String', description="Filter activity repo list", required=False) }}
  {% else %}
- AND pra.channel IN (SELECT repository FROM segmentRepositories where excluded = false)
+ AND pra.channel IN (SELECT url FROM repositories FINAL where excluded = false)

Missing NULL handling for excluded column filtering

Medium Severity

The query filters repositories with where excluded = false, but all other queries in this PR use WHERE (r.excluded IS NULL OR r.excluded = false). Repositories where excluded is NULL will be incorrectly filtered out in pull request queries, while they're correctly included in activity relations and security queries. This inconsistency causes different behavior for the same data.
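The difference between the two predicates can be reproduced with a minimal sketch. It assumes `excluded` is a nullable column and uses an in-memory SQLite database as a stand-in for the Tinybird/ClickHouse data source, so `FINAL` is omitted and `false` is written as `0`. The table layout and URLs are hypothetical, and only the NULL comparison behavior is meant to carry over.

```python
import sqlite3

# Hypothetical stand-in table: only the two columns named in the review comment.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repositories (url TEXT, excluded INTEGER)")
conn.executemany(
    "INSERT INTO repositories VALUES (?, ?)",
    [
        ("https://example.test/a", 0),     # explicitly not excluded
        ("https://example.test/b", None),  # excluded is NULL
        ("https://example.test/c", 1),     # excluded
    ],
)

# Predicate used in pull_requests_filtered.pipe: NULL = 0 evaluates to NULL,
# so the row with a NULL excluded value is dropped.
strict_urls = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM repositories WHERE excluded = 0 ORDER BY url"
    )
]

# Predicate used by the other migrated queries: NULL rows are kept.
null_safe_urls = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM repositories "
        "WHERE (excluded IS NULL OR excluded = 0) ORDER BY url"
    )
]

print(strict_urls)     # only repository a
print(null_safe_urls)  # repositories a and b
```

If `excluded` can never be NULL in the real table, both predicates return the same rows and the inconsistency is cosmetic.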

- FROM segmentRepositories sr FINAL
- WHERE (sr.excluded IS NULL OR sr.excluded = false)
+ FROM repositories r FINAL
+ WHERE (r.excluded IS NULL OR r.excluded = false)

Missing deletedAt filter for soft-deleted repositories

High Severity

The repositories table has a deletedAt column for soft-deletes (which segmentRepositories lacked), but the migrated queries only filter by excluded status. The PR description explicitly states queries should filter "where deletedAt is null", but this filter is missing from all updated queries. Soft-deleted repositories will incorrectly appear in analytics results, activity relations, pull request filtering, and security evaluations.
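A rough sketch of the effect, again using in-memory SQLite as a stand-in for the Tinybird/ClickHouse data source. It assumes `deletedAt` is a nullable timestamp that is NULL for live rows, as the "where deletedAt is null" wording in the PR description suggests. The rows and URLs are hypothetical.

```python
import sqlite3

# Hypothetical stand-in table with the soft-delete column from the comment.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE repositories (url TEXT, excluded INTEGER, deletedAt TEXT)")
db.executemany(
    "INSERT INTO repositories VALUES (?, ?, ?)",
    [
        ("https://example.test/live", 0, None),
        ("https://example.test/soft-deleted", 0, "2026-01-20 00:00:00"),
        ("https://example.test/excluded", 1, None),
    ],
)

# Filter as migrated in the PR: excluded status only.
excluded_only = [
    row[0]
    for row in db.execute(
        "SELECT url FROM repositories r "
        "WHERE (r.excluded IS NULL OR r.excluded = 0) ORDER BY url"
    )
]

# Same filter with the soft-delete check the comment asks for.
with_deleted_check = [
    row[0]
    for row in db.execute(
        "SELECT url FROM repositories r "
        "WHERE r.deletedAt IS NULL "
        "AND (r.excluded IS NULL OR r.excluded = 0) ORDER BY url"
    )
]

print(excluded_only)       # still includes the soft-deleted repository
print(with_deleted_check)  # only the live repository
```

In the actual pipes the extra condition would sit alongside `FINAL` on the `repositories` read; the sketch only shows which rows each predicate lets through.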

Additional Locations (2)

@mbani01 mbani01 merged commit 1db4d97 into main Jan 22, 2026
16 checks passed
@mbani01 mbani01 deleted the chore/update-tinybird-pipes-to-use-repositories branch January 22, 2026 12:58
3 participants