MGMT-23553: Automate events table bloat cleanup #10140

Open
bluesort wants to merge 1 commit into openshift:master from bluesort:mgmt-23553

bluesort (Member) commented Apr 14, 2026

Problem

The events table is high-churn, with frequent inserts and deletes. This leaves lingering dead tuples (tombstones) and index bloat, which degrade query performance.

Solution

Configure aggressive autovacuum settings and add a reindex cronjob to periodically reclaim index bloat.

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Checked current bloat on the events table and adjusted autovacuum values to clean up before significant performance impact)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Summary by CodeRabbit

  • Chores
    • Database maintenance: added a reversible migration to tune autovacuum settings for the events table, with tests validating apply/rollback behavior.
  • New Features
    • Scheduled maintenance: added a configurable PostgreSQL reindex CronJob to periodically reindex the events table and report per-index and total size metrics, duration, and reclaimed-space statistics.

openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Apr 14, 2026

openshift-ci-robot commented Apr 14, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Problem

The size and frequency of deletions in the events table result in a large amount of tombstone bloat.

Solution

Update autovacuum settings for the events table for earlier cleaning of tombstones.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci bot commented Apr 14, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 14, 2026

coderabbitai bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a DB migration to tune autovacuum settings for the events table (apply and rollback), a Ginkgo test for apply/rollback/reapply, registers the migration in post-migrations, and adds a Kubernetes CronJob template to reindex public.events.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Events Autovacuum Migration (internal/migrations/20260413120000_set_events_autovacuum_settings.go) | New migration that sets autovacuum_vacuum_scale_factor = 0.05 and autovacuum_analyze_scale_factor = 0.025 on public.events in Migrate and issues ALTER TABLE ... RESET in Rollback; errors are returned. |
| Migration Test (internal/migrations/20260413120000_set_events_autovacuum_settings_test.go) | New Ginkgo/Gomega test verifying initial unset state, migration application yields 0.05/0.025, rollback resets options, and re-application restores the settings by querying pg_options_to_table(reloptions). |
| Migration Registration (internal/migrations/migrations.go) | Adds setEventsAutovacuumSettings() to the post() migration list after populatePrimaryIPStackForExistingClusters(). |
| Postgres Reindex CronJob (deploy/postgres/postgres-reindex.cronjob.yaml.j2) | New batch/v1 CronJob template postgres-reindex that sources DB creds from a Secret, validates connectivity, logs per-index and total sizes for public.events, runs REINDEX TABLE CONCURRENTLY public.events, measures/prints size delta and duration, and exposes templated resource and scheduling fields. |
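
To see why lowering the scale factors makes autovacuum act earlier: PostgreSQL vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × (number of tuples), and analyzes it by the analogous formula over changed tuples. The Go sketch below works through that arithmetic. It assumes PostgreSQL's stock defaults (base threshold 50, scale factors 0.2 and 0.1) are otherwise unchanged on the target cluster, and the 1,000,000-row table size is hypothetical, not a figure from this PR.

```go
package main

import "fmt"

// triggerPoint returns the number of dead (or changed) tuples after which
// autovacuum acts on a table: baseThreshold + scaleFactor * liveTuples.
func triggerPoint(baseThreshold, scaleFactor, liveTuples float64) float64 {
	return baseThreshold + scaleFactor*liveTuples
}

func main() {
	const (
		baseThreshold = 50.0      // PostgreSQL default for vacuum and analyze (assumed unchanged)
		rows          = 1_000_000 // hypothetical size of public.events
	)

	// Stock defaults versus the values set by the migration in this PR.
	fmt.Printf("vacuum  @ 0.2   (default):   %.0f dead tuples\n", triggerPoint(baseThreshold, 0.2, rows))
	fmt.Printf("vacuum  @ 0.05  (migration): %.0f dead tuples\n", triggerPoint(baseThreshold, 0.05, rows))
	fmt.Printf("analyze @ 0.1   (default):   %.0f changed tuples\n", triggerPoint(baseThreshold, 0.1, rows))
	fmt.Printf("analyze @ 0.025 (migration): %.0f changed tuples\n", triggerPoint(baseThreshold, 0.025, rows))
}
```

Under these assumptions the vacuum trigger drops from roughly 200,050 to 50,050 dead tuples, and the analyze trigger from roughly 100,050 to 25,050 changed tuples, so cleanup starts about four times earlier.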

Sequence Diagram(s)

sequenceDiagram
    participant Cron as "K8s CronJob"
    participant Job as "CronJob -> Job"
    participant Pod as "Reindex Pod"
    participant DB as "PostgreSQL"

    Cron->>Job: schedule triggers Job
    Job->>Pod: create Pod (env from Secret, image, resources)
    Pod->>DB: psql connection check (PGHOST/PGUSER/PGPASSWORD)
    Pod->>DB: SELECT per-index sizes for public.events (before)
    Pod->>DB: REINDEX TABLE CONCURRENTLY public.events
    Pod->>DB: SELECT per-index sizes for public.events (after)
    Pod->>Pod: compute size delta and log results
    Pod->>Job: exit with success/failure
    Job->>Cron: Job completes
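
The before/after measurement in the diagram comes down to a per-index subtraction. The template's actual script is not shown in this conversation, so the Go sketch below only illustrates that arithmetic. The names IndexSizes and reclaimed, and the index names and byte counts in main, are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// IndexSizes maps an index name to its on-disk size in bytes,
// for example as reported by pg_relation_size.
type IndexSizes map[string]int64

// reclaimed returns the bytes freed per index (before minus after) and the
// total across every index present in the "before" snapshot.
func reclaimed(before, after IndexSizes) (map[string]int64, int64) {
	perIndex := make(map[string]int64, len(before))
	var total int64
	for name, b := range before {
		d := b - after[name] // an index missing from "after" counts as size 0
		perIndex[name] = d
		total += d
	}
	return perIndex, total
}

func main() {
	// Hypothetical snapshots taken before and after the reindex.
	before := IndexSizes{"events_pkey": 512 << 20, "idx_events_cluster_id": 256 << 20}
	after := IndexSizes{"events_pkey": 128 << 20, "idx_events_cluster_id": 96 << 20}

	perIndex, total := reclaimed(before, after)

	// Sort names so the log output is stable between runs.
	names := make([]string, 0, len(perIndex))
	for n := range perIndex {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Printf("%s: reclaimed %d bytes\n", n, perIndex[n])
	}
	fmt.Printf("total reclaimed: %d bytes\n", total)
}
```

A negative delta would mean an index grew during the run, which can happen on a busy table because REINDEX ... CONCURRENTLY allows writes while it rebuilds.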

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 8 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Test combines multiple unrelated behaviors in one block and lacks meaningful assertion failure messages. | Split test into separate blocks for each behavior and add descriptive messages to all assertions. |

✅ Passed checks (8 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Stable And Deterministic Test Names | ✅ Passed | Test names use stable, deterministic descriptions without dynamic values like timestamps, UUIDs, or generated identifiers. |
| Microshift Test Compatibility | ✅ Passed | The test file uses only GORM and PostgreSQL queries, with no OpenShift-specific APIs or MicroShift-incompatible features. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | The newly added test file is a database migration unit test using Ginkgo that validates PostgreSQL schema changes in an isolated test database environment. It does not interact with Kubernetes infrastructure, make multi-node cluster assumptions, or perform cluster topology checks. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | CronJob manifest contains no topology-specific scheduling constraints, affinity rules, nodeSelector, tolerations, or control-plane targeting; compatible with all OpenShift topologies. |
| Ote Binary Stdout Contract | ✅ Passed | PR does not introduce stdout writes in OTE binary process-level code. Migration and test files contain no process-level stdout operations; Kubernetes manifest is not a Go binary subject to the OTE contract. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | The test connects only to a local temporary PostgreSQL database and uses system catalog queries with no IPv4 assumptions or external connectivity requirements. |
| Title check | ✅ Passed | The title directly describes the main objective: automating events table bloat cleanup through autovacuum settings, which aligns with the migration, tests, and reindex cronjob changes in the changeset. |
| Description check | ✅ Passed | The PR description follows the required template structure with all major sections completed: problem statement, solution, issue categorization, environment impact, testing approach, and checklist items. The content is specific, meaningful, and clearly articulates the rationale for the changes. |

Comment @coderabbitai help to get the list of available commands and usage tips.

openshift-ci bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Apr 14, 2026

openshift-ci bot commented Apr 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bluesort

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Apr 14, 2026

openshift-ci-robot commented Apr 14, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Problem

The size and frequency of deletions in the events table result in a large amount of tombstone bloat.

Solution

Update autovacuum settings for the events table for earlier cleaning of tombstones.

Summary by CodeRabbit

  • Chores
  • Database maintenance: Updated autovacuum configuration settings for the events table with a new migration and comprehensive test coverage.

openshift-ci-robot commented Apr 15, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The size and frequency of deletions in the events table result in a large amount of tombstone bloat.

Solution

Update autovacuum settings for the events table for earlier cleaning of tombstones.

Summary by CodeRabbit

  • Chores
  • Database maintenance: added a migration to tune autovacuum settings for the events table and included test coverage to validate apply/rollback behavior.
  • New Features
  • Scheduled maintenance: added a configurable PostgreSQL reindex CronJob to periodically reindex the events table and report index-size metrics.

coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
deploy/postgres/postgres-reindex.cronjob.yaml.j2 (1)

25-26: Pin the PostgreSQL image to an immutable digest for reproducible scheduled runs.

Using a floating tag for this cronjob can cause behavior drift between runs if the image upstream is updated.

Proposed fix
-            image: registry.redhat.io/rhel9/postgresql-14
+            image: registry.redhat.io/rhel9/postgresql-14@sha256:<approved-digest>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deploy/postgres/postgres-reindex.cronjob.yaml.j2` around lines 25 - 26, The
cronjob currently uses a floating image tag in the image field (image:
registry.redhat.io/rhel9/postgresql-14); replace that with an immutable digest
reference (image:
registry.redhat.io/rhel9/postgresql-14@sha256:<CALCULATED_DIGEST>) so scheduled
runs are reproducible; obtain the correct sha256 digest for the exact image you
want (e.g., via skopeo/podman/docker inspect or registry API) and update the
template's image value accordingly, ensuring imagePullPolicy remains appropriate
for your registry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deploy/postgres/postgres-reindex.cronjob.yaml.j2`:
- Around line 60-61: The PGSSLMODE environment variable is hardcoded to
"require"; change it to use a Jinja2 template variable so environments can
override the SSL mode. Replace the fixed value for the PGSSLMODE env entry with
a templated variable (e.g., pg_ssl_mode or postgres_ssl_mode) that defaults to
"require" using Jinja2's default filter, and ensure any deployment
values/helm/README references are updated to document the new template variable
so callers can set modes like "disable" or "verify-full".
- Around line 81-88: The monitoring SELECTs against pg_stat_user_indexes should
be schema-qualified, for example by joining pg_class and pg_namespace (the view
also exposes its own schemaname column); update the queries that reference
pg_stat_user_indexes (the SELECTs that filter on relname = 'events') to JOIN
pg_class c ON c.oid = s.indexrelid and JOIN pg_namespace n ON n.oid =
c.relnamespace and then filter WHERE n.nspname = 'public' AND s.relname =
'events' (replace s with the table alias used in the diff); also make the
REINDEX command explicit by qualifying the target as public.events so it
operates on the intended table.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 7fb2ef7e-e500-461d-8e35-a9edd5d280ae

📥 Commits

Reviewing files that changed from the base of the PR and between 08675a6 and 2acdb8b.

📒 Files selected for processing (1)
  • deploy/postgres/postgres-reindex.cronjob.yaml.j2

Comment thread deploy/postgres/postgres-reindex.cronjob.yaml.j2 Outdated
Comment thread deploy/postgres/postgres-reindex.cronjob.yaml.j2 Outdated

openshift-ci-robot commented Apr 15, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The size and frequency of deletions in the events table result in a large amount of tombstone bloat.

Solution

Update autovacuum settings for the events table for earlier cleaning of tombstones.

Summary by CodeRabbit

  • Chores
  • Database maintenance: added a migration to tune autovacuum settings for the events table, with tests validating apply/rollback behavior.
  • New Features
  • Scheduled maintenance: added a configurable PostgreSQL reindex CronJob to periodically reindex the events table and report per-index and total size metrics, duration, and reclaimed-space statistics.

openshift-ci-robot commented Apr 16, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The size and frequency of deletions in the events table result in a large amount of tombstone bloat.

Solution

Update autovacuum settings for the events table for earlier cleaning of tombstones.

openshift-ci-robot commented Apr 16, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The events table is high-churn with frequent inserts and deletes, resulting in large amounts of tombstone bloat and impacting performance.

Solution

Configure aggressive autovacuum settings and add a reindex cronjob to periodically reclaim index bloat.

bluesort changed the title from "MGMT-23553: Set events autovacuum settings" to "MGMT-23553: Automate events table bloat cleanup" Apr 16, 2026

openshift-ci-robot commented Apr 16, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The events table is high-churn with frequent inserts and deletes, resulting in large amounts of tombstones and index bloat, impacting performance.

Solution

Configure aggressive autovacuum settings and add a reindex cronjob to periodically reclaim index bloat.

openshift-ci-robot commented Apr 16, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The events table is high-churn with frequent inserts and deletes, resulting in lingering tombstones and index bloat, impacting query performance.

Solution

Configure aggressive autovacuum settings and add a reindex cronjob to periodically reclaim index bloat.

openshift-ci-robot commented Apr 16, 2026

@bluesort: This pull request references MGMT-23553 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Problem

The events table is high-churn with frequent inserts and deletes, resulting in lingering tombstones and index bloat and impacting query performance.

Solution

Configure aggressive autovacuum settings and add a reindex cronjob to periodically reclaim index bloat.

@bluesort
Member Author

/test ?

@bluesort
Member Author

/test all

@codecov

codecov bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 55.55556% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.33%. Comparing base (afdca2f) to head (5ae020d).
⚠️ Report is 77 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...s/20260413120000_set_events_autovacuum_settings.go | 52.94% | 4 Missing and 4 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #10140      +/-   ##
==========================================
+ Coverage   44.13%   44.33%   +0.19%     
==========================================
  Files         415      416       +1     
  Lines       72319    72783     +464     
==========================================
+ Hits        31918    32268     +350     
- Misses      37512    37596      +84     
- Partials     2889     2919      +30     

| Files with missing lines | Coverage Δ |
|---|---|
| internal/migrations/migrations.go | 94.73% <100.00%> (+0.14%) ⬆️ |
| ...s/20260413120000_set_events_autovacuum_settings.go | 52.94% <52.94%> (ø) |

... and 27 files with indirect coverage changes

@bluesort
Member Author

/override ci/prow/edge-e2e-metal-assisted-5-0

@openshift-ci

openshift-ci bot commented Apr 17, 2026

@bluesort: Overrode contexts on behalf of bluesort: ci/prow/edge-e2e-metal-assisted-5-0

Details

In response to this:

/override ci/prow/edge-e2e-metal-assisted-5-0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@bluesort bluesort marked this pull request as ready for review April 17, 2026 13:09
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 17, 2026
@bluesort
Member Author

/test edge-e2e-ai-operator-ztp
/override ci/prow/edge-e2e-metal-assisted-5-0

@bluesort
Member Author

/test edge-e2e-metal-assisted-5-0

@openshift-ci

openshift-ci bot commented Apr 17, 2026

@bluesort: Overrode contexts on behalf of bluesort: ci/prow/edge-e2e-metal-assisted-5-0

Details

In response to this:

/test edge-e2e-ai-operator-ztp
/override ci/prow/edge-e2e-metal-assisted-5-0

@bluesort
Member Author

/override ci/prow/edge-e2e-metal-assisted-5-0

@openshift-ci

openshift-ci bot commented Apr 17, 2026

@bluesort: Overrode contexts on behalf of bluesort: ci/prow/edge-e2e-metal-assisted-5-0

Details

In response to this:

/override ci/prow/edge-e2e-metal-assisted-5-0

@openshift-ci

openshift-ci bot commented Apr 17, 2026

@bluesort: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

2 participants