Improve queries with in bulk#3477

Open
cp-at-mit wants to merge 10 commits into main from improve-queries-with-in-bulk

Conversation

Contributor

@cp-at-mit cp-at-mit commented Apr 8, 2026

What are the relevant tickets?

NA

Description (What does it do?)

These are a few spots where we were performing a database query on every iteration of a loop, such as:

```python
for user_id in id_array:
    Users.objects.get(id=user_id)
```

This PR updates this code to use in_bulk to fetch all of the objects by ID in a single query before looping through to interact with them, which should reduce the number of database queries in these spots.
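As a pure-Python sketch of the behavior being relied on here (no Django required; the `in_bulk` helper below is an illustrative stand-in for `QuerySet.in_bulk`): the call makes one query, returns a dict keyed by ID, and simply omits IDs with no matching row, so callers can `.get()` and skip instead of catching `DoesNotExist`.

```python
def in_bulk(rows, id_list, field_name="id"):
    """Stand-in for QuerySet.in_bulk: map field value -> row for the requested IDs."""
    wanted = set(id_list)
    return {row[field_name]: row for row in rows if row[field_name] in wanted}

rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
users_by_id = in_bulk(rows, [1, 2, 3])  # one "query" for all three IDs

names = []
for user_id in [1, 2, 3]:
    user = users_by_id.get(user_id)
    if user is None:
        continue  # a missing ID is skipped instead of raising DoesNotExist
    names.append(user["name"])
# names == ["a", "b"]; ID 3 is simply absent from the map
```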

How can this be tested?

Tests should pass, and functionality where the changes were made should not be impacted.

Replace repeated per-item queries with Model.objects.in_bulk across multiple modules to reduce DB hits and avoid exceptions when related objects are missing.

- courses migration: backfill_paidcourserun now collects CourseRun IDs and loads them with in_bulk, skipping missing runs before creating PaidCourseRun entries.
- ecommerce serializers: OrderHistorySerializer (two versions) now bulk-loads Product records and looks them up by id, reducing queries and guarding against missing products.
- hubspot_sync API: make_*_update_message_list_from_* functions use in_bulk for Users, Orders, Lines, and Products and skip missing records when building request payloads.
- users tests: retire_users_test updated to load users via in_bulk (using openedx_users__edx_username as field_name) and assert presence before checks.

These changes improve performance and robustness by batching DB access and handling absent referenced objects gracefully.
Replace per-iteration DB queries with in_bulk lookups to reduce database hits and improve performance.

- enrollments: create_local_enrollments loads CourseRun objects via in_bulk keyed by courseware_id and uses the map instead of CourseRun.filter(...).first(); missing runs are skipped with a logged error.
- hubspot_sync.tasks: Orders are loaded via in_bulk and read from the map instead of querying Order.objects.get(...) on each loop iteration; missing orders are skipped to avoid exceptions.

These changes reduce N+1 queries and add safer handling for missing records.
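The N+1 reduction claimed above can be illustrated with a toy in-memory table that counts queries (nothing here is Django; `FakeTable`, `get`, and `in_bulk` are illustrative stand-ins for the ORM calls being replaced):

```python
class FakeTable:
    """Toy table that counts how many 'queries' are issued against it."""

    def __init__(self, rows):
        self.rows = {r["id"]: r for r in rows}
        self.query_count = 0

    def get(self, pk):
        # One query per call, like Model.objects.get(id=pk) inside a loop.
        self.query_count += 1
        return self.rows[pk]

    def in_bulk(self, ids):
        # One query total, like QuerySet.in_bulk(ids).
        self.query_count += 1
        return {pk: self.rows[pk] for pk in ids if pk in self.rows}

table = FakeTable([{"id": i} for i in range(100)])

for pk in range(100):          # the old pattern: N queries
    table.get(pk)
n_plus_one = table.query_count  # 100

table.query_count = 0
objs = table.in_bulk(range(100))  # the new pattern: 1 query
bulk = table.query_count          # 1
```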
@github-actions

github-actions Bot commented Apr 8, 2026

OpenAPI Changes

## Changes for v0.yaml:
No changes detected

## Changes for v1.yaml:
No changes detected

## Changes for v2.yaml:
No changes detected

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

@cp-at-mit cp-at-mit marked this pull request as ready for review April 8, 2026 18:07
Comment thread hubspot_sync/tasks.py
Comment on lines +505 to +507
```python
deal = deals_by_id.get(order_id)
if not deal:
    continue
```

Bug: If the last order in a batch is missing from the database, accumulated association data is silently dropped and never sent to HubSpot due to a skipped final flush.
Severity: HIGH

Suggested Fix

Decouple the final flush logic from the item processing loop. After the for loop concludes, add a final, unconditional check to flush any remaining items in contact_associations_batch and line_associations_batch. This ensures all accumulated data is sent, regardless of whether the last item was processed.
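A minimal sketch of that fix, using hypothetical names (the real function is batch_upsert_associations_chunked with separate contact/line batches): flushing after the loop, rather than on an `idx == deal_count - 1` check inside it, means a skipped final item can no longer drop the accumulated batch.

```python
def process(order_ids, orders_by_id, batch_size=10):
    """Accumulate found orders into batches; flush remainder after the loop."""
    batch, flushed = [], []
    for order_id in order_ids:
        order = orders_by_id.get(order_id)
        if order is None:
            continue  # skipping an item must not skip the final flush
        batch.append(order)
        if len(batch) >= batch_size:
            flushed.append(list(batch))
            batch.clear()
    if batch:  # unconditional final flush, decoupled from the loop body
        flushed.append(list(batch))
    return flushed
```

With the last ID missing from the map, the remaining items still get flushed, which is exactly the failure mode described above.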

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: hubspot_sync/tasks.py#L505-L507

Potential issue: In the `batch_upsert_associations_chunked` function, data is processed
in batches and flushed to HubSpot when the batch is full or on the last item. The code
checks for the last item using `idx == deal_count - 1`. However, if the last `order_id`
in the input list does not correspond to an existing `Order` in the database, a
`continue` statement is executed, skipping the rest of the loop body. This prevents the
final flush condition from ever being evaluated, causing any accumulated data in
`contact_associations_batch` and `line_associations_batch` to be silently dropped
instead of being sent to HubSpot.


@cp-at-mit cp-at-mit force-pushed the improve-queries-with-in-bulk branch from fbbdb4e to e629003 Compare April 10, 2026 18:35
```python
    return transaction


class TransactionPurchaseSerializer(TransactionDataSerializer):
```

Bug: The code calls .data on the result of super().to_representation(instance), which can be None if an order has no transactions, causing an AttributeError.
Severity: HIGH

Suggested Fix

Before accessing the .data attribute on the transaction object, add a check to ensure it is not None. This handles cases where an order may not have any associated transactions.
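A minimal stand-in for that guard (class names follow the report, but the bodies here are illustrative, not the project's actual serializers):

```python
class Tx:
    """Stand-in for a transaction object exposing a .data attribute."""

    def __init__(self, data):
        self.data = data


class TransactionDataSerializer:
    def to_representation(self, instance):
        # The real parent can return None when an order has no transactions
        # (e.g., free courses, failed payments, pending orders).
        return instance.get("transaction")


class TransactionPurchaseSerializer(TransactionDataSerializer):
    def to_representation(self, instance):
        transaction = super().to_representation(instance)
        if transaction is None:
            return None  # guard: avoid AttributeError from None.data
        return transaction.data
```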

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: ecommerce/serializers/__init__.py#L725

Potential issue: In `TransactionPurchaseSerializer` and
`TransactionPurchaserSerializer`, the `to_representation` method calls
`super().to_representation(instance).data`. However, the parent method,
`TransactionDataSerializer.to_representation`, can return `None` if an order does not
have any associated transactions (e.g., for free courses, failed payments, or pending
orders). Calling `.data` on a `None` value will raise an `AttributeError`, causing a
crash. This will prevent users from viewing their order receipt for any order that lacks
a transaction.
