Conversation
Replace repeated per-item queries with Model.objects.in_bulk across multiple modules to reduce DB hits and avoid exceptions when related objects are missing. - courses migration: backfill_paidcourserun now collects CourseRun IDs and loads them with in_bulk, skipping missing runs before creating PaidCourseRun entries. - ecommerce serializers: OrderHistorySerializer (two versions) now bulk-loads Product records and looks them up by id, reducing queries and guarding against missing products. - hubspot_sync API: make_*_update_message_list_from_* functions use in_bulk for Users, Orders, Lines, and Products and skip missing records when building request payloads. - users tests: retire_users_test updated to load users via in_bulk (using openedx_users__edx_username as field_name) and assert presence before checks. These changes improve performance and robustness by batching DB access and handling absent referenced objects gracefully.
Replace per-iteration DB queries with in_bulk lookups to reduce database hits and improve performance. In create_local_enrollments, load CourseRun objects via in_bulk by courseware_id and use the map instead of CourseRun.filter(...).first(); skip missing runs with a logged error. In hubspot_sync.tasks, load Orders via in_bulk and use the map instead of querying Order.objects.get(...) on each loop iteration; skip missing orders to avoid exceptions. These changes reduce N+1 queries and add safer handling for missing records.
OpenAPI ChangesShow/hide ## Changes for v0.yaml:Unexpected changes? Ensure your branch is up-to-date with |
| deal = deals_by_id.get(order_id) | ||
| if not deal: | ||
| continue |
There was a problem hiding this comment.
Bug: If the last order in a batch is missing from the database, accumulated association data is silently dropped and never sent to HubSpot due to a skipped final flush.
Severity: HIGH
Suggested Fix
Decouple the final flush logic from the item processing loop. After the for loop concludes, add a final, unconditional check to flush any remaining items in contact_associations_batch and line_associations_batch. This ensures all accumulated data is sent, regardless of whether the last item was processed.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: hubspot_sync/tasks.py#L505-L507
Potential issue: In the `batch_upsert_associations_chunked` function, data is processed
in batches and flushed to HubSpot when the batch is full or on the last item. The code
checks for the last item using `idx == deal_count - 1`. However, if the last `order_id`
in the input list does not correspond to an existing `Order` in the database, a
`continue` statement is executed, skipping the rest of the loop body. This prevents the
final flush condition from ever being evaluated, causing any accumulated data in
`contact_associations_batch` and `line_associations_batch` to be silently dropped
instead of being sent to HubSpot.
Did we get this right? 👍 / 👎 to inform future reviews.
fbbdb4e to
e629003
Compare
for more information, see https://pre-commit.ci
…odl/mitxonline into improve-queries-with-in-bulk
for more information, see https://pre-commit.ci
This reverts commit 54eb754.
…odl/mitxonline into improve-queries-with-in-bulk
| return transaction | ||
|
|
||
|
|
||
| class TransactionPurchaseSerializer(TransactionDataSerializer): |
There was a problem hiding this comment.
Bug: The code calls .data on the result of super().to_representation(instance), which can be None if an order has no transactions, causing an AttributeError.
Severity: HIGH
Suggested Fix
Before accessing the .data attribute on the transaction object, add a check to ensure it is not None. This handles cases where an order may not have any associated transactions.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: ecommerce/serializers/__init__.py#L725
Potential issue: In `TransactionPurchaseSerializer` and
`TransactionPurchaserSerializer`, the `to_representation` method calls
`super().to_representation(instance).data`. However, the parent method,
`TransactionDataSerializer.to_representation`, can return `None` if an order does not
have any associated transactions (e.g., for free courses, failed payments, or pending
orders). Calling `.data` on a `None` value will raise an `AttributeError`, causing a
crash. This will prevent users from viewing their order receipt for any order that lacks
a transaction.
What are the relevant tickets?
NA
Description (What does it do?)
These are a few spots where we are performing database queries within loops such as:
for id in id_array:
Users.objects.get(id)
This PR updates this code to use
in_bulkto get all of the objects by ID before looping through to interact with the objects. Hopefully this reduces the number of database queries in places.How can this be tested?
Tests should pass, and functionality where the changes were made should not be impacted.