Skip to content

Change how the total ordering of Orders is determined#196

Open
ldionne wants to merge 9 commits intollvm:mainfrom
ldionne:review/improve-orders
Open

Change how the total ordering of Orders is determined#196
ldionne wants to merge 9 commits intollvm:mainfrom
ldionne:review/improve-orders

Conversation

@ldionne
Copy link
Copy Markdown
Member

@ldionne ldionne commented Mar 24, 2026

Prior to this patch, orders would be stored in the database as a doubly linked list, each order knowing its previous and its next order. The position of each order in that list was determined by the convert_revision() function being run on the field of the schema marked as an order.

This is awkward because this organization made it extremely difficult to e.g. sort orders in "chronological" order. Doing so required querying all the orders and then performing a sort. This is especially painful in an upcoming query API I'd like to add where data can be extracted from the server using complex query fields, including orders. For the implementation to be reasonable, we need an ordinal column representing the position of the order within the order-space.

This also removes a few places where llvm_project_revision was hardcoded. Some places still remain, but this is chipping away at things.

Key changes:

  • Migration (upgrade_18_to_19): adds ordinal column, populates it from convert_revision()-sorted orders, drops PreviousOrder/NextOrder columns
  • Order model: Orders are now always created with an ordinal, and Order comparison is based on that ordinal. The insertion logic ensures that the ordinal is correct and stays so on subsequent updates
  • Query sites (regression.py, views.py, api.py) use ORDER BY ordinal instead of Python sorted() or string ORDER BY
  • Templates use ordinal-based prev/next navigation instead of the linked list
  • New OrdinalAssignment.shtest covers insertion edge cases: first order, append, prepend with shift, middle insertion, and duplicate detection

Assisted-by: Claude

@ldionne
Copy link
Copy Markdown
Member Author

ldionne commented Mar 24, 2026

Since this one is a larger and more architectural change, I'd like to have someone's eyes on this. @lukel97 are you around?

Basically, the gist of this change is that instead of relying on a "linked list" structure to store orders and order them relative to each other, we add a column that contains a number representing each order's absolute position in the total-order. The benefit is that sorting in a bunch of places becomes easier, and we can constrain the hack around reading .llvm_project_revision to a smaller scope -- although a few uses are left.

In the future, I would suggest that we modify how orders are submitted so that all submissions include an increasing number representing the ordinal being added here: that would remove any hardcoded notion of llvm_project_revision from the input format.

This patch also unblocks work I'm doing on the side to rewrite a complete API (v5) to provide an OpenAPI specification and be a lot friendlier to AI agents, which I believe can be leveraged to be a game changer in regression detection. However, that is only experimental at the moment and will be part of a RFC since it's a major change.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes how test suite Order records are totally ordered by introducing an explicit integer ordinal column (instead of a linked-list via PreviousOrder/NextOrder), and updates server queries/templates/tests to use ordinal ordering for DB-side sorting and navigation.

Changes:

  • Add DB migration upgrade_18_to_19 to backfill ordinal and remove linked-list columns.
  • Update Order model comparison/insertion logic and multiple query sites to order by ordinal.
  • Add/adjust server DB tests and UI templates for ordinal-based prev/next and listing.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/server/db/OrdinalAssignment.shtest New lit test covering ordinal assignment and insertion edge cases.
tests/server/db/ImportV4TestSuiteInstance.shtest Update assertions/queries to validate ordinal ordering instead of linked-list pointers.
tests/server/db/CreateV4TestSuiteInstance.py Ensure test instance creation assigns ordinals and updates related assertions.
lnt/server/ui/views.py Switch order navigation/listing and several data query paths to ordinal-based ordering.
lnt/server/ui/templates/v4_order.html Use previous_order/next_order objects instead of linked-list IDs.
lnt/server/ui/templates/v4_all_orders.html Display ordinal instead of previous/next IDs.
lnt/server/ui/api.py Update graph API endpoint ordering to use Order.ordinal.
lnt/server/db/testsuitedb.py Introduce ordinal column, total ordering, and ordinal-based insertion/closest-run logic.
lnt/server/db/regression.py Order regression queries by Order.ordinal.
lnt/server/db/migrations/upgrade_18_to_19.py Migration to add/backfill ordinal, drop linked-list columns, and add an ordinal index.

Comment on lines 1925 to 1934
q = session.query(req.field.column, ts.Order.llvm_project_revision,
ts.Order.id) \
.select_from(ts.Sample) \
.join(ts.Run) \
.join(ts.Order) \
.filter(ts.Run.machine_id == req.machine.id) \
.filter(ts.Sample.test == req.test) \
.filter(req.field.column.isnot(None)) \
.order_by(ts.Order.llvm_project_revision.desc())
.order_by(ts.Order.ordinal.desc())

Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says all query sites in views.py switched to ORDER BY ordinal instead of Python sorting. In v4_matrix, the query is now ordered by Order.ordinal, but the view ultimately collects revisions into a set and later does all_orders.sort(reverse=True) on the revision strings, so the rendered order is still lexicographic/non-ordinal (and order_by here has no effect). Consider keeping Orders (or (revision, ordinal) pairs) and sorting by ordinal for display, to match the PR’s stated behavior.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, I don't think the doubly linked structure scales well

My main qualm: Instead of storing an integer for the ordinal can we store some sort of type compatible with the tuple that convert_revision produces, so we can just insert it directly and order directly on it. That way we don't have to update other rows when we insert a new one.

Maybe something like using an array of TEXT, with a GIN index. Is this something claude can explore

#
# Even though we already know the right order, this is faster than
# issueing separate queries.
runs.sort(key=lambda r: r.order, reverse=(direction == -1))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed now that the results are sorted?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, since runs is:

runs = session.query(self.Run).\
            filter(self.Run.machine == run.machine).\
            filter(self.Run.order_id.in_(ids_to_fetch)).all()

which is not sorted. We could order runs by ordinal in the db query, but that requires joining back to Order, and the comment right above seems to say that it's faster to sort in Python. Also, the number of runs we're sorting is small since it's basically N where N is the number of adjacent runs we're extracting.

Comment on lines +941 to +942
existing_orders[i].ordinal = i + 1
order.ordinal = insert_at
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this transaction get committed? Do you have any debug output of what SQL it generates when a new order is created.

Two things I can think of we need to check:

  1. This is indeed one single transaction, not one UPDATE per row
  2. The SQL generated is something like UPDATE orders SET ordinal = ordinal + 1, not UPDATE orders SET ordinal = $1 where id = $2 where $1 and $2 are pairs of ids and new ordinals plumbed through

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transaction gets committed by whoever calls _getOrCreateOrder (via _getOrCreateRun).

You're right about the UPDATE here though, this generates a bunch of updates instead of a single one. Switching to:

# Shift all ordinals at or after the insertion point up by 1.
session.query(self.Order) \
        .filter(self.Order.id != order.id) \
        .filter(self.Order.ordinal >= insert_at) \
        .update({self.Order.ordinal: self.Order.ordinal + 1},
                synchronize_session=False)

ldionne added 3 commits March 25, 2026 21:01
Prior to this patch, orders would be stored in the database as a
doubly linked list, each order knowing its previous and its next
order. The position of each order in that list was determined by
the convert_revision() function being run on the field of the schema
marked as an order.

This is awkward because this organization made it extremely difficult
to e.g. sort orders in "chronological" order. Doing so required querying
all the orders and then performing a sort. This is especially painful
in an upcoming query API I'd like to add where data can be extracted
from the server using complex query fields, including orders. For the
implementation to be reasonable, we need an ordinal column representing
the position of the order within the order-space.

This also removes a few places where llvm_project_revision was hardcoded.
Some places still remain, but this is chipping away at things.

Key changes:
- Migration (upgrade_18_to_19): adds ordinal column, populates it from
  convert_revision()-sorted orders, drops PreviousOrder/NextOrder columns
- Order model: ordinal-based comparison via @total_ordering, field-based
  comparison only during insertion, flush() instead of commit() for
  transactional safety, WITH FOR UPDATE for concurrency
- All query sites (regression.py, views.py, api.py) use
  ORDER BY ordinal instead of Python sorted() or string ORDER BY
- Templates use ordinal-based prev/next navigation
- New OrdinalAssignment.shtest covers insertion edge cases: first order,
  append, prepend with shift, middle insertion, and duplicate detection

Assisted-by: Claude
would be possible to support that, but it's a lot more complicated).
#
# Even though we already know the right order, this is faster than
# issueing separate queries.
runs.sort(key=lambda r: r.order, reverse=(direction == -1))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, since runs is:

runs = session.query(self.Run).\
            filter(self.Run.machine == run.machine).\
            filter(self.Run.order_id.in_(ids_to_fetch)).all()

which is not sorted. We could order runs by ordinal in the db query, but that requires joining back to Order, and the comment right above seems to say that it's faster to sort in Python. Also, the number of runs we're sorting is small since it's basically N where N is the number of adjacent runs we're extracting.

Comment on lines +941 to +942
existing_orders[i].ordinal = i + 1
order.ordinal = insert_at
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transaction gets committed by whoever calls _getOrCreateOrder (via _getOrCreateRun).

You're right about the UPDATE here though, this generates a bunch of updates instead of a single one. Switching to:

# Shift all ordinals at or after the insertion point up by 1.
session.query(self.Order) \
        .filter(self.Order.id != order.id) \
        .filter(self.Order.ordinal >= insert_at) \
        .update({self.Order.ordinal: self.Order.ordinal + 1},
                synchronize_session=False)

@ldionne ldionne force-pushed the review/improve-orders branch from c61e3da to 71f3ee2 Compare March 26, 2026 01:15
ldionne added 2 commits March 31, 2026 08:55
This adds complexity but allows supporting SQLite for local
deployments.

Assisted-by: Claude
@ldionne
Copy link
Copy Markdown
Member Author

ldionne commented Mar 31, 2026

@lukel97 Let me know if you have additional thoughts about this patch. This is one of the stepping stones for the upcoming v5 API (and associated UI redesign, and more!)

@lukel97
Copy link
Copy Markdown
Contributor

lukel97 commented Apr 2, 2026

@lukel97 Let me know if you have additional thoughts about this patch. This is one of the stepping stones for the upcoming v5 API (and associated UI redesign, and more!)

Do you have any designs of the UI redesign? Interested to hear more

@ldionne
Copy link
Copy Markdown
Member Author

ldionne commented Apr 2, 2026

@lukel97 Let me know if you have additional thoughts about this patch. This is one of the stepping stones for the upcoming v5 API (and associated UI redesign, and more!)

Do you have any designs of the UI redesign? Interested to hear more

Yeah, I have a working site and an new OpenAPI API that can be used by agents to investigate regressions, etc. It's still a work in progress but I'm very excited about the prospects. We should probably chat for a demo.

ldionne added 4 commits April 2, 2026 15:01
It's fine to have unreferenced columns and it greatly simplifies the code.
…ciated

This makes it a lot more robust and simple: Orders are created with an
ordinal and that's how they are compared everywhere, period. There is
no complicated fallback in case the ordinal field is None.
Copy link
Copy Markdown
Member Author

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukel97 I did another pass on this and here's the gist of the changes. I think it's a lot better than what I initially had, thanks for questioning some of the changes I was making:

  1. In the migration, we now handle Postgres and SQLite the same, except we don't drop the old columns for SQLite. That shouldn't matter, it's functionally equivalent and we don't care about db size for short lived local instances
  2. Orders are now created with an ordinal, always. That's the only way to compare Order objects. This simplifies a lot of the code and it simplifies the mental model even more.
  3. On insertion, we do some work to figure out the right ordinal for the newly-inserted order, and we then create the order with the right ordinal from the get go.

I think this is a lot better and clearer than how it started, let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants