forked from aep-dev/aeps
Offset and cursor pagination #3
Open
kalexieva wants to merge 2 commits into main from pagination
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,12 +10,41 @@ be paginated. | |
| * Endpoints returning collections of data **must** be paginated. | ||
| * APIs **should** prefer [cursor-based pagination](#cursor-based-pagination) | ||
| to [offset-based pagination](#token-based-offset-pagination). | ||
| * If using offset-based pagination, _new_ APIs **must** | ||
| implement [token-based offset pagination](#token-based-offset-pagination). | ||
| See [Choosing a pagination strategy](#choosing-a-pagination-strategy). | ||
| * Query parameters for pagination **must** follow the guidelines in AEP-106. | ||
| * The array of resources **must** be named `results` and contain resources with | ||
| no additional wrapping. | ||
|
|
||
| ### Choosing a pagination strategy | ||
|
|
||
| **Note:** Many technical constraints trace back to database design decisions made long before an API is built. A schema | ||
| that lacks stable sort keys, proper indexing, or a well-chosen primary key will make cursor pagination difficult and | ||
| offset pagination unreliable. How you design your database is important. A well-designed schema keeps both pagination | ||
| strategies on the table, while a poor one may take options off the table permanently. | ||
|
|
||
| This decision is not purely a UX decision, nor is it purely a technical one. UX requirements are a valid and important | ||
| input, but **must** be weighed against dataset characteristics and performance rather than treated as the deciding | ||
| factor in isolation. A great UX with offset pagination is of no use if the underlying dataset cannot support it | ||
| reliably. Before choosing offset, teams **must** evaluate both user experience _and_ technical limitations. | ||
|
|
||
| Use cursor-based pagination when: | ||
|
|
||
| - The dataset is large, unbounded, or expected to grow significantly over time. | ||
| - The underlying database is NoSQL or sharded, where offset scanning is expensive or unreliable. | ||
| - The data changes frequently, as offset pagination may produce duplicates or skip items between page requests. | ||
| - Sequential traversal (next/previous) is enough for the use case. | ||
|
|
||
| Use offset-based pagination when: | ||
|
|
||
| - The dataset is small, bounded, and unlikely to grow significantly. | ||
| - The underlying database is relational and the paginated query can be efficiently indexed. | ||
| - The data is stable and unlikely to change between page requests. | ||
| - Users must be able to jump to an arbitrary page, and this is a validated user need rather than an assumed one. | ||
| - UX requirements genuinely call for it, and the above technical factors do not contradict it. | ||
|
|
||
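To make the "duplicates or skipped items" criterion above concrete, here is a minimal in-memory sketch. It is an illustration only: the `offset_page` and `cursor_page` helpers, the integer ids, and the newest-first ordering are assumptions made for this example, not part of the guideline.

```python
from typing import Optional


def offset_page(rows: list[int], offset: int, limit: int) -> list[int]:
    """Offset pagination: skip `offset` rows, then take `limit` rows."""
    return rows[offset:offset + limit]


def cursor_page(rows: list[int], after_id: Optional[int], limit: int) -> list[int]:
    """Cursor (keyset) pagination over rows sorted newest-first by id.

    Only rows that sort after the last id the client has seen are returned.
    """
    remaining = [r for r in rows if after_id is None or r < after_id]
    return remaining[:limit]


# Newest-first listing of row ids (hypothetical data).
rows = [5, 4, 3, 2, 1]
first_offset = offset_page(rows, 0, 2)
first_cursor = cursor_page(rows, None, 2)

# A new row arrives at the head of the list between the two page requests.
rows = [6] + rows

# Offset pagination now re-serves id 4, which the client already received.
second_offset = offset_page(rows, 2, 2)

# Cursor pagination resumes strictly after the last id seen, so nothing repeats.
second_cursor = cursor_page(rows, first_cursor[-1], 2)
```

With a deletion instead of an insertion, the same offset arithmetic would skip a row rather than repeat one.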
| **Note:** Cursor is also the safer default: switching from cursor to offset later is straightforward, but the reverse | ||
|
Review comment: Nice:) |
||
| will ruin your week. | ||
|
|
||
| ### Cursor-based pagination | ||
|
|
||
| Cursor-based pagination uses a `pageToken` which is an opaque pointer to a page that must never be inspected or | ||
|
|
@@ -69,21 +98,6 @@ responds with: | |
| } | ||
| ``` | ||
|
|
||
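The cursor flow described in this section can be sketched as follows. This is a hedged illustration, not the reference implementation: `SECRET`, `encode_token`, `decode_token`, `list_books`, and the integer `id` sort key are all assumptions. The token is URL-safe base64 with an HMAC signature so that tampering is detected; base64 by itself only obscures the payload, so a production service may prefer to encrypt the token.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"example-only-secret"  # assumption: a key known only to the server


def encode_token(last_id: int) -> str:
    """Pack the last returned id into a signed, URL-safe token."""
    payload = json.dumps({"last_id": last_id}).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(sig + payload).decode().rstrip("=")


def decode_token(token: str) -> int:
    """Verify the signature and recover the last returned id."""
    raw = base64.urlsafe_b64decode(token + "=" * (-len(token) % 4))
    sig, payload = raw[:32], raw[32:]  # a SHA-256 digest is 32 bytes
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid page token")
    return json.loads(payload)["last_id"]


def list_books(books: list[dict], page_size: int = 20,
               page_token: Optional[str] = None) -> dict:
    """Return one page of `books`, which must be sorted ascending by `id`."""
    page_size = max(1, page_size)
    last_id = decode_token(page_token) if page_token else 0
    remaining = [b for b in books if b["id"] > last_id]
    page = remaining[:page_size]
    response = {"results": page}
    if len(remaining) > page_size:
        # More data exists, so hand back a token pointing past this page.
        response["nextPageToken"] = encode_token(page[-1]["id"])
    return response
```

The client only ever echoes `nextPageToken` back as `pageToken`; the final page omits the token entirely.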
| ### Token-based offset pagination | ||
|
|
||
| APIs that have a legitimate need for offset-based pagination **should** use token-based offset pagination. This approach | ||
| encodes the offset, limit, and any filters into an opaque token. | ||
|
|
||
| When implementing token-based offset pagination: | ||
|
|
||
| * The API **must** use the same request/response structure as [cursor-based pagination](#cursor-based-pagination) ( | ||
| `pageSize`, `pageToken` and `nextPageToken`) | ||
| * The page token **must** internally encode the offset, limit, and query parameters | ||
| * The implementation details **must** be hidden from the client | ||
|
|
||
| From the client's perspective, this is identical to cursor-based pagination. The difference is only in the server-side | ||
| implementation. | ||
|
|
||
| ### Page Token Opacity | ||
|
|
||
| Page tokens provided by APIs **must** be opaque (but URL-safe) strings, and **must not** be user-parseable. This is | ||
|
|
@@ -109,42 +123,83 @@ used. It is not necessary to document this behavior. | |
| **Note:** While a reasonable time may vary between APIs, a good rule of thumb | ||
| is three days. | ||
|
|
||
| ### Small collections | ||
| ### Offset-based pagination | ||
|
|
||
| When implementing offset-based pagination: | ||
|
|
||
| * Request schemas for collections **must** define an integer `pageNumber` query parameter, allowing users to specify | ||
| which page of results to return. | ||
| * The `pageNumber` field **must not** be required and **must** default to `1`. | ||
| * Request schemas for collections **must** define an integer `pageSize` query parameter, allowing users to specify the | ||
| maximum number of results to return. | ||
| * The `pageSize` field **must not** be required. | ||
| * If the request does not specify `pageSize`, the API **must** choose an appropriate default. | ||
| * Response messages **may** include a `total` field indicating the total number of results available, though this | ||
| **should** be avoided if the calculation is expensive. | ||
| * The API **may** return fewer results than the number requested (including zero results), even if not at the end of the | ||
| collection. | ||
|
|
||
| Example: | ||
|
|
||
| ```http request | ||
| GET /v1/publishers/123/books?pageSize=50&pageNumber=2 | ||
| ``` | ||
|
|
||
| responds with: | ||
|
|
||
| ```json | ||
| { | ||
| "results": [ | ||
| { | ||
| "id": "456", | ||
| "title": "Les Misérables", | ||
| "author": "Victor Hugo" | ||
| } | ||
| // ... 49 more books | ||
| ], | ||
| "total": 342 | ||
| } | ||
| ``` | ||
|
|
||
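The arithmetic behind the request above can be sketched in a few lines. This is an illustration under stated assumptions: `list_page`, `page_count`, the in-memory list, and rejecting values below 1 with `ValueError` are choices made for the example, and the default of 20 simply mirrors the OpenAPI sample in this pull request.

```python
import math

DEFAULT_PAGE_SIZE = 20  # assumption: mirrors the `default: 20` in the OAS sample


def list_page(items: list, page_number: int = 1,
              page_size: int = DEFAULT_PAGE_SIZE) -> dict:
    """Return the 1-indexed page `page_number` in the offset envelope."""
    if page_number < 1 or page_size < 1:
        # Assumption: the guideline does not say how to reject bad input.
        raise ValueError("pageNumber and pageSize must be at least 1")
    offset = (page_number - 1) * page_size
    return {"results": items[offset:offset + page_size], "total": len(items)}


def page_count(total: int, page_size: int) -> int:
    """Number of pages a client can jump to, derived from `total`."""
    return math.ceil(total / page_size)
```

For the example request, `pageSize=50&pageNumber=2` skips the first 50 books and returns the next 50. With `total` equal to 342 there are 7 pages, and the last one holds the remaining 42 books.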
| All collections **must** return a paginated response structure, regardless of size. | ||
| ### Small Collections | ||
|
|
||
| For collections that are known to be small (subject to interpretation, but typically fewer than 1000 items), endpoints | ||
| **should** implement true pagination. That way, if the collection grows beyond the expected size in the future, | ||
| pagination is already in place. | ||
| All collections **must** return a paginated response structure, regardless of | ||
| size. For collections that will never meaningfully benefit from pagination, | ||
| endpoints **may** satisfy this requirement by returning all results in a single | ||
| response with an empty or absent `nextPageToken`, without implementing actual | ||
| pagination logic. In other words, the endpoint only wraps the results in the | ||
| pagination envelope. | ||
|
|
||
| However, if the collection is small enough that it doesn't benefit from true pagination, endpoints **may** return all | ||
| results in a single page with an empty `nextPageToken`, without implementing actual pagination logic. | ||
| However, if there is any reasonable chance the collection grows beyond a small | ||
| size (typically a few hundred to low thousands of items), endpoints **should** | ||
| implement true pagination from the start. Retrofitting pagination onto a | ||
| collection that clients already consume as a single page is a breaking change. | ||
|
|
||
| ### Traditional offset-based pagination | ||
| ## Interface Definitions | ||
|
|
||
| **Important:** _New_ APIs **must not** use traditional offset-based pagination. If offset-based pagination is required, | ||
| _new_ APIs **must** use [token-based offset pagination](#token-based-offset-pagination) instead. | ||
| ### Cursor Pagination | ||
|
|
||
| This section documents traditional offset-based pagination for backwards compatibility with existing APIs. Migration to | ||
| a different pagination strategy is highly encouraged, although not required (_yet_). | ||
| {% tab proto %} | ||
|
|
||
| When implementing traditional offset-based pagination (existing APIs only): | ||
| {% tab oas %} | ||
|
|
||
| * Request schemas for collections **must** define an integer `offset` query parameter, allowing users to specify the | ||
| number of results to skip before returning results. | ||
| * The `offset` field **must not** be required and **must** default to `0`. | ||
| * Request schemas for collections **must** define an integer `limit` query parameter, allowing users to specify the | ||
| maximum number of results to return. | ||
| * The `limit` field **must not** be required. | ||
| * If the request does not specify `limit`, the API **must** choose an appropriate default. | ||
| * Response messages **may** include a `total` field indicating the total number of results available, though this | ||
| **should** be avoided if the calculation is expensive. | ||
| * The API **may** return fewer results than the number requested (including zero results), even if not at the end of the | ||
| collection. | ||
| {% sample 'cursor.oas.yaml', '$.paths./publishers/{publisher_id}/books.get' %} | ||
|
|
||
| {% endtabs %} | ||
|
|
||
| ### Offset Pagination | ||
|
|
||
| {% tab proto %} | ||
|
|
||
| {% tab oas %} | ||
|
|
||
| {% sample 'offset.oas.yaml', '$.paths./publishers/{publisher_id}/books.get' %} | ||
|
|
||
| {% endtabs %} | ||
|
|
||
| ## Rationale | ||
|
|
||
| ### Cursor-based pagination | ||
| ### Preferring cursor over offset | ||
|
|
||
| Cursor-based pagination is generally better and more efficient than offset-based pagination. Cursor-based pagination | ||
| maintains consistent performance regardless of dataset size, while offset-based pagination degrades as offsets increase. | ||
|
|
@@ -153,26 +208,13 @@ even when data changes between requests, preventing items from being skipped or | |
| collections that are frequently updated. These advantages make cursor-based pagination the preferred approach for _most_ | ||
| use cases. | ||
|
|
||
| ### Token offset vs offset pagination | ||
|
|
||
| Using tokens makes the API flexible for the future. It allows switching to cursor-based pagination internally without | ||
| breaking the API contract. All paginated endpoints (cursor and offset) work the same way from the client's perspective. | ||
| Tokenizing offset pagination prevents users from manipulating offsets arbitrarily to access data in unintended ways. And | ||
| it also prevents users from changing filters/sorts mid-pagination, which can cause inconsistent results. These benefits | ||
| make token-based offset pagination better than traditional offset-based pagination. | ||
|
|
||
| ### Avoid traditional offset-based pagination | ||
|
|
||
| Traditional offset-based pagination has several significant limitations. Performance degrades with large offsets, as the | ||
| database must skip many rows before returning results. Results can be inconsistent if data changes between requests; | ||
| items may be skipped or duplicated as users page through results. It is not suitable for real-time data or frequently | ||
| updated collections. Also, users can manipulate offsets arbitrarily to access data in potentially unintended ways. These | ||
| limitations are why new APIs must use either cursor-based pagination or token-based offset pagination instead. | ||
|
|
||
| ## Changelog | ||
|
|
||
| * **2026-02-23**: Change guidance to allow both offset and cursor. Remove the token offset option. Add guidance on when | ||
| to choose each method. | ||
| * **2026-01-30**: Enforce `camelCase`, not `snake_case` for query parameters | ||
| * **2025-12-15**: Added guidance on token-based offset pagination for new APIs, small collection handling, and clarified that new APIs must use cursor-based or token-based offset pagination only. | ||
| * **2025-12-15**: Added guidance on token-based offset pagination for new APIs, small collection handling, and clarified | ||
| that new APIs must use cursor-based or token-based offset pagination only. | ||
| * **2025-12-10**: Initial creation, adapted from [Google AIP-158][] and aep.dev [AEP-158][]. | ||
|
|
||
| [Google AIP-158]: https://google.aip.dev/158 | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| openapi: 3.1.0 | ||
| info: | ||
|   title: Books API | ||
|   version: 1.0.0 | ||
| paths: | ||
|   /publishers/{publisher_id}/books: | ||
|     get: | ||
|       summary: List books for a publisher | ||
|       operationId: listBooks | ||
|       parameters: | ||
|         - name: publisher_id | ||
|           in: path | ||
|           required: true | ||
|           schema: | ||
|             type: string | ||
|         - name: pageSize | ||
|           in: query | ||
|           required: false | ||
|           schema: | ||
|             type: integer | ||
|             default: 20 | ||
|           example: 50 | ||
|           description: > | ||
|             The maximum number of results to return. The API will choose an | ||
|             appropriate default if not specified. The API may return fewer | ||
|             results than requested, even if not at the end of the collection. | ||
|         - name: pageToken | ||
|           in: query | ||
|           required: false | ||
|           schema: | ||
|             type: string | ||
|           example: abc123xyz | ||
|           description: > | ||
|             An opaque, URL-safe token used to advance to the next page of | ||
|             results. This value is obtained from the `nextPageToken` field of | ||
|             a previous response. Must not be inspected or constructed by | ||
|             clients. | ||
|       responses: | ||
|         '200': | ||
|           description: A paginated list of books. | ||
|           content: | ||
|             application/json: | ||
|               schema: | ||
|                 type: object | ||
|                 properties: | ||
|                   results: | ||
|                     type: array | ||
|                     items: | ||
|                       $ref: '#/components/schemas/Book' | ||
|                     description: > | ||
|                       The list of books for the current page. | ||
|                   nextPageToken: | ||
|                     type: string | ||
|                     description: > | ||
|                       An opaque token used to retrieve the next page of results. If | ||
|                       absent or empty, the end of the collection has been reached. Pass | ||
|                       this value as the `pageToken` query parameter in a subsequent | ||
|                       request to retrieve the next page. | ||
| components: | ||
|   schemas: | ||
|     Book: | ||
|       description: A representation of a single book. | ||
|       properties: | ||
|         name: | ||
|           type: string | ||
|           description: | | ||
|             The name of the book. | ||
|             Format: publishers/{publisher_id}/books/{book_id} | ||
|         isbn: | ||
|           type: string | ||
|           description: | | ||
|             The ISBN (International Standard Book Number) for this book. | ||
|         title: | ||
|           type: string | ||
|           description: The title of the book. | ||
|         authors: | ||
|           type: array | ||
|           items: | ||
|             type: string | ||
|           description: The author or authors of the book. |
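On the client side, the `pageToken` and `nextPageToken` contract in the schema above could be consumed roughly like this. The sketch is hypothetical: `iterate_all`, the `fetch_page` callable, and the `fake_fetch` stub with its token strings stand in for a real HTTP call and are not part of the specification.

```python
from typing import Callable, Iterator, Optional


def iterate_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every result, following `nextPageToken` until it is absent or empty."""
    page_token: Optional[str] = None
    while True:
        response = fetch_page(page_token)
        yield from response.get("results", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            # An absent or empty token marks the end, not an empty `results` list.
            return


def fake_fetch(page_token: Optional[str]) -> dict:
    """Stand-in for GET /publishers/{publisher_id}/books (hypothetical data)."""
    pages = {
        None: {"results": [{"title": "A"}], "nextPageToken": "t1"},
        # A page may be empty even when more data follows.
        "t1": {"results": [], "nextPageToken": "t2"},
        "t2": {"results": [{"title": "B"}]},
    }
    return pages[page_token]


titles = [book["title"] for book in iterate_all(fake_fetch)]
```

The loop stops on the token rather than on an empty page because the API may return fewer results than requested, including zero, before the end of the collection.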
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| openapi: 3.1.0 | ||
| info: | ||
|   title: Books API | ||
|   version: 1.0.0 | ||
| paths: | ||
|   /publishers/{publisher_id}/books: | ||
|     get: | ||
|       summary: List books for a publisher | ||
|       operationId: listBooks | ||
|       parameters: | ||
|         - name: publisher_id | ||
|           in: path | ||
|           required: true | ||
|           schema: | ||
|             type: string | ||
|         - name: pageSize | ||
|           in: query | ||
|           required: false | ||
|           schema: | ||
|             type: integer | ||
|             default: 20 | ||
|           example: 50 | ||
|           description: > | ||
|             The maximum number of results to return. The API will choose an | ||
|             appropriate default if not specified. The API may return fewer | ||
|             results than requested, even if not at the end of the collection. | ||
|         - name: pageNumber | ||
|           in: query | ||
|           required: false | ||
|           schema: | ||
|             type: integer | ||
|             default: 1 | ||
|           example: 1 | ||
|           description: > | ||
|             The page number to return, 1-indexed. Defaults to 1 if not | ||
|             specified. | ||
|       responses: | ||
|         '200': | ||
|           description: A paginated list of books. | ||
|           content: | ||
|             application/json: | ||
|               schema: | ||
|                 type: object | ||
|                 properties: | ||
|                   results: | ||
|                     type: array | ||
|                     items: | ||
|                       $ref: '#/components/schemas/Book' | ||
|                     description: > | ||
|                       The list of books for the current page. | ||
|                   total: | ||
|                     type: integer | ||
|                     description: > | ||
|                       The total number of results available. This field is optional and | ||
|                       may not always be present, as calculating it can be expensive. | ||
| components: | ||
|   schemas: | ||
|     Book: | ||
|       description: A representation of a single book. | ||
|       properties: | ||
|         name: | ||
|           type: string | ||
|           description: | | ||
|             The name of the book. | ||
|             Format: publishers/{publisher_id}/books/{book_id} | ||
|         isbn: | ||
|           type: string | ||
|           description: | | ||
|             The ISBN (International Standard Book Number) for this book. | ||
|         title: | ||
|           type: string | ||
|           description: The title of the book. | ||
|         authors: | ||
|           type: array | ||
|           items: | ||
|             type: string | ||
|           description: The author or authors of the book. |
Is it worth adding somewhere (either to this bullet or elsewhere) that databases sometimes support cursors natively (e.g. Datastore)?
Just about every modern database supports cursor-based pagination natively today. Saying something like "only some databases natively support cursors" implies others don't, which would be misleading.
I think I understand what you are getting at, though. Is it more that NoSQL and sharded databases tend to make offset pagination actively painful, or explicitly don't support it?
So how about something more like this?
- The underlying database is NoSQL or sharded. These databases are often designed around cursor-style access and may not support offset scanning at all or do so at significant performance cost.