readPreference=secondaryPreferred breaks GET /cve use cases #1799

@ElectricNroff

The connection string returned by db.js now includes readPreference=secondaryPreferred:

return `mongodb://${dbLoginPrepend}${dbHost}:${dbPort}/${dbName}?replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false`

Use of the GET /cve endpoint by Secretariat clients has been designed to rely on the following:

  • each call specifies time_modified.gt
  • the endpoint is used in a series, and each item in the series follows the pagination instructions (e.g., after the first call, retrieve page=2, page=3, etc., if they exist)
  • the time_modified.gt value for each item in the series is equal to (or slightly before) the start time of the previous item in the series

Under these conditions, and with the old db.js without secondaryPreferred, it was not possible for any CVE Records to be skipped. This was because the result set for time_modified.gt can only grow if there is ongoing write access to the Cve collection during pagination (i.e., documents can enter the matching set but never leave it). If a new document enters at a sort position before the current page offset, everything is shifted to the right by one. There are no circumstances in which a document can be shifted to the left (i.e., become a member of a page that has already been retrieved).
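The single-node argument above can be checked with a small in-memory sketch. Everything in it is an illustrative assumption rather than the service's actual code: the `getPage` and `insertSorted` helpers, the numeric ids, the ascending sort order, and the page size of 3.

```javascript
// Hypothetical in-memory model of offset pagination against a single node.
// Records are kept in a fixed sort order (assumed here: ascending id).
function getPage(sortedIds, page, pageSize) {
  const start = (page - 1) * pageSize;
  return sortedIds.slice(start, start + pageSize);
}

// Returns a new sorted array with the given id added.
function insertSorted(sortedIds, id) {
  return [...sortedIds, id].sort((a, b) => a - b);
}

const originalIds = [10, 20, 30, 40, 50, 60];
let singleNode = [...originalIds];
const seen = new Set();

// page=1 is read, then a new record lands before the current offset.
getPage(singleNode, 1, 3).forEach((id) => seen.add(id)); // 10, 20, 30
singleNode = insertSorted(singleNode, 15);               // later records shift right
getPage(singleNode, 2, 3).forEach((id) => seen.add(id)); // 30, 40, 50 (30 repeats)
getPage(singleNode, 3, 3).forEach((id) => seen.add(id)); // 60

// None of the records that matched when the series item started is skipped.
const skipped = originalIds.filter((id) => !seen.has(id));
console.log(skipped); // []
```

In this model the insert only causes record 30 to appear twice. The newly written record 15 is not seen in this item, but it was modified after the item started, so the next item's time_modified.gt value still covers it.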

However, with readPreference=secondaryPreferred, this is no longer true and use of the GET /cve endpoint has now become unreliable. A specific API call, such as the one for page=N, can now go to a secondary that has fewer documents than the database node that was used for the page=N-1 call. In other words, because of replication lag, it is now possible that, from the perspective of the API caller, documents leave the matching set. In this case, a CVE Record can be permanently skipped (it is not present on any page during one item in the series, and is also not picked up in a subsequent item that has a later time_modified.gt value).

A CVE Record is also permanently skipped if the page=N-1 API call goes to a secondary that has fewer documents than the database node used for the page=N call. In other words, the node that responds to the page=N API call has no way of knowing that the response to the page=N-1 API call was missing anything.
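Both of the cases above can be reproduced in a minimal two-node model. This is a sketch under stated assumptions, not the service's implementation: the ids, the ascending sort order, `PAGE_SIZE`, and the helpers `fetchPage` and `recordsMissedBy` are all hypothetical, and record 15 stands for a document written before the series item started but not yet replicated to the secondary.

```javascript
// Hypothetical model: two nodes holding the same collection, where the
// secondary has not yet replicated record 15 (replication lag).
const primaryIds = [10, 15, 20, 30, 40, 50, 60];
const laggingSecondaryIds = [10, 20, 30, 40, 50, 60];
const PAGE_SIZE = 3;

function fetchPage(nodeIds, page) {
  const start = (page - 1) * PAGE_SIZE;
  return nodeIds.slice(start, start + PAGE_SIZE);
}

// nodesByPage[i] is the node that happens to serve page i + 1.
// Returns the records on the primary that the caller never received.
function recordsMissedBy(nodesByPage) {
  const seen = new Set();
  nodesByPage.forEach((nodeIds, i) => {
    fetchPage(nodeIds, i + 1).forEach((id) => seen.add(id));
  });
  return primaryIds.filter((id) => !seen.has(id));
}

// page=1 from the primary, page=2 from the lagging secondary:
// record 30 shifts left into a page that was already retrieved.
const missedLaterLag = recordsMissedBy([primaryIds, laggingSecondaryIds]);

// page=1 from the lagging secondary, page=2 and page=3 from the primary:
// record 15 sits on page 1 of the primary, which is never requested again.
const missedEarlierLag = recordsMissedBy([laggingSecondaryIds, primaryIds, primaryIds]);

// Every page from the primary: nothing is missed.
const missedPrimaryOnly = recordsMissedBy([primaryIds, primaryIds, primaryIds]);

console.log(missedLaterLag, missedEarlierLag, missedPrimaryOnly);
```

In this model the first mix misses record 30, the second misses record 15, and the primary-only run misses nothing. The missed record predates the start of the series item, so a later time_modified.gt value does not bring it back.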

In addition, a CVE Record is permanently skipped if a secondary, because it has fewer documents, does not set the nextPage property in its last response.
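The nextPage case can be sketched the same way. The response shape, the rule for when nextPage is set, `PAGE_LIMIT`, and the `respond`, `collectAll`, and `pickNode` names are assumptions made for illustration; only the nextPage property name comes from the description above.

```javascript
const PAGE_LIMIT = 3;

// Hypothetical response shape: nextPage is present only when the serving
// node has records beyond the current page.
function respond(nodeIds, page) {
  const start = (page - 1) * PAGE_LIMIT;
  const response = { records: nodeIds.slice(start, start + PAGE_LIMIT) };
  if (start + PAGE_LIMIT < nodeIds.length) {
    response.nextPage = page + 1;
  }
  return response;
}

// Client loop: keep going while the response advertises a next page.
// pickNode(page) stands in for whichever node the driver selects.
function collectAll(pickNode) {
  const collected = [];
  let page = 1;
  while (page !== undefined) {
    const response = respond(pickNode(page), page);
    collected.push(...response.records);
    page = response.nextPage;
  }
  return collected;
}

const upToDate = [10, 20, 30, 40, 50, 60, 70];
const lagging = [10, 20, 30, 40, 50, 60]; // has not replicated 70 yet

const viaPrimary = collectAll(() => upToDate);
const lastCallLagging = collectAll((page) => (page === 2 ? lagging : upToDate));

console.log(viaPrimary.includes(70), lastCallLagging.includes(70)); // true false
```

When page=2 is served by the lagging node, that node sees exactly two full pages, omits nextPage, and the client stops before record 70 is ever requested.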

Or, more generally, offset-based pagination only works correctly when there is a single source of truth. It does not work in an "eventually consistent" scenario.

(Even if GET /cve_cursor were used, there still needs to be a single source of truth. It has one fewer failure mode than GET /cve but is still, for example, affected by a case where the first API call goes to a secondary that has fewer documents, or the last API call goes to a secondary with fewer documents such that nextPage is not set.)
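As a rough illustration of the parenthetical above, the sketch below assumes, purely for the sake of the example, that a cursor behaves like a keyset position (the last id returned). How GET /cve_cursor is actually implemented is not stated here, and `fetchAfter`, `collectWithCursor`, `nextCursor`, and the sample ids are hypothetical.

```javascript
const CURSOR_PAGE_SIZE = 3;

// Hypothetical keyset-style cursor: the cursor is the last id returned,
// and each call returns the next records with a greater id.
function fetchAfter(nodeIds, cursor) {
  const records = nodeIds.filter((id) => id > cursor).slice(0, CURSOR_PAGE_SIZE);
  const last = records[records.length - 1];
  const hasMore = records.length > 0 && nodeIds.some((id) => id > last);
  return { records, nextCursor: hasMore ? last : undefined };
}

// pickNode(call) stands in for whichever node serves the given call number.
function collectWithCursor(pickNode) {
  const collected = [];
  let cursor = -Infinity;
  let call = 1;
  while (cursor !== undefined) {
    const response = fetchAfter(pickNode(call), cursor);
    collected.push(...response.records);
    cursor = response.nextCursor;
    call += 1;
  }
  return collected;
}

const current = [10, 15, 20, 30, 40, 50, 60];
const behind = [10, 20, 30, 40, 50, 60]; // has not replicated 15 yet

const missing = (got) => current.filter((id) => !got.includes(id));

// A lagging node on the second call no longer shifts anything left.
const secondCallBehind = missing(collectWithCursor((call) => (call === 2 ? behind : current)));
// A lagging node on the first call still loses record 15.
const firstCallBehind = missing(collectWithCursor((call) => (call === 1 ? behind : current)));

console.log(secondCallBehind, firstCallBehind);
```

Under these assumptions the shift-left failure disappears, but a lagging node on the first call still drops record 15, which is consistent with the point that a single source of truth is needed either way.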
