RFC: DSS Bundle Enumeration by xbrianh · Pull Request #101 · HumanCellAtlas/dcp-community

xbrianh · 2019-08-09T13:58:53Z

Proposed addition to the DSS API to enumerate bundles independently of the DSS Elasticsearch metadata index.

Last call for community review: Sept 24

rfcs/text/0000-dss-bundle-enumeration.md

diekhans · 2019-08-22T00:04:08Z

Since there is no filtering mechanism, other than whatever the prefix does, it seems like the only functional use case is to get a very large list of all bundles. It might be nice to state that more explicitly if true.

brianraymor · 2019-08-30T01:23:49Z

@kislyuk - not sure why this is tagged rfc-community-review if it's not actually in community review?

diekhans · 2019-09-10T16:44:02Z

rfcs/text/0000-dss-bundle-enumeration.md

+
+## Detailed Design
+
+A new bundle enumeration endpoint, `GET /bundles`, will be introduced, taking replica and prefix parameters. A listing


maybe call this "UUID prefix"

I've updated the text to better reflect the usage of prefix as a "uuid prefix".

diekhans · 2019-09-10T16:47:43Z

rfcs/text/0000-dss-bundle-enumeration.md

+all all bundles having UUIDs beginning with `prefix` will be returned directly from object storage. The prefix
+parameter is intended to facilitate parallelized listings. Pagination semantics and all other semantics of this route
+will be in line with the established conventions of the DSS API.
+


Without some kind of filtering, this seems like a very heavyweight endpoint to use. A couple of things come to mind:

Filter by update date/time. This would allow obtaining only the bundles that have changed since the last time the query was run.

Filter by bundle type. Well, first we have to have bundle types.

This endpoint is intended for heavyweight use by downstream indexers.

Also, an incremental approach seems preferable: if filtering becomes desirable in the future, it can be added to the endpoint.

Agreed that this is easy to add downstream. Assuming a full dump is what existing users want, then my speculative use is not a real use case ;-)

@diekhans you raise a good point, but as @xbrianh pointed out, the use case here is an unfiltered bulk pull of all bundle IDs for external indexing. We did look for a way to use a "lightweight" database to do filtering using our established filter language (JMESPath), but didn't find any suitable database/indexing engine for such a task.
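For illustration, the parallelized listing that the `prefix` parameter is meant to facilitate could look roughly like the sketch below. It is not part of the RFC: `list_bundles` is a hypothetical stand-in for a client call to `GET /bundles` with a given prefix (with pagination already handled), and splitting the UUID space into lowercase hex prefixes of a fixed length is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable, Iterable, List

HEX_DIGITS = "0123456789abcdef"


def uuid_prefixes(length: int = 2) -> List[str]:
    """Return every lowercase hex prefix of the given length (16**length of them)."""
    return ["".join(digits) for digits in product(HEX_DIGITS, repeat=length)]


def enumerate_all(
    list_bundles: Callable[[str], Iterable[str]],  # hypothetical wrapper around GET /bundles
    prefix_length: int = 2,
    workers: int = 8,
) -> List[str]:
    """List bundles for each prefix concurrently and merge the results."""
    prefixes = uuid_prefixes(prefix_length)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker lists one disjoint slice of the UUID space.
        pages = pool.map(lambda prefix: list(list_bundles(prefix)), prefixes)
        return [bundle for page in pages for bundle in page]
```

Because the prefixes are disjoint and together cover every possible UUID, each bundle is returned by exactly one worker, so the merged list needs no de-duplication.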


brianraymor · 2019-09-18T01:22:40Z

rfcs/text/0000-dss-bundle-enumeration.md

+
+* As a downstream service developer, I would like to check if my index contains all the bundles in the DSS.
+
+## Detailed Design


I was expecting something a bit more swagger-y rather than a narrative description in Detailed Design for an API.

@brianraymor It's not clear to me how to address your comment. Perhaps you have something in mind similar to the API endpoint descriptions found in the Deletion RFC.

However, those additions will not necessarily improve the clarity and actionability of this document, which defines a simple extension to the DSS API in language that I believe is clear to the developers who will implement it. Do you feel there are technical details missing, or that there is ambiguity that needs to be cleared up?

The Deletion RFC more closely meets my expectations for an RFC as design document. Another approach is how Azure defines their REST APIs. Any reviewer/developer should be able to read this RFC and understand the DSS API in detail - not just the implementers. This currently reads more like what we called one-pagers at Microsoft.
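As a concrete reading of the user story quoted in this thread (checking that a downstream index contains all the bundles in the DSS), the comparison reduces to a set difference over two enumerations. The sketch below is illustrative only: `index_drift` is a hypothetical helper name, and representing each bundle as a single identifier string is an assumption.

```python
from typing import Iterable, Set, Tuple


def index_drift(
    dss_bundles: Iterable[str],
    indexed_bundles: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Compare a full DSS enumeration with the contents of a downstream index.

    Returns (missing, stale): bundles in the DSS but not in the index,
    and bundles in the index that the DSS no longer lists.
    """
    dss = set(dss_bundles)
    indexed = set(indexed_bundles)
    return dss - indexed, indexed - dss
```

An empty `missing` set means the index covers every enumerated bundle.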

samanehsan

Looks good! Should this be re-labeled with oversight-review now that the community-review date has passed?

Bundle Enumeration Draft

9b1f71e

xbrianh added the Architecture label Aug 9, 2019

kislyuk added the rfc-community-review label Aug 13, 2019

Bento007 reviewed Aug 16, 2019



xbrianh added Data Store and removed Data Store labels Sep 10, 2019

prefix explained

aebf3d3

xbrianh force-pushed the bhannafi-dss-bundle-enumeration.md branch from 68fa2d5 to aebf3d3 Compare September 10, 2019 16:31

diekhans reviewed Sep 10, 2019


kislyuk approved these changes Sep 16, 2019


brianraymor reviewed Sep 17, 2019



brianraymor reviewed Sep 18, 2019


brianraymor mentioned this pull request Sep 18, 2019

RFC: DSS Events #102

Open

xbrianh self-assigned this Sep 18, 2019

address comments

18ddfca

xbrianh force-pushed the bhannafi-dss-bundle-enumeration.md branch from 91785bf to 18ddfca Compare September 19, 2019 15:20

samanehsan approved these changes Sep 30, 2019




6 participants
