/complete #45
Comments
I assume you also mean /organizations/everything would give you a list of ALL Organizations and ALL of their subresources? If so then you probably want to add paging to this method, in the spec, as smart implementers would certainly want to prevent a request from slamming their database in the case of large datasets. |
I think yesterday @klambacher told me that pagination would be a serious
hindrance when requesting ALL. Am I mixing issues here?
|
My concern is that pagination is very high risk in data synchronization activities. Accuracy relies on the data gathering activity being an atomic operation, with the records in the set remaining unchanged while the operation occurs. Often pagination is implemented through a requery, which can result in subsequent pages not being an exact continuation of the previous request. It's not that it can't be managed, but I would be unlikely to trust a third party's pagination implementation for a bulk operation without understanding the implementation details. My preference for bulk data sync is to have a low-data query (ids and types) that identifies ALL records to collect via search, establishing a set to operate on. Then the full records can be fetched all at once or in batches safely. |
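For illustration, the two-phase approach described above could be sketched roughly as follows. This is a minimal sketch, not part of the spec: `snapshot_ids`, `fetch_in_batches`, `list_ids`, `fetch_many`, and the in-memory `STORE` are all hypothetical stand-ins for whatever id-only search and record lookup an implementation actually offers.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]


def snapshot_ids(list_ids: Callable[[], Iterable[str]]) -> List[str]:
    """Phase 1: capture the full set of ids in one lightweight query."""
    return list(list_ids())


def fetch_in_batches(
    ids: List[str],
    fetch_many: Callable[[List[str]], List[Record]],
    batch_size: int = 100,
) -> Iterator[Record]:
    """Phase 2: fetch full records for the fixed id set, batch by batch."""
    for start in range(0, len(ids), batch_size):
        yield from fetch_many(ids[start:start + batch_size])


# Hypothetical in-memory store standing in for the remote API (assumption).
STORE: Dict[str, Record] = {
    "org-1": {"id": "org-1", "type": "organization", "name": "Example Org A"},
    "org-2": {"id": "org-2", "type": "organization", "name": "Example Org B"},
    "org-3": {"id": "org-3", "type": "organization", "name": "Example Org C"},
}


def list_ids() -> List[str]:
    """Stand-in for a low-data search that returns ids only."""
    return sorted(STORE)


def fetch_many(ids: List[str]) -> List[Record]:
    """Stand-in for a lookup by explicit ids; skips ids deleted since phase 1."""
    return [STORE[i] for i in ids if i in STORE]


ids = snapshot_ids(list_ids)
records = list(fetch_in_batches(ids, fetch_many, batch_size=2))
```

Because each batch is requested by explicit id rather than by page offset, a record added or removed mid-sync cannot shift the contents of later batches.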
@klambacher yep that is a reasonable approach. Then there's also the topic about bulk transfers, for consideration: #58 |
Couple of concepts in play here:
In the end, adding a /everything, /complete, or /all path for all core objects satisfies the need for a complete representation of a resource and its sub-resources, with basic pagination. There will be a GET and POST at this level as well, but they should not be used for volume or bulk loading -- just app-level integrations. I have opted for a path-based design (/everything, /complete, /all) over a ?scope=all parameter, for caching and performance. When it is a path, caching becomes much more of a reality at the web server level, whereas dynamic queries are pulled each time. This assists operators with their performance concerns and complements the separation of system bulk-loading concerns. At this point I'm going with /complete, so:
Keep core /[resource] 100% reflecting HSDS, and /[resource]/complete reflecting the top-level and sub-level resources. I leave it to the HSDS data team to accept the new schema back into master or not. |
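To make the caching argument concrete, here is a rough sketch of a response cache keyed purely by request path. It is only an illustration under stated assumptions: the `render` function, the sample `ORGANIZATIONS` data, and the choice of `services` as the sub-resource are all hypothetical, and a real deployment would cache at the web server or proxy rather than in application code.

```python
import json
from functools import lru_cache

# Hypothetical sample data; "services" stands in for any sub-resource.
ORGANIZATIONS = [
    {
        "id": "org-1",
        "name": "Example Org",
        "services": [{"id": "svc-1", "name": "Food bank"}],
    }
]
SUB_RESOURCES = ("services",)  # assumed sub-resource keys
BUILD_COUNT = {"n": 0}  # counts how often a response is actually built


@lru_cache(maxsize=None)
def render(path: str) -> str:
    """Build the JSON body for a path; repeat calls are served from cache."""
    BUILD_COUNT["n"] += 1
    complete = path.rstrip("/").endswith("/complete")
    rows = [
        org if complete
        else {k: v for k, v in org.items() if k not in SUB_RESOURCES}
        for org in ORGANIZATIONS
    ]
    return json.dumps(rows, sort_keys=True)


flat = render("/organizations")            # flat representation
full = render("/organizations/complete")   # includes sub-resources
again = render("/organizations/complete")  # cache hit, nothing rebuilt
```

With a fixed path there is exactly one cache entry per representation, whereas an open-ended query string can produce many distinct keys for what is effectively the same response.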
Thanks @kinlane Two thoughts: (1) I have a preference for I.e. it's possible to have a 'full' response that is not necessarily 'complete' with everything that is known about a service or location. (2) Is the choice of sub-level resources programmatically derived from the foreign-key relationships in HSDS, or have you had to make editorial choices here? If there are updates we should put back into HSDS to enable a good Single Source of Truth approach to these major and sub-resource relationships, I'm happy to look at those. |
Thanks for feedback.
|
@kinlane can you outline (or point me to) the reasoning behind what information is included and what is not in the default (i.e. not /complete) representation of resources? One concern I have seen related to this is that no two clients of the API will agree on what is important and not important in the response, so the |
The reasoning was to keep the default reflecting HSDA, and to keep it flat so CSV could be negotiated -- lowest bar, directly to spreadsheet. Then provide everything. Beyond that, there was no consensus on other views or on an approach to allow for schema filtering, and with so much on the table for that version it was pushed to the future for further discussion. Caching is definitely one of the concerns. Open to feedback and continued discussion on whether it should be in v1.3. Thanks! |
Complete will go away in v2.0 in favor of the addition of a resources property that allows the user to choose whether they want any sub-resources returned, with the default being no sub-resources. |
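As a rough illustration of that behaviour, a server might shape a response along these lines. This is a sketch only: the function name `shape_organization`, the way the `resources` value is passed in, and the sub-resource names are assumptions, not the v2.0 definition.

```python
from typing import Dict, Iterable, Optional

# Assumed sub-resource names for an organization; the real list may differ.
SUB_RESOURCES = ("contacts", "locations", "services")


def shape_organization(
    org: Dict[str, object],
    resources: Optional[Iterable[str]] = None,
) -> Dict[str, object]:
    """Return org with only the requested sub-resources; the default is none."""
    wanted = set(resources or ())
    unknown = wanted - set(SUB_RESOURCES)
    if unknown:
        raise ValueError(f"unknown sub-resources: {sorted(unknown)}")
    return {
        key: value
        for key, value in org.items()
        if key not in SUB_RESOURCES or key in wanted
    }


# Hypothetical sample record.
ORG = {
    "id": "org-1",
    "name": "Example Org",
    "contacts": [{"id": "con-1"}],
    "services": [{"id": "svc-1"}],
}

flat = shape_organization(ORG)
with_services = shape_organization(ORG, resources=["services"])
```

Calling it with no `resources` yields the flat record, which matches the stated default of returning no sub-resources.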
I didn't quite see any feedback regarding the options on the table for allowing a more comprehensive approach to schema filtering, so I'm just going to go with the most intuitive option and add an /everything path to /organizations, /locations, /contacts, and /services.
I.e., when you do /organizations/ you get a simple, flat schema.
When you do /organizations/everything you get a complete schema with all subresources.
We can revisit header options for this, and other approaches, further down the road; this should address the concerns on the table.