Add a page_size keyword to container.values(), keys(), items() #645
Comments
Thanks for writing this up. I have no notes on the first part of the proposal ( I would suggest a minor change to the second part. The methods What about adding a method to This extends our own iterables, which already have custom behavior in |
See also #42
Something to be mindful of with paging is whether the results could change from call to call. For a fixed set of results, will they always be returned in the same order? If not, maybe it's simply the client's responsibility to sort as needed before paginating. For a writeable catalog, how does an added or deleted item affect the "pages" that get returned? I think I've seen schemes that include an item id in the pagination request, as in "give me the 10 items before uid-xxxx" or "give me the next 20 items after uid-yyyy". Others use a database-like cursor id to track the position in the results stream.
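To make the concern above concrete, here is a minimal sketch contrasting offset-based paging with the "items after uid-yyyy" scheme. It is not Tiled's implementation: `offset_page` and `keyset_page` are hypothetical helpers, and the sketch assumes the keys are kept in a stable sorted order.

```python
import bisect


def offset_page(keys, offset, limit):
    """Offset paging: return `limit` keys starting at position `offset`."""
    return keys[offset:offset + limit]


def keyset_page(keys, after, limit):
    """Keyset paging: return up to `limit` keys strictly greater than `after`.

    Assumes `keys` is sorted. `after=None` means start from the beginning.
    """
    start = 0 if after is None else bisect.bisect_right(keys, after)
    return keys[start:start + limit]


# Hypothetical catalog keys, sorted.
keys = ["a", "b", "c", "d", "e", "f"]

first_offset = offset_page(keys, 0, 3)     # first page via offset
first_keyset = keyset_page(keys, None, 3)  # first page via keyset

# An item is deleted between the two page requests.
keys.remove("a")

second_offset = offset_page(keys, 3, 3)                 # positions shifted
second_keyset = keyset_page(keys, first_keyset[-1], 3)  # resumes after "c"
```

With offset paging the second request silently skips "d", because every remaining key moved up one position after the deletion. The keyset request resumes right after the last key it saw, so it is unaffected.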
@padraic-shafer These are good questions, but I want to be clear that I'm not implementing paging myself. There is already a page_size parameter in the HTTP interface to Tiled that takes care of the paging, and presumably already has an opinion about both of your questions. What this issue proposes is just to have the Python API start using the page_size parameter in a more dynamic way, rather than have page_size always be set to 100.
@danielballan Adding a page_size method seems fine, if you don't want to mess too much with the dictview format. It fits decently well with the paradigm of adding ever more functions to narrow a catalog down. I.e,
I copied @padraic-shafer's comment (a good comment!) to #647 and responded there. We can keep this issue focused on exposing, via the Python client, the existing pagination options supported by the server. For now, just adding a line to the reference docs would be fine, @cjtitus: tiled/docs/source/reference/python-client.md Lines 32 to 33 in c76d1b3
Later we can include this in a new how-to guide on performance-tuning for metadata requests.
Summary
We should add a page_size keyword to the values(), keys(), and items() methods of Tiled containers to allow for advanced control of the amount of data downloaded at one time.
Background
While testing Python code to access Tiled repositories, it was noticed that catalog.values()[index] is much slower than catalog[uid], despite being the same basic sort of operation. Indeed, the Tiled documentation (https://blueskyproject.io/tiled/reference/python-client.html) even claims to support "efficient random access".
Digging into the Tiled requests, it seems that all calls to catalog.values() request the default page_size of 100. This means that 100 values will be fetched, even if we know we are only going to access one.
Proposal
I propose a two-fold change to the data access. First, catalog._keys_slice and catalog._items_slice (which actually fetch the data) would be updated so that they request a page_size of min(stop - start, DEFAULT_PAGE_SIZE), so that a slice shorter than the default page never downloads more items than it needs. This would instantly make all calls to values(), items(), and keys() more efficient.
Second, for advanced use cases, it would be easy to add a page_size keyword to the values(), items(), and keys() functions that could override the default page size. This would be especially useful when populating a table in a GUI with search results from a catalog, where the correct page_size should really be equal to the number of rows that the GUI requests to update.
Comments welcome.
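A rough sketch of both parts of the proposal follows. It reads the first part as capping the request at the slice length, i.e. min(stop - start, DEFAULT_PAGE_SIZE). Nothing here is Tiled code: `choose_page_size` and `fetch_slice` are hypothetical names, and a plain Python list stands in for the server so that requests and downloaded items can be counted.

```python
DEFAULT_PAGE_SIZE = 100  # the client default described in this issue


def choose_page_size(start, stop, page_size=None, default=DEFAULT_PAGE_SIZE):
    """Pick the page size to request for the slice [start, stop).

    An explicit `page_size` (part two of the proposal) always wins.
    Otherwise request no more than the slice needs (part one).
    """
    if page_size is not None:
        if page_size < 1:
            raise ValueError("page_size must be positive")
        return page_size
    return max(1, min(stop - start, default))


def fetch_slice(data, start, stop, page_size=None):
    """Hypothetical stand-in for a paged fetch against a server.

    Returns (items, n_requests, n_downloaded).
    """
    size = choose_page_size(start, stop, page_size)
    items, n_requests, n_downloaded = [], 0, 0
    offset = start
    while offset < stop:
        page = data[offset:offset + size]  # one simulated HTTP request
        n_requests += 1
        n_downloaded += len(page)
        if not page:  # ran off the end of the data
            break
        items.extend(page)
        offset += len(page)
    return items[:stop - start], n_requests, n_downloaded


catalog = list(range(1000))  # pretend these are catalog entries

# Fixed page size of 100 for a single-item access.
old_single = fetch_slice(catalog, 5, 6, page_size=100)
# Part one: page size follows the slice length.
new_single = fetch_slice(catalog, 5, 6)
# Part two: a GUI asking for 250 rows can request them in one page.
default_rows = fetch_slice(catalog, 0, 250)
gui_rows = fetch_slice(catalog, 0, 250, page_size=250)
```

In this toy model the single-item access downloads 1 item instead of 100, and the 250-row request drops from three requests (300 items downloaded) to one request of exactly 250 items.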