Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching-Options via REST-API #1650

Closed
stefan-niedermann opened this issue Apr 1, 2020 · 11 comments
Closed

Caching-Options via REST-API #1650

stefan-niedermann opened this issue Apr 1, 2020 · 11 comments

Comments

@stefan-niedermann
Copy link
Member

stefan-niedermann commented Apr 1, 2020

@desperateCoder @juliushaertl

The REST-API works great, but it is a huge problem to cache the data. This causes a full synchronization of a client (hmm, let's say Nextcloud Deck for Android 😉) to take up to 30 seconds (with only a few test-boards and -cards) and causing very much traffic.

This issue is about providing the necessairy capabilities for proper caching via the REST-API.

Implement ETag and If-None-Match

I took some time and thought about it. The obvious way is to implement ETag- and If-None-Match-Headers. The basic idea is, that the REST-API has a checksum about the current state of a response and sends it to the client with the first response. The client will send the value of the retrieved ETag-Header as value of a If-None-Match-header at the next request and the server will simply respond with a 304 status code if it matches the current ETag.

Desired behavior

  1. Only need one request given nothing has changed on the server.
  2. Walking in tree-style through the entity hierarchy - only need request for the subtree items that actually have changed

grafik

Problem with the standard implementation

Though there is one problem with the standard implementation: We have multiple entities in our hierarchy:

  1. Account
  2. Board
  3. List
  4. Card

The standard implementation generates a hash from all the properties of the entity (e.g. in the case of a board this might only be title and color but not the contained lists and cards). Therefore we can save the traffic generated by each request but we still have one request for each entity (One per account, one per board, one per list, one per card).

Proposal 1: Implement ETag in a "non-standard" way

The ETag of each entity could not only be generated from the direct properties, like

BoardETag: etag(title, color)

but also from all descendant entities. Sample:

BoardETag: etag(title, color, ListETag[])

where the etag of a list item is generated like

ListETag: etag(title, CardETag[])

and so on.

Pros

This is my personal preferred solution as

  • One does not have to deal with multiple custom HTTP headers
  • One could argue, that the descendant entities actually are properties of the current entity, but they are just not send to the clients in their responses 😉
  • Quoting from Wikipedia: The method by which ETags are generated has never been specified in the HTTP specification. which means, that we are not sooo "non-standardish" but it's just not so common maybe.

Contra

It might be an issue though that one only is interested in boards, then the ETag would change even if nothing of the direct board properties have changed. The negative effect would be more traffic in a use-case where only specific entities are interested rather than performing a whole synchronization

Proposal 2: Implement additional custom headers to achieve the desired behavior

This solution would require to implement ETags in the common way and additionally providing a separate second custom header like

X-ETag-Descendants

and a second custom header like

X-If-None-Descendants-Match

Pros and Contra

Are inversed to Proposal 1.

@desperateCoder
Copy link

I'd like to add one more thing here: The reason why the If-Modified-Since header isn't enough is, that we still won't get deletes, since the API only can provide the data it still has. If the data is physically deleted, i won't get notified about that. The ETag would be my only chance to recognize, that there are changes.

@juliushaertl
Copy link
Member

Thanks a lot for the well thought summary. I also would argue that introducing a custom header doesn't make much sense here. Since as you mention the etag format is basically unspecified, we could probably come up with combining the benefits of 2 into the general etag approach. We would basically always return two hashes as etag "ETagData-ETagDescendants". With that we could return the fitting status for both cases on the server side depending if just the ETagData or both EtagData and ETagDescendants are provided in the If-None-Match header.

@stefan-niedermann
Copy link
Member Author

So basically we define our custom format for the ETag-value by concatenating two separate hashes (splitted by a --character).

If one does not understand the format, the ETags will still work, and if one reads the documentation he can use it for further optimizations on the clients.

I really like the idea!

@stefan-niedermann
Copy link
Member Author

Summary of our chat:

We will go for Proposal 1, but instead of mixing the ETags of the descendants into the hash, we will concatenate the two hashes (splitted by a --character).

Format: ETag-ETagDescendants
Example: 2da850844f302c1772db641d84ae6-7a87bbcee859628d71ba01c1f31221

The only valid format for the If-None-Match-header is the format mentioned above.

Optionally we can think about only providing ETag- or -ETagDescendants for the If-None-Match-header and modify the response depending on the given input, but this is no goal for the first implementation.

@juliushaertl
Copy link
Member

Moving to 1.1.0

@rullzer
Copy link
Member

rullzer commented May 3, 2020

Maybe I'm missing something but I fail to see how this helps. Unless you have the case where nothing changed. However mostlikely a card of list has changed.

What you want is an easy way to detect which board/card/list etc changed.

Would it not be better to have content etags in the API responses.
And then for exaple in the boards response a field that has the etag of that board. That way the client can decide themselves what they need to fetch.

@stefan-niedermann
Copy link
Member Author

stefan-niedermann commented May 3, 2020

@rullzer have a look at the diagram in the issue description. Currently we are walking the whole tree on every sync. This will help detecting us whether or not a "route" of this tree has changed. In an ideal case (nothing changed) we will only need one request. Worst case (each and every card has changed) would cause the same amount of requests as today.

And in all other cases (let's say, only 3 cards from one board have changed) we can save the requests for all the cards, stacks and board where the 3 cards are not in.

@stefan-niedermann
Copy link
Member Author

I would ♥️ to tackle this issue in some kind of a remote-hackweek together 🙂 Anyone interested in planning to spend a day or so together in a talk session and work together on this?

Doesn't have to be immediately, but we should rethink the concept we worked out and maybe get a bit more concrete regarding the actual doing. Personal goal would be to implement this for at least one endpoint (e. g. /boards/{boardId}/stacks) or so.

@juliushaertl
Copy link
Member

I'd also have a use case for ETags in general for #2159 with an addition as mentioned in #925 to also use them for detecting previous changes when updating data.

@stefan-niedermann Let's have a chat next week and see how we can move this forward 😉

@stefan-niedermann
Copy link
Member Author

@stefan-niedermann Let's have a chat next week and see how we can move this forward 😉

Finally! awesome!

I also thought about it and the rather complex approach we described above is great because it would not only minimize the traffic and data, but also reduce the number of requests.

Though i would be absolutely happy to first make ETags the "default" way, as it would incredibly speed up the synchronization. I am really excited and looking forward to a chat. 😆 👍

@juliushaertl juliushaertl added this to the 1.2.0 milestone Nov 11, 2020
@juliushaertl
Copy link
Member

Implemented with #2245

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants