Consider adding notion of document revisions & more coarse document update model / events #9542
Comments
Here is one perf angle to this:
The most likely explanation of the difference: we spend an unbelievable amount of time in application callbacks (DDS event handlers), and the summarizer does not have them. So the ability to reduce the number of updates to the app is key to maintaining performance.
Here is one more angle:
Whiteboard - 231ms
That clearly demonstrates the same info (but from a different angle): app callbacks dominate processing time and are the limiting factor in how quickly we can process ops, and thus in the scalability of the whole system.
We may solve the problem of slow catch-up via a slightly different mechanism:
Whiteboard's async op processing benefits from batching: if the app is too slow, it ends up with a backlog of remote edits to apply, but Whiteboard processes the entire backlog before sending an update to the app. This makes it tend to catch up instead of falling behind, since the per-op cost decreases as it falls behind. That said, there are fixed op processing costs that don't decrease this way (e.g. actually changing the internal tree model), and Whiteboard (and also Fluid?) doesn't have backpressure, so if it falls behind it can just get worse and worse, leaving an unbounded queue of both inbound and outbound ops to process. We need end-to-end backpressure somehow, so the user can be informed or delayed when processing or upload/download is falling behind.
It seems like #9618 could be applied to help separate DDS-caused costs from app-caused costs, which we might use to handle backpressure differently in each case. If the UI for a component can't keep up, unloading its UI and replacing it with a message to the user may have value (like browsers do with unresponsive tabs, but maybe recovering once the op rate slows). But if the DDS itself can't keep up, not just the UI, I think we have to drop out of the collaborators (which makes sure we won't be the summarizer, so someone who can keep up will be), and maybe try to reconnect after the next summary (or once ops really slow down).
Applications (or DDSs) that incur the cost of ops asynchronously (like Whiteboard) may need additional APIs to express that backpressure to Fluid. We may also want a prioritization system as part of this (some app operations are more important to avoid delaying than others).
Is there documentation on how we do (or plan to do) backpressure? I'm curious what the UX is supposed to be when the op rate exceeds what some collaborators can handle.
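To make the batching and backpressure idea above concrete, here is a minimal TypeScript sketch. It is not Whiteboard's or Fluid's actual implementation: `BatchingOpQueue`, `Op`, `highWaterMark`, and both callbacks are hypothetical names invented for illustration. The queue applies the whole backlog before notifying the app once, and raises a pressure signal when the backlog reaches an assumed threshold.

```typescript
// Hypothetical sketch: none of these names come from Fluid Framework or Whiteboard.
type Op = { seq: number; apply: () => void };

class BatchingOpQueue {
  private backlog: Op[] = [];
  private overloaded = false;

  constructor(
    // Called once per drained batch, no matter how many ops it contained.
    private readonly onBatchApplied: (count: number) => void,
    // Called when the backlog crosses the high-water mark in either direction.
    private readonly onPressureChange: (overloaded: boolean) => void,
    private readonly highWaterMark: number,
  ) {}

  enqueue(op: Op): void {
    this.backlog.push(op);
    this.updatePressure();
  }

  // Apply the entire backlog, then send a single update to the app.
  drain(): number {
    const batch = this.backlog;
    this.backlog = [];
    for (const op of batch) {
      op.apply(); // fixed per-op cost: this part does not shrink with batching
    }
    if (batch.length > 0) {
      this.onBatchApplied(batch.length); // app cost is amortized over the batch
    }
    this.updatePressure();
    return batch.length;
  }

  private updatePressure(): void {
    const overloadedNow = this.backlog.length >= this.highWaterMark;
    if (overloadedNow !== this.overloaded) {
      this.overloaded = overloadedNow;
      this.onPressureChange(overloadedNow);
    }
  }
}

// Usage: five remote ops arrive before the app gets a chance to drain.
let model = 0;
const notifications: number[] = [];
const pressure: boolean[] = [];
const queue = new BatchingOpQueue(
  (count) => notifications.push(count),
  (overloaded) => pressure.push(overloaded),
  3,
);
for (let seq = 1; seq <= 5; seq++) {
  queue.enqueue({ seq, apply: () => { model += 1; } });
}
queue.drain();
```

In this sketch the pressure callback is the only hook for backpressure. What to do with it (slow down local edits, show a message, leave the session) is exactly the open UX question raised above.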
This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework! |
This issue is opened to collect feedback.
Proposal:
Where it helps:
This would allow the runtime / application to throttle updates when the rate of changes exceeds the capability of the system. Some scenarios:
Cons:
One big con of that model is the inability to reason about the order of changes. This needs to be strongly considered.
At the same time, as far as I can tell, that kind of update model is the primary update model in other systems (Firebase being an example).
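As one possible reading of a revision-based, coarse update model, here is a small TypeScript sketch. It is an assumption about what such a model could look like, not a Fluid Framework API: `RevisionedMap`, `applyOp`, `publish`, and `onRevision` are hypothetical names. Ops are applied silently, and listeners hear only "the document is now at revision N, and these keys changed".

```typescript
// Hypothetical sketch of a coarse, revision-based update model; not a Fluid API.
type Listener = (revision: number, changedKeys: ReadonlySet<string>) => void;

class RevisionedMap {
  private readonly data = new Map<string, string>();
  private readonly dirty = new Set<string>();
  private readonly listeners: Listener[] = [];
  private revision = 0;

  onRevision(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Apply one op to the model. No event is raised here.
  applyOp(key: string, value: string): void {
    this.data.set(key, value);
    this.dirty.add(key);
  }

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  // Publish one new revision covering every op applied since the last publish.
  publish(): number {
    if (this.dirty.size === 0) {
      return this.revision; // nothing changed, so no new revision
    }
    this.revision += 1;
    const changed = new Set(this.dirty);
    this.dirty.clear();
    for (const listener of this.listeners) {
      listener(this.revision, changed);
    }
    return this.revision;
  }
}

// Usage: three ops collapse into a single revision event.
const doc = new RevisionedMap();
const seen: string[] = [];
doc.onRevision((revision, keys) => {
  seen.push(`r${revision}:${Array.from(keys).sort().join(",")}`);
});
doc.applyOp("title", "a");
doc.applyOp("body", "x");
doc.applyOp("title", "b");
doc.publish();
```

The listener learns that `title` and `body` changed in revision 1, but not that `title` was written twice or in which order the writes happened. That is the loss of ordering information described in the con above. The runtime is free to call `publish` as rarely as the app can keep up with.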