RetriableDocumentStorageService is not handling retries correctly #4861
We also need to ensure telemetry is in place (number of attempts, total duration, and, if possible, the last error we dealt with).
vladsud changed the title from "RetriableDocumentStorageService.write() is not handling retries correctly" to "RetriableDocumentStorageService is not handling retries correctly" on Jan 21, 2021
Moving the telemetry part to another issue: #4855
This was referenced Oct 17, 2022
This was referenced Oct 18, 2022
This was referenced Oct 20, 2022
This was referenced Oct 31, 2022
APIs known to be broken:
In particular, we do not handle 429s correctly, or any other retryable error.
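To make the 429 case concrete, here is a minimal sketch of a retry loop that honors a server-provided retry-after hint and otherwise backs off up to a cap. The error shape (`canRetry`, `retryAfterSeconds`), the helper names, and the 1 s base delay are assumptions for illustration, not the actual driver types; the 8 s cap only mirrors the retry interval mentioned in this issue.

```typescript
// Hypothetical error shape; the real driver error types may differ.
interface RetryableError {
    canRetry: boolean;
    retryAfterSeconds?: number; // e.g. taken from a 429 response
}

// Only errors explicitly marked as retryable are retried.
function isRetryable(error: unknown): error is RetryableError {
    return typeof error === "object" && error !== null
        && (error as Partial<RetryableError>).canRetry === true;
}

// Prefer the server's hint; otherwise use capped exponential backoff.
function getRetryDelayMs(error: RetryableError, attempt: number, baseMs = 1000, maxMs = 8000): number {
    if (error.retryAfterSeconds !== undefined) {
        return error.retryAfterSeconds * 1000;
    }
    return Math.min(baseMs * Math.pow(2, attempt - 1), maxMs);
}

// Retries until the call succeeds or fails with a non-retryable error.
// `sleep` is injectable so the loop can be exercised without real timers.
async function runWithRetry<T>(
    call: () => Promise<T>,
    sleep: (ms: number) => Promise<void> =
        (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
    for (let attempt = 1; ; attempt++) {
        try {
            return await call();
        } catch (error) {
            if (!isRetryable(error)) {
                throw error;
            }
            await sleep(getRetryDelayMs(error, attempt));
        }
    }
}
```

Every storage API would need to go through something like `runWithRetry`, not just `write()`.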
Can we move away from using DocumentStorageServiceProxy here? It hides details, and it's not obvious on initial review which APIs are wrapped and which are not.
As new APIs get added, developers will keep missing the need to update that class.
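One possible shape for this, as a sketch only: implement the storage interface directly instead of extending a proxy, so each method visibly goes through the retry helper and the compiler flags any newly added interface member that has not been wrapped. `IStorage` is a hypothetical, trimmed-down stand-in for the real storage interface, and `RetryRunner`, `retryUpTo`, and `ExplicitRetriableStorage` are illustrative names.

```typescript
// Hypothetical, trimmed-down interface; the real storage service has more members.
interface IStorage {
    readBlob(id: string): Promise<string>;
    writeBlob(content: string): Promise<string>;
}

type RetryRunner = <T>(call: () => Promise<T>) => Promise<T>;

// Deliberately simple retry helper for the sketch: retry any failure up to maxAttempts.
const retryUpTo = (maxAttempts: number): RetryRunner =>
    async <T>(call: () => Promise<T>): Promise<T> => {
        let lastError: unknown;
        for (let attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return await call();
            } catch (error) {
                lastError = error;
            }
        }
        throw lastError;
    };

// No base proxy class: adding a member to IStorage without wrapping it here
// becomes a compile error instead of a silently unwrapped pass-through.
class ExplicitRetriableStorage implements IStorage {
    constructor(
        private readonly inner: IStorage,
        private readonly retry: RetryRunner,
    ) {}

    public readBlob(id: string): Promise<string> {
        return this.retry(() => this.inner.readBlob(id));
    }

    public writeBlob(content: string): Promise<string> {
        return this.retry(() => this.inner.writeBlob(content));
    }
}
```

The cost is some repetition per method, but a reviewer can see at a glance what is retried.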
Note that while this is the right place to solve these issues, it creates another problem: too much noise in telemetry when we are offline. The actual driver calls generate events, and if we make those calls every 8 seconds, they will generate a ton of data over a couple of hours of being offline (all of that data is cached and flushed to Aria on connection).
We likely need to create a separate issue to think about a solution to this problem.
For example, op fetching does not log anything in the driver; DeltaManager is responsible for all logging. Maybe we need to adopt a similar approach and say that RetriableDocumentStorageService does all the logging?
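As a rough sketch of what "the retry layer does all the logging" could look like: emit one summary event per logical operation (attempts, total duration, last error) instead of one event per attempt. The event shape, option names, and `retryWithSummary` are assumptions for illustration and are not existing APIs; the driver would also have to stop logging per call for this to reduce noise.

```typescript
// Hypothetical summary event; field names are illustrative.
interface RetrySummaryEvent {
    eventName: string;
    attempts: number;
    durationMs: number;
    succeeded: boolean;
    lastError: string | undefined;
}

interface RetrySummaryOptions {
    maxAttempts: number;
    delayMs: number;
    now?: () => number;                    // injectable clock
    sleep?: (ms: number) => Promise<void>; // injectable delay
}

// Retries `call` and logs exactly one event when the operation finally
// succeeds or gives up, rather than one event per attempt.
async function retryWithSummary<T>(
    eventName: string,
    call: () => Promise<T>,
    log: (event: RetrySummaryEvent) => void,
    options: RetrySummaryOptions,
): Promise<T> {
    const now = options.now ?? (() => Date.now());
    const sleep = options.sleep
        ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
    const start = now();
    let lastError: string | undefined;
    for (let attempt = 1; ; attempt++) {
        try {
            const result = await call();
            log({ eventName, attempts: attempt, durationMs: now() - start, succeeded: true, lastError });
            return result;
        } catch (error) {
            lastError = error instanceof Error ? error.message : String(error);
            if (attempt >= options.maxAttempts) {
                log({ eventName, attempts: attempt, durationMs: now() - start, succeeded: false, lastError });
                throw error;
            }
            await sleep(options.delayMs);
        }
    }
}
```

With this shape, a couple of hours offline produces one event per pending operation rather than one every few seconds.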