CosmosException message sometimes contains full stack traces #1507
Comments
By any chance, do you know what operation is causing it? @ealsur is this possibly the same bug you already fixed with the Task.Yield change in the Direct package?
Task.Yield would not affect the actual stack trace produced (just how it is maintained in the stack/heap), and 429s are exception-less. @majastrz Are you changing the retry configuration? Our retries are based on async/await, so if a 429 happens, we retry, but 429s don't produce an exception in themselves. The Retry handlers inspect the Response and if it's a 429, they retry it. CreateItemAsync So the exception might contain the complete stack, since the CreateItemAsync (or whatever operation) was initiated. The thing here is that there could be other Retries because of other reasons (connection blip, partition split, etc.), so truncating the stack is really tricky (what do you truncate and why?).
@ealsur No, we are using the default retry configuration currently. We didn't add any custom handlers to the client's pipeline, either. Is it possible/feasible for you to use InnerException or create an InnerExceptions property (like on AggregateException) to store all of the attempts instead of dumping it into the message? And would disabling the default retry policy be a viable workaround for this behavior?
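To illustrate the suggestion above, here is a minimal sketch of an exception that keeps each failed attempt in an `InnerExceptions` collection, mirroring the shape of `AggregateException`, while leaving `Message` short. The type name `RetriesExhaustedException` and its constructor are hypothetical and are not part of the Cosmos DB SDK; this only shows the shape being asked for, not how the SDK is or would be implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical type for illustration only; NOT part of the Cosmos DB SDK.
public class RetriesExhaustedException : Exception
{
    // One entry per failed attempt, in the order the attempts happened.
    public ReadOnlyCollection<Exception> InnerExceptions { get; }

    // Assumes 'attempts' is non-null. The last attempt doubles as InnerException
    // so that existing code walking InnerException still sees something useful.
    public RetriesExhaustedException(string message, IList<Exception> attempts)
        : base(message, attempts.Count > 0 ? attempts[attempts.Count - 1] : null)
    {
        // Copy the list so later changes by the caller do not affect this exception.
        InnerExceptions = new ReadOnlyCollection<Exception>(new List<Exception>(attempts));
    }
}
```

With this shape, a caller that logs only `Message` gets a short string, and a caller that wants per-attempt detail can enumerate `InnerExceptions` explicitly.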
I still have one of the captured exception messages. Here's the start: Response status code does not indicate success: 429 Substatus: 3200 Reason: (Microsoft.Azure.Cosmos.Query.Core.Monads.ExceptionWithStackTraceException: TryCatch resulted in an exception. ---> Microsoft.Azure.Cosmos.Query.Core.Monads.ExceptionWithStackTraceException: TryCatch resulted in an exception. ---> Microsoft.Azure.Cosmos.Query.Core.Monads.ExceptionWithStackTraceException: TryCatch resulted in an exception. ---> Microsoft.Azure.Cosmos.Query.Core.Monads.ExceptionWithStackTraceException: TryCatch resulted in an exception. ---> Microsoft.Azure.Cosmos.CosmosException : Exception of type 'Microsoft.Azure.Cosmos.CosmosException' was thrown.\r\nStatusCode = 429;\r\nSubStatusCode = 3200;\r\nActivityId = dd8b6358-5ee5-4ba6-990f-4ec4a25f6445;\r\nRequestCharge = 30.01;\r\n\r\n --- End of inner exception stack trace ---\r\n at Microsoft.Azure.Cosmos.Query.Core
That's a throttle too; it seems to come from a query.
This might be a result of the query exception handling, where it is creating a stack trace. @ealsur this was changed in version 3.7.1 with #1298. I believe previous versions do contain the stack trace.
Describe the bug
We have noticed that in throttling conditions when the built-in retries are exhausted, the Cosmos DB SDK sometimes throws a CosmosException with the message containing full stack traces.
To Reproduce
I don't have a minimal repro here and I don't fully understand what causes the SDK to return the stack traces, but it doesn't always happen.
The issue actually caused an outage in our service. We have several background jobs that append their status (including exception messages) to a document in our collection. In throttling conditions, we would append the "final" CosmosException message to that document. Unfortunately, we saw that this exception message would sometimes exceed 100KB. Combined with further retries and failures, it would grow the document to over 600KB, making it more and more expensive to upsert and prolonging the outage.
Expected behavior
Exception messages should not exceed 100KB and should not contain full stack traces. (We truncate the exception messages before saving them, but we should not have had to do that.)
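For readers hitting the same problem, the truncation workaround mentioned above could look roughly like the sketch below. The helper name `ExceptionMessageLimiter`, the marker text, and the character limit are all assumptions for illustration; the reporter's actual implementation is not shown in this issue.

```csharp
using System;

// Hypothetical helper for illustration; not the reporter's actual code.
public static class ExceptionMessageLimiter
{
    private const string Marker = "... [truncated]";

    // Caps 'message' at 'maxChars' UTF-16 characters, appending a marker
    // when content was cut so readers know the text is incomplete.
    public static string Truncate(string message, int maxChars)
    {
        if (maxChars < 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
        if (message == null) return string.Empty;
        if (message.Length <= maxChars) return message;

        // Budget too small to fit the marker: return a plain prefix.
        if (maxChars <= Marker.Length) return message.Substring(0, maxChars);

        int keep = maxChars - Marker.Length;
        // Do not cut between the two halves of a surrogate pair.
        if (char.IsHighSurrogate(message[keep - 1])) keep--;
        return message.Substring(0, keep) + Marker;
    }
}
```

A caller would apply it before persisting the status, for example `string safe = ExceptionMessageLimiter.Truncate(ex.Message, 2048);`, where 2048 is an arbitrary limit chosen for this example. Note that the limit is in characters, not bytes, so the stored size in UTF-8 can be larger for non-ASCII text.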
Actual behavior
In some cases, the SDK throws a CosmosException with messages over 100KB containing full stack traces.
Environment summary
SDK Version: 3.5.1
OS Version: Windows
Additional context
N/A