Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CosmosOperationCanceledException Serializable #3402

Closed
brianwp3000 opened this issue Aug 22, 2022 · 4 comments · Fixed by #3433
Closed

Make CosmosOperationCanceledException Serializable #3402

brianwp3000 opened this issue Aug 22, 2022 · 4 comments · Fixed by #3433
Assignees
Labels
feature-request New feature or request

Comments

@brianwp3000
Copy link

brianwp3000 commented Aug 22, 2022

My team just upgraded from 3.20.1 to 3.29.0 and we are now running into problems because CosmosOperationCanceledException no longer works with the Microsoft Orleans binary serializer.

Azure-sdk-for-net ran into this same issue a few years ago, and solved it by implementing Serializable on their Exception classes.

It's a best practice to always make Exception classes Serializable. Without this, my team has to work around the issue by catching all CosmosExceptions and wrapping them in another Exception that implements Serializable.

@ealsur
Copy link
Member

ealsur commented Aug 23, 2022

@brianwp3000 CosmosOperationCanceledException inherits from OperationCanceledException (not from CosmosException). It was added so OperationCanceledExceptions thrown by cancellation tokens or the HttpClient included the Diagnostics.

Before, the SDK would throw OperationCanceledException.

Were you able to serialize OperationCanceledException before?

@brianwp3000
Copy link
Author

Ok did some more investigation on my end:

We are running a service on Microsoft Orleans, which runs code on distributed processes called "grains". Anything passed between these grains has to be serialized, including Exceptions.

Back when we were running CosmosSDK 3.20.1 we were still running into CosmosOperationCanceledException that Orleans couldn't serialize. But, Orleans would throw a NullReferenceException during serialization and we caught that exception gracefully:

System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Azure.Cosmos.CosmosOperationCanceledException.get_Message()
   at Orleans.Serialization.ILBasedExceptionSerializer.Serialize(Object item, ISerializationContext outerContext, Type expectedType) in /_/src/Orleans.Core/Serialization/ILBasedExceptionSerializer.cs:line 133
   at Orleans.Serialization.SerializationManager.FallbackSerializer(Object raw, ISerializationContext context, Type t) in /_/src/Orleans.Core/Serialization/SerializationManager.cs:line 1661
   at Orleans.Serialization.SerializationManager.SerializeInner[TObject,TContext,TWriter](SerializationManager sm, TObject obj, Type expected, TContext context, TWriter writer) in /_/src/Orleans.Core/Serialization/SerializationManager.cs:line 962
   at Orleans.Serialization.BuiltInTypes.SerializeOrleansResponse(Object obj, ISerializationContext context, Type expected) in /_/src/Orleans.Core/Serialization/BuiltInTypes.cs:line 2131
   at Orleans.Serialization.SerializationManager.SerializeInner[TObject,TContext,TWriter](SerializationManager sm, TObject obj, Type expected, TContext context, TWriter writer) in /_/src/Orleans.Core/Serialization/SerializationManager.cs:line 962
   at Orleans.Runtime.Messaging.MessageSerializer.OrleansSerializer`1.Serialize[TBufferWriter](TBufferWriter output, T value) in /_/src/Orleans.Core/Messaging/MessageSerializer.cs:line 180
   at Orleans.Runtime.Messaging.MessageSerializer.Write[TBufferWriter](TBufferWriter& writer, Message message) in /_/src/Orleans.Core/Messaging/MessageSerializer.cs:line 112
   at Orleans.Runtime.Messaging.Connection.ProcessOutgoing() in /_/src/Orleans.Core/Networking/Connection.cs:line 423   at Orleans.Runtime.OutgoingCallInvoker.Invoke() in /_/src/Orleans.Core/Runtime/OutgoingCallInvoker.cs:line 113
   at Orleans.Runtime.OutgoingCallInvoker.Invoke() in /_/src/Orleans.Core/Runtime/OutgoingCallInvoker.cs:line 113
   at Orleans.Runtime.GrainReferenceRuntime.InvokeWithFilters(GrainReference reference, InvokeMethodRequest request, InvokeMethodOptions options) in /_/src/Orleans.Core/Runtime/GrainReferenceRuntime.cs:line 116
   at ControlPlane.Grains.HelperClasses.VmHeartbeatEventSender.SendDirectMessageToAssignedPool(AssignedPoolGrainKey assignedPoolGrainKey, AssignedVm assignedVm) in D:\a\_work\1\s\src\ControlPlane\ControlPlane.Grains\HelperClasses\VmHeartbeatEventSender.cs:line 103
   at Microsoft.Azure.Gaming.Common.Core.Extensions.TaskExtensions.FireAndForget(Task task, Action`1 exceptionLoggerAction) in D:\a\_work\1\s\src\Common\Core\Core\Extensions\TaskExtensions.cs:line 19,

Now that we've upgraded to CosmosSDK 3.29.0, Orleans is presumably still failing to serialize correctly but it doesn't throw an exception. Instead, we run into the NullReferenceException at runtime when we call ToString() on the CosmosOperationCanceledException:

System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Azure.Cosmos.CosmosOperationCanceledException.ToString()
   at ControlPlane.Grains.Logging.VmGrainEvents_AutoClass_4268506fd7bc44fe935d39cf5598a9b4.ExceptionOnWriteArchivedSessionDocument(String vmId, String sessionHostId, Exception exception)
   at ControlPlane.Grains.VmGrain.<>c__DisplayClass45_2.<CleanupOldSessionHosts>b__3(Exception e) in D:\a\_work\2\s\src\ControlPlane\ControlPlane.Grains\VmGrain.cs:line 780
   at ControlPlane.Common.ControlPlaneTaskExtensions.FireAndForget(Task task, Action`1 exceptionLoggerAction) in D:\a\_work\2\s\src\ControlPlane\ControlPlane.Common\ControlPlaneTaskExtensions.cs:line 31,

We had to add another try-catch to guard against these new exceptions. Before we added the new try-catch, we got some crash dumps that point to line 103 in CosmosOperationCanceledException.cs as the line that threw the NullReferenceException.

So admittedly Orleans serialization is introducing more complexity here than there needs to be, but all this would be solved if CosmosOperationCanceledException was serializable.

@ealsur
Copy link
Member

ealsur commented Aug 24, 2022

Got it. We can track making CosmosOperationCanceled Serializable if possible.

@ealsur ealsur added feature-request New feature or request and removed needs-more-information labels Aug 24, 2022
@ealsur ealsur changed the title Make CosmosException and its subclasses Serializable Make CosmosOperationCanceledException Serializable Aug 24, 2022
@ealsur
Copy link
Member

ealsur commented Aug 25, 2022

@NaluTripician please see the above comment

@kirankumarkolli kirankumarkolli moved this from Triage to Approved in Azure Cosmos SDKs Aug 25, 2022
Repository owner moved this from Approved to Done in Azure Cosmos SDKs Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request New feature or request
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants