Actor deactivate errors causing app to crash #627

MrMint · 2024-08-26T17:28:51Z

Expected Behavior

A deactivate call for an actor that is not currently activated should not cause the app to crash. I would also expect any error thrown inside the onDeactivateInternal call to not cause the app to crash.

js-sdk/src/actors/runtime/ActorManager.ts

Line 85 in 5cefcf1

await actor.onDeactivateInternal();

Actual Behavior

We are seeing a situation where our actor services go into a crash backoff loop which is fueled by errors during deactivate. It would appear that daprd attempts to call deactivate for an actor that does not exist in the service, which results in the service crashing due to this being thrown:

js-sdk/src/actors/runtime/ActorManager.ts

Lines 76 to 81 in 5cefcf1

    
    throw new Error(
      JSON.stringify({
        error: "ACTOR_NOT_ACTIVATED",
        errorMsg: `The actor ${actorId.getId()} was not activated`,
      }),
    );

Once it has crashed, k8s will go into a back-off restart on the container, which is basically endless as any further deactivate calls will also fail given the app has crashed/restarted and lost all the actor references.
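To make the expected behavior concrete, here is a minimal sketch of a deactivate path that tolerates both failure modes described above. It is not the SDK's ActorManager and not necessarily what the linked PR does; `TolerantActorRegistry`, `DeactivatableActor`, and `DeactivateResult` are hypothetical names used only for illustration. The only identifier taken from the SDK is `onDeactivateInternal`.

```typescript
// Hypothetical stand-in for an actor instance; not the SDK's real type.
interface DeactivatableActor {
  onDeactivateInternal(): Promise<void>;
}

// Hypothetical result type so callers can decide how to respond to the sidecar.
type DeactivateResult = "deactivated" | "not_activated" | "failed";

class TolerantActorRegistry {
  private readonly actors = new Map<string, DeactivatableActor>();

  register(actorId: string, actor: DeactivatableActor): void {
    this.actors.set(actorId, actor);
  }

  has(actorId: string): boolean {
    return this.actors.has(actorId);
  }

  async deactivate(actorId: string): Promise<DeactivateResult> {
    const actor = this.actors.get(actorId);
    if (!actor) {
      // Nothing to clean up (for example after an app restart), so report it
      // instead of throwing.
      console.warn(`Actor ${actorId} was not activated; skipping deactivate`);
      return "not_activated";
    }

    try {
      await actor.onDeactivateInternal();
      return "deactivated";
    } catch (err) {
      // A failing deactivate hook is logged rather than allowed to propagate.
      console.error(`Deactivate hook failed for actor ${actorId}:`, err);
      return "failed";
    } finally {
      // Drop the reference either way so a retry does not hit a stale entry.
      this.actors.delete(actorId);
    }
  }
}
```

Whether a failed hook should still remove the actor reference is a design choice; this sketch removes it so that repeated deactivate calls become no-ops.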

Steps to Reproduce the Problem

Send a DELETE request for an actor with an ID that does not exist. This will crash the app.
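A rough sketch of that request, assuming the app listens on 127.0.0.1:3000 and uses the `/actors/<actorType>/<actorId>` path seen in the daprd logs below. `buildDeactivateUrl` and `sendDeactivate` are hypothetical helper names, the actor type and ID in the usage comment are placeholders, and `fetch` is assumed to be available globally (Node 18 or later).

```typescript
// Hypothetical helper: builds the app-side deactivate URL for an actor.
function buildDeactivateUrl(appBaseUrl: string, actorType: string, actorId: string): string {
  const base = appBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/actors/${encodeURIComponent(actorType)}/${encodeURIComponent(actorId)}`;
}

// Hypothetical helper: sends the DELETE and returns the HTTP status code.
// If the app crashes while handling the request, this promise rejects instead.
async function sendDeactivate(url: string): Promise<number> {
  const response = await fetch(url, { method: "DELETE" });
  return response.status;
}

// Example usage (placeholder actor type and ID):
// sendDeactivate(buildDeactivateUrl("http://127.0.0.1:3000", "MyActorType", "never-activated-id"))
//   .then((status) => console.log("status", status))
//   .catch((err) => console.error("request failed", err));
```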

daprd logs showing the actual behavior described above from its point of view:

{"app_id":"actors","level":"debug","msg":"Deactivated actor 'redact||redact'","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"debug","msg":"Deactivated actor 'redact||redact'","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"error","msg":"Failed to deactivate actor redact||redact: Delete \"http://127.0.0.1:3000/actors/redact/redact\": EOF","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"error","msg":"Error performing request: Get \"http://127.0.0.1:3000/healthz\": dial tcp 127.0.0.1:3000: connect: connection refused","scope":"actorshealth","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"error","msg":"Failed to deactivate actor redact||redact: Delete \"http://127.0.0.1:3000/actors/redact/redact\": dial tcp 127.0.0.1:3000: connect: connection refused","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"error","msg":"Failed to deactivate actor redact||redact: Delete \"http://127.0.0.1:3000/actors/redact/redact\": dial tcp 127.0.0.1:3000: connect: connection refused","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}
{"app_id":"actors","level":"error","msg":"Failed to deactivate actor redact||redact: Delete \"http://127.0.0.1:3000/actors/redact/redact\": dial tcp 127.0.0.1:3000: connect: connection refused","scope":"dapr.runtime.actor","type":"log","ver":"1.14.1"}

MrMint linked a pull request Aug 26, 2024 that will close this issue

feat: Adds error handling around actor deactivation #628

Draft

3 tasks
