Issue with Sqlite backend initialization in Dapr Workflow #49

Open
ASHIQUEMD opened this issue Dec 15, 2023 · 5 comments

@ASHIQUEMD

ASHIQUEMD commented Dec 15, 2023

Issue Description

When Sqlite in-memory is used as the backend for Dapr Workflow, starting a workflow immediately after the application starts fails with the following error:

Unhandled exception. Dapr.DaprException: Start Workflow operation failed: the Dapr endpoint indicated a failure. 
See InnerException for details.
 ---> Grpc.Core.RpcException: Status(StatusCode="Internal", Detail="error starting workflow 'OrderProcessingWorkflow': 
unable to start workflow: failed to start orchestration: backend not initialized")

Investigation showed that the error comes from a specific line in the Durable Task for Go library: initializing the Sqlite database takes time, so when the application starts executing the workflow, Sqlite is not yet ready. Given some time (e.g., 10-15 seconds), the application is able to run the workflow successfully with the Sqlite backend.

Proposed Solution

To address this issue, I suggest implementing a mechanism similar to the one found in the Actor backend: expose a method in the Backend interface to check whether the backend is ready. The proposed addition to the Backend interface is as follows:

type Backend interface {
    ...
    ...
    WaitForBackendReady(context.Context)
}

This method will allow applications to wait until the Sqlite backend is fully initialized before attempting to execute workflows, preventing the encountered error.

Steps to Reproduce

  1. Set up Dapr Workflow with Sqlite in-memory as the backend.
  2. Initiate a workflow immediately after starting the application.
  3. Observe the error mentioned above.

@ASHIQUEMD changed the title from "Issue with Sqlite in-memory backend initialization in Dapr Workflow" to "Issue with Sqlite backend initialization in Dapr Workflow" on Dec 15, 2023

@cgillum
Member

cgillum commented Dec 15, 2023

Hi @ASHIQUEMD, I feel that this may be an issue with Dapr rather than an issue with this library. From the description it sounds like there’s a race condition where task hub creation and some other backend methods are being called at the same time. If that’s the case, can the Dapr implementation simply synchronize these calls to mitigate the race condition?

@ASHIQUEMD
Author

ASHIQUEMD commented Dec 26, 2023

Hi @cgillum, I debugged this issue and found that when we start workflow execution, we first call the StartWorkflowBeta1 API, which expects the backend to be ready. In parallel, there is a call to GetWorkItems, which is what actually initializes sqlite through CreateTaskHub. Sqlite takes time to initialize while it creates its schemas, so the StartWorkflowBeta1 call runs before sqlite has finished initializing.

To fix this, I created a channel in the workflow engine that is closed once task hub creation is complete, and modified StartWorkflowBeta1 to call WorkflowEngine.WaitForWorkflowEngineReady(ctx) instead of WaitForActorsReady(ctx). This avoids the race condition for any backend and lets task hub creation complete before execution continues.

@ItalyPaleAle
Member

I spoke with @ASHIQUEMD earlier and I understand the problem.

My advice here would be to create for each component an Init method which is invoked synchronously right after the backend is created. Things like database migrations should be performed synchronously before the backend can be used, and not lazily in the Start method which can be invoked multiple times in parallel. This way:

  1. We don't need to be concerned with synchronization mechanisms
  2. We also have a way to catch initialization errors quicker and report them as "fatal" errors (letting Dapr decide what to do)

@ASHIQUEMD
Author

@ItalyPaleAle We probably don't need to create an Init() method. CreateTaskHub(context.Context) already does the sqlite db creation and is effectively the init for the backend, so we can call backend.CreateTaskHub(ctx) from Dapr after we create the SqliteBackend instance.

@ItalyPaleAle
Member

@ItalyPaleAle We probably don't need to create an Init() method. CreateTaskHub(context.Context) already does the sqlite db creation and is effectively the init for the backend, so we can call backend.CreateTaskHub(ctx) from Dapr after we create the SqliteBackend instance.

That seems to be a good solution. The current CreateTaskHub method seems to be an Init-like thing.
