Support heartbeats from app code for work-item renewal #34

Open
cgillum opened this issue Oct 10, 2023 · 0 comments


cgillum commented Oct 10, 2023

Problem

Each of the supported backends currently has a lock timeout, which is used to detect when a remote app worker may have crashed or otherwise become unresponsive. However, this simple timeout mechanism can't distinguish between an app that has gone away and a task that is simply taking a long time to complete.

For example, if the lock expiration timeout is 1 minute, but a particular activity task takes 5 minutes to complete, then the lock on that work-item will expire before the activity completes and the activity may be rescheduled unnecessarily.
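To make the timing concrete, here is a minimal sketch of a fixed lock timeout. The `WorkItemLock` class and `would_be_rescheduled` helper are hypothetical names invented for illustration; they are not part of any backend in this project.

```python
from dataclasses import dataclass


@dataclass
class WorkItemLock:
    """Hypothetical lock record; the field names are illustrative only."""
    acquired_at: float  # seconds, on an arbitrary clock
    timeout: float      # lock expiration timeout in seconds

    def is_expired(self, now: float) -> bool:
        # With no renewal, the lock expires a fixed time after acquisition.
        return now >= self.acquired_at + self.timeout


def would_be_rescheduled(lock_timeout: float, task_duration: float) -> bool:
    """Return True if the lock expires before (or exactly when) the task finishes."""
    lock = WorkItemLock(acquired_at=0.0, timeout=lock_timeout)
    return lock.is_expired(now=task_duration)


# The scenario from the text: 1-minute timeout, 5-minute activity.
print(would_be_rescheduled(lock_timeout=60, task_duration=300))
# A short activity finishes while the lock is still held.
print(would_be_rescheduled(lock_timeout=60, task_duration=30))
```

The backend only sees that the expiry time has passed, so a slow but healthy worker looks the same as a crashed one.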

Proposal - heartbeats

To solve this problem, we propose adding a "heartbeat" callback that activity implementations can use to signal that they're still actively processing a particular work-item. This would be a gRPC API that SDKs can call periodically to renew the lock expiration time for an activity work-item.
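The renewal behavior could look roughly like the sketch below. It simulates time rather than making real gRPC calls, and `RenewableLock`, `renew`, and `run_with_heartbeats` are assumed names for illustration only; the actual API shape is not defined in this issue.

```python
class RenewableLock:
    """Hypothetical work-item lock whose expiry can be pushed forward."""

    def __init__(self, timeout: float, now: float = 0.0):
        self.timeout = timeout
        self.expires_at = now + timeout

    def renew(self, now: float) -> bool:
        """Extend the lock by one timeout period from `now`.

        Returns False if the lock had already expired, in which case the
        backend may have handed the work-item to another worker.
        """
        if now >= self.expires_at:
            return False
        self.expires_at = now + self.timeout
        return True


def run_with_heartbeats(lock: RenewableLock, task_duration: float,
                        heartbeat_interval: float) -> bool:
    """Simulate an activity that heartbeats at a fixed interval.

    Returns True if the lock is still held when the activity completes.
    """
    now = 0.0
    while now + heartbeat_interval < task_duration:
        now += heartbeat_interval
        if not lock.renew(now):
            return False  # lock was lost before this heartbeat arrived
    return task_duration < lock.expires_at


# 1-minute timeout, 5-minute activity, heartbeat every 30 seconds.
print(run_with_heartbeats(RenewableLock(timeout=60), 300, 30))
```

In this sketch the heartbeat interval has to be shorter than the lock timeout; a 90-second interval against a 60-second timeout would lose the lock before the first heartbeat.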

As a secondary feature, the heartbeat could be used to get the status of the parent orchestration. If the parent orchestration has been terminated, the activity could then choose to cooperatively terminate itself (details TBD on how this would work for each language SDK).
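One possible shape for the cooperative part, sketched under the assumption that each heartbeat call returns the parent orchestration's status. `ParentStatus`, `process_items`, and the `heartbeat` callable are hypothetical; how this surfaces in each language SDK is still TBD.

```python
from enum import Enum
from typing import Callable, Iterable, List


class ParentStatus(Enum):
    """Hypothetical status values returned by a heartbeat call."""
    RUNNING = "running"
    TERMINATED = "terminated"


def process_items(items: Iterable[int],
                  heartbeat: Callable[[], ParentStatus]) -> List[int]:
    """Process items one at a time, heartbeating before each unit of work.

    Stops early if the heartbeat reports that the parent was terminated.
    """
    done: List[int] = []
    for item in items:
        if heartbeat() is ParentStatus.TERMINATED:
            break  # cooperative termination: stop doing further work
        done.append(item * 2)  # stand-in for the real work
    return done


# Simulated heartbeat: the parent is terminated before the third item.
statuses = iter([ParentStatus.RUNNING, ParentStatus.RUNNING,
                 ParentStatus.TERMINATED])
print(process_items([1, 2, 3, 4], heartbeat=lambda: next(statuses)))
```

The activity decides where it is safe to stop, so termination remains a request rather than a forced abort.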
