add non-retryable errors, and shutdown helpers #33
base: main
f6f6a40 to de68786
        self._stub = stubs.TaskHubSidecarServiceStub(channel)
        self._logger = shared.get_logger("client", log_handler, log_formatter)

    def __enter__(self):
Adds a context manager option so the client can be closed cleanly.
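A minimal sketch of what the context manager option could look like. ClosableClient and FakeChannel are hypothetical stand-ins, not the classes in this PR; the only assumption is that the client owns a channel object with a close() method.

```python
class FakeChannel:
    """Hypothetical stand-in for a gRPC channel; only counts close() calls."""

    def __init__(self) -> None:
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1


class ClosableClient:
    """Hypothetical client showing only the open/close lifecycle."""

    def __init__(self, channel) -> None:
        self._channel = channel
        self.closed = False

    def close(self) -> None:
        # Idempotent: a second close() does nothing.
        if not self.closed:
            self._channel.close()
            self.closed = True

    def __enter__(self) -> "ClosableClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # do not swallow exceptions raised inside the with-body


channel = FakeChannel()
with ClosableClient(channel) as client:
    pass  # schedule or query orchestrations here
```

Returning False from __exit__ keeps errors from the with-body visible to the caller while still releasing the channel.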
        # gRPC timeout mapping (pytest unit tests may pass None explicitly)
        grpc_timeout = None if (timeout is None or timeout == 0) else timeout

        # If timeout is None or 0, skip pre-checks/polling and call server-side wait directly
This reduces resource consumption on the server side, which might also lag behind the client side.
    pass


class NonRetryableError(Exception):
This is a new helper, present in Temporal but not in ours, that lets us define non-retryable errors so that activities are not retried when one is raised.
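A rough sketch of the intended behavior. NonRetryableError matches the class name in the diff; run_with_retries is a hypothetical retry loop written only for illustration and is not how the SDK schedules activity retries.

```python
class NonRetryableError(Exception):
    """Raised by an activity to signal that retrying cannot help."""


def run_with_retries(fn, max_attempts: int):
    """Hypothetical loop: call fn until it succeeds or attempts run out.

    Returns (result, attempts_used). A NonRetryableError stops the loop
    immediately instead of consuming the remaining attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except NonRetryableError:
            raise  # never retried
        except Exception:
            if attempts >= max_attempts:
                raise


def validate_order(order_id: str) -> str:
    # Hypothetical activity: bad input will not get better on a retry.
    if not order_id:
        raise NonRetryableError("order_id must not be empty")
    return order_id.upper()
```

With this shape, validate_order("") fails on the first attempt even when the policy would allow several.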
                     next_delay_f, self._retry_policy.max_retry_interval.total_seconds()
                 )
-                return timedelta(seconds=next_delay_f)
+            return timedelta(seconds=next_delay_f)
This fixes a bug with retry. The logic on line 400 above (if datetime.utcnow() < retry_expiration:) means that we should retry, but because the return was badly indented, this was not working if for some reason max_retry_interval is not None.
This is also more or less mentioned in one of the gotchas in dapr/python-sdk#836. I found this bug beforehand; the other gotchas are genuine gotchas or unexplained behavior.
I added some info to the README to cover the gotchas, but we might also need to add it to python-sdk.
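For reference, a standalone sketch of the delay computation with the return placed at the level of the expiration check, so a delay comes back whether or not a cap is configured. The free function and its parameter names are assumptions made for illustration; the real code is a method that reads these values from self._retry_policy.

```python
import math
from datetime import datetime, timedelta
from typing import Optional


def compute_next_delay(
    attempt: int,
    first_retry_interval: timedelta,
    backoff_coefficient: float,
    max_retry_interval: Optional[timedelta],
    retry_expiration: datetime,
    now: datetime,
) -> Optional[timedelta]:
    """Return the delay before the next retry, or None when retries expired."""
    if now < retry_expiration:
        next_delay_f = (
            math.pow(backoff_coefficient, attempt - 1)
            * first_retry_interval.total_seconds()
        )
        if max_retry_interval is not None:
            # The cap only adjusts the value; it must not decide
            # whether a delay is returned at all.
            next_delay_f = min(next_delay_f, max_retry_interval.total_seconds())
        return timedelta(seconds=next_delay_f)
    return None
```

The design point is that the cap is optional tuning, while the expiration check alone decides between "retry" and "give up".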
@@ -0,0 +1,16 @@
apiVersion: dapr.io/v1alpha1
Needed for e2e tests with Dapr, which should replace the durabletask-go tests with a Dapr setup.
df310f8 to d9ed06e
@acroca ptal
Signed-off-by: Filinto Duran <1373693+filintod@users.noreply.github.com>
d9ed06e to 7321905
Signed-off-by: Filinto Duran <1373693+filintod@users.noreply.github.com>
     if log_handler is None:
         log_handler = logging.StreamHandler()
-        log_handler.setLevel(logging.INFO)
+        log_handler.setLevel(logging.DEBUG)
Why? Won't it get very noisy?
yes, a debugging leftover
 commands =
     !e2e: pytest -m "not e2e" --verbose
-    e2e: pytest -m e2e --verbose
+    e2e: pytest -m e2e --verbose
-    e2e: pytest -m e2e --verbose
+    e2e: pytest -m e2e --verbose
    res: pb.GetInstanceResponse = self._stub.WaitForInstanceCompletion(
        req, timeout=grpc_timeout
    )
I don't understand this. grpc_timeout is set to None for both 0 and None, but if I understand correctly, a timeout of None waits forever while a timeout of 0 doesn't wait at all, right?
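To make the question concrete, here is a small sketch. to_grpc_timeout copies the expression from the diff; wait_mode is a hypothetical helper, not part of the PR, that shows one way to keep the two cases distinct. For context, a grpc Python stub call made with timeout=None has no deadline.

```python
from typing import Optional


def to_grpc_timeout(timeout: Optional[float]) -> Optional[float]:
    """Mapping as written in the diff: None and 0 both become None."""
    return None if (timeout is None or timeout == 0) else timeout


def wait_mode(timeout: Optional[float]) -> str:
    """Hypothetical helper that names the three cases explicitly."""
    if timeout is None:
        return "wait-forever"
    if timeout == 0:
        return "no-wait"
    return "bounded"


# The diff's mapping cannot tell the first two inputs apart.
collapsed = [to_grpc_timeout(t) for t in (None, 0, 5)]
# The hypothetical helper keeps all three apart.
modes = [wait_mode(t) for t in (None, 0, 5)]
```

If 0 is meant as "check once and return", the caller would need a separate branch for it rather than passing None to the stub.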
        return current_state

    # Poll for completion with exponential backoff to handle eventual consistency
    import time
Move the import to the top of the file 🙏
    if current_state and current_state.runtime_status in [
        OrchestrationStatus.COMPLETED,
        OrchestrationStatus.FAILED,
        OrchestrationStatus.TERMINATED,
    ]:
According to https://github.com/dapr/durabletask-go/blob/7f28b2408db77ed48b1b03ecc71624fc456ccca3/api/orchestration.go#L196-L201, CANCELLED is also a terminal state for a workflow.
But what's the reason for this check? Why not just call WaitForInstanceCompletion? You are still sending a call to the runtime to get the current state.
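If the check stays, one way to keep the terminal states in a single place is sketched below. The Status enum is a local stand-in defined only for this example; whether the Python OrchestrationStatus enum exposes a member named CANCELLED is an assumption that should be checked against the SDK.

```python
from enum import Enum


class Status(Enum):
    """Local stand-in for OrchestrationStatus (member names are assumed)."""

    RUNNING = 0
    COMPLETED = 1
    FAILED = 2
    TERMINATED = 3
    CANCELLED = 4


# Single source of truth, including the state mentioned in the review.
TERMINAL_STATUSES = frozenset(
    {Status.COMPLETED, Status.FAILED, Status.TERMINATED, Status.CANCELLED}
)


def is_terminal(status: Status) -> bool:
    """True when the orchestration can no longer make progress."""
    return status in TERMINAL_STATUSES
```

A frozenset gives constant-time membership checks and avoids repeating the list at every call site.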
    self._retry_timeout = retry_timeout
    # Normalize non-retryable error type names to a set of strings
    names: Optional[set[str]] = None
    if non_retryable_error_types:
-    if non_retryable_error_types:
+    if non_retryable_error_types is not None:
    if isinstance(t, str):
        if t:
Can't we check it all at once?
-    if isinstance(t, str):
-        if t:
+    if isinstance(t, str) and len(t) > 0:
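Putting both suggestions together, the normalization could look roughly like this. normalize_error_types is a hypothetical free function written for illustration, and accepting exception classes alongside strings is an assumption about the constructor's input.

```python
from typing import Iterable, Optional, Set, Union


def normalize_error_types(
    types: Optional[Iterable[Union[str, type]]],
) -> Optional[Set[str]]:
    """Normalize error types to a set of names; None means 'not configured'."""
    if types is None:
        return None
    names: Set[str] = set()
    for t in types:
        if isinstance(t, str) and len(t) > 0:
            names.add(t)
        elif isinstance(t, type):
            # Assumed: classes are matched by their unqualified name.
            names.add(t.__name__)
    return names
```

Note one behavioral difference from the truthiness check: with "is not None", an empty list now yields an empty set instead of None.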
    self._channel_options = channel_options
    self._stop_timeout = stop_timeout
    # Track in-flight activity executions for graceful draining
    import threading as _threading
Move this import to the top of the file 🙏
    current_reader_thread.start()
    loop = asyncio.get_running_loop()
    while not self._shutdown.is_set():
        try:
I don't see why this try was removed. If I understand correctly, the exceptions that were caught here will now be caught outside the while loop, right? Why is this preferred now?
    """
    end: Optional[float] = None
    if timeout is not None:
        import time as _t
Move all the imports to the top please
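A sketch of the drain logic with the imports at module level, as requested. InFlightTracker and its method names are hypothetical and not the worker's actual attributes; only the end/timeout deadline pattern is taken from the diff.

```python
import threading
import time
from typing import Optional


class InFlightTracker:
    """Hypothetical helper that counts running activities for draining."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._count = 0

    def start(self) -> None:
        with self._cond:
            self._count += 1

    def finish(self) -> None:
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_idle(self, timeout: Optional[float] = None) -> bool:
        """Block until nothing is in flight; False if the timeout elapsed."""
        end: Optional[float] = None
        if timeout is not None:
            end = time.monotonic() + timeout
        with self._cond:
            while self._count > 0:
                remaining = None if end is None else end - time.monotonic()
                if remaining is not None and remaining <= 0:
                    return False
                self._cond.wait(remaining)
            return True
```

time.monotonic() is used for the deadline so that wall-clock adjustments cannot shorten or extend the wait.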
This is a split from asyncio PR #13. Removing changes not related to asyncio.