-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ActorInit] Fix Bug in Actor creation #32277
Conversation
python/ray/exceptions.py
Outdated
@@ -263,14 +263,16 @@ class RayActorError(RayError): | |||
but it is there as a safety check. | |||
""" | |||
|
|||
def __init__(self, cause: Union[RayTaskError, ActorDiedErrorContext] = None): | |||
def __init__(self, cause: Union[RayTaskError, ActorDiedErrorContext, str] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. But it's better to fix it at this line https://github.com/ray-project/ray/blob/master/python/ray/_raylet.pyx#L778
by passing an ActorDiedErrorContext instead of adding a new type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern with doing this is that the RayActorError
will not have actor_init_failed=True
.
I tried updating this function to return a RayTaskError
instead, but then only the traceback is only printed in the str() call, not the cause
.
I could do something hacky like:
raise RayActorError.from_task_error(
RayTaskError(
function_name=f"{class_name}.__init__",
traceback_str=f"Failed to create the actor {core_worker}. "
"The failure reason is that you set the async flag, "
"but the actor has no any coroutine function.",
cause=""
)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should set actor_init_failed=True
because this flag indicates that the actor failed when executing __init__
.
For our case, the actor should fail before executing __init__
.
CC @rkooo567 for double check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! Thanks for the clarification @jovany-wang! Will move this to ActorDiedErrorContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Ian <ian.rodney@gmail.com>
This reverts commit f3ed4cbcc5310c49112aed1ffb3a302fb70cebbb. Signed-off-by: Ian <ian.rodney@gmail.com>
Signed-off-by: Ian <ian.rodney@gmail.com>
8152170
to
ee47c76
Compare
Failed tests are flaky/failing on master. |
In ray-project#28149 RayActorError is called with a str as cause, but this is not an accepted type. This leads to hitting the assertion error in the else case: assert isinstance(cause, ActorDiedErrorContext) on L283. Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Why are these changes needed?
RayActorError
with a str as cause, but this is not an accepted type. This leads to hitting the assertion error in theelse
case:assert isinstance(cause, ActorDiedErrorContext)
on L283.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.