Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow try/except across @rule methods and intrinsics #10954

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Oct 13, 2020

Problem

Closes #8706.

Why Exception Handling is Good for Plugin Authors

If anything goes wrong in a v2 @rule or in any pants intrinsics, pants will immediately stop all @rule executions and raise the error up to the top level. Even with mypy enabled, idiomatic python code still uses exception handling, so this restriction may be somewhat awkward for plugin authors. This is one of the few remaining elements of normal python code that is not allowed within @rules. It may be difficult to make use of 3rdparty python libraries which use exceptions without this feature.

Idiomatic python code would still prefer to "ask forgiveness rather than permission" (very unfortunate phrasing which Guido van Rossum has since disavowed), and write code for the happy path, and handle the failure case when needed. Since python doesn't have rust-style enums, it is still clumsy to try to turn errors into data. Therefore, it may become more natural to write easily-memoizable pure functions in python when exception handling is enabled.

Why Exception Handling is Bad for Plugin Authors

It should be noted that the v2 engine intentionally (to an extent) forces plugin authors to express rules through a more functional API than they may be used to from other python code, requiring pure functions, for example. This is key to the engine's ability to parallelize and memoize build steps.

Because a @rule which raises an exception, or fails to catch a raised exception from an await Get(...), is still not memoized, we should encourage plugin authors to avoid trying to catch exceptions from deeply-nested @rule invocations, and to handle exceptions close to where they are raised.

Why Exception Handling is Good for Pants Contributors

It is currently annoying and time-consuming to generate error messages for intrinsics implemented in rust. We would like to be able to focus on just generating the minimum amount of information necessary in rust to represent the error, and do the work of formatting that error nicely in python code.

Solution

  • Create the externs::InvocationResult enum, which holds either a successful return value, or a raised exception.
  • In Native.generator_send(), call .send() to send an argument to a coroutine if the argument was a successful return value, or .throw() if it was an exception.
    • In the same method, add a ._formatted_traceback attribute to all exceptions raised from @rule methods.
  • Create core::Failure::join_tracebacks() to join the engine and python tracebacks of any exceptions which were raised when trying to handle another exception.
  • If any Get within a MultiGet fails, raise that error and discard the other results.
  • Convert any Failures from intrinsics into an InvocationResult so they can also be caught.
  • Test that raising an exception while handling another exception is represented clearly to the user.
  • Test that exceptions are still not memoized in the graph.
  • Test that returning an incorrect product type is converted into a TypeError.

Result

We can now do this in @rules:

@rule
async def catch_merge_digests_error(file_input: FileInput) -> MergedOutput:
    # Create two separate digests writing different contents to the same file path.
    input_1 = CreateDigest((FileContent(path=file_input.filename, content=b"yes"),))
    input_2 = CreateDigest((FileContent(path=file_input.filename, content=b"no"),))
    digests = await MultiGet(Get(Digest, CreateDigest, input_1), Get(Digest, CreateDigest, input_2))
    try:
        merged = await Get(Digest, MergeDigests(digests))
    except Exception as e:
        raise Exception(f"error merging digests for input {file_input}: {e}")
    return MergedOutput(merged)

@benjyw
Copy link
Contributor

benjyw commented Oct 13, 2020

I don't know enough about this part of the codebase to actually review, but thanks for the comprehensive and helpful PR description. Makes it easy to understand what this change is for. Just to clarify, we could before, and can still, handle exceptions in the rule in which they are raised, right? This is about handling exceptions across rule boundaries?

Re "we should clearly document how its overuse can counteract several of the properties we've worked hard to build out in the engine." what would be an example? How do we balance those against the benefits?

@cosmicexplorer
Copy link
Contributor Author

Makes it easy to understand what this change is for.

Noted! :D thanks!

Just to clarify, we could before, and can still, handle exceptions in the rule in which they are raised, right? This is about handling exceptions across rule boundaries?

Yes, that's correct. We can also handle exceptions within and from non-@rule functions called by rules, e.g. catching SyntaxError when parsing imports for dependency inference. Pants right now will however take any exception that isn't caught within a @rule itself (or any exception raised within an intrinsic) and immediately bubble it up to the top level in a way that can't be handled.

Re "we should clearly document how its overuse can counteract several of the properties we've worked hard to build out in the engine." what would be an example? How do we balance those against the benefits?

Yes, that part was exceedingly vague and written too quickly. I will think on this and clarify that.

@cosmicexplorer
Copy link
Contributor Author

Have modified the OP to read:

Because a @rule which raises an exception, or fails to catch a raised exception from an await Get(...), is still not memoized, we should encourage plugin authors to avoid trying to catch exceptions from deeply-nested @rule invocations, and to handle exceptions close to where they are raised.

def generator_throw(self, func: RuleCoroutine, arg: Exception) -> EngineGeneratorResult:
return self._generator_send_or_throw_impl(func, arg, self._do_throw)

def _generator_send_or_throw_impl(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a concrete suggestion here, but I have a gut feeling that the logic around using one of generator_send or generator_throw to call into _generator_send_or_throw_impl could possibly be simplified. RuleCoroutine seems like a rather complicated type definition, and I wonder if it's possible to adjust how the rust code invokes these two functions so that we don't need to define it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have slimmed down the type definitions, removed the separate generator_throw() method, and moved the types closer to this function definition. Let me know if this seems to make sense.

@cosmicexplorer cosmicexplorer force-pushed the propagate-exception-stacktrace branch 6 times, most recently from e2f6ef7 to 68fb5b4 Compare October 15, 2020 05:36
@@ -77,6 +63,8 @@ class Externs:
def __init__(self, lib: ModuleType):
self.lib = lib

# NB: This flag is exposed for testing error handling in CFFI methods. This should never be set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to modify our tests so we can get rid of this, but thanks for marking it as testing-only in the meantime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the exception sink tests are skipped anyway, so hopefully that won't be too difficult.

@gshuflin
Copy link
Contributor

Closes #8706.

It would be good to include a short blurb of text here explaining what this issue is about, so people can get a rough idea without clicking the link.

Idiomatic python code would still prefer to "ask forgiveness rather than permission" (which is an extremely unfortunate phrase that may be against our code of conduct), and write code for the happy path, and handle the failure case when needed.

We should avoid making judgements about possible pants code of conduct violations in pants-external content, unless there's a reason it's specifically relevant to the pants project.

@cosmicexplorer
Copy link
Contributor Author

We should avoid making judgements about possible pants code of conduct violations in pants-external content, unless there's a reason it's specifically relevant to the pants project.

This confuses me because I seem to recall a witch hunt on pants-committers which abused the pants code of conduct to rule on pants-external content instead of just talking to me like a normal human being, which was an exceedingly traumatizing process. I wanted to emphasize it in case someone reads over the PR content and feels like this community is unsafe for them. I have clarified that it is generally accepted that this phrasing is unfortunate.

@gshuflin
Copy link
Contributor

This looks good to me once CI is passing

@stuhood
Copy link
Member

stuhood commented Oct 20, 2020

FWIW, I'm in favor of this one as a way to avoid too many contortions to add context to error messages. Haven't reviewed yet, but can this afternoon.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Would be good to see if at least one of the new types could be inlined or removed to try to reduce complexity a bit here, but other than that it looks great!

Comment on lines +67 to +74
def create_value_error(self, msg: str) -> IntrinsicError:
"""Given a utf8 message string, create an exception signalling a value error."""
return IntrinsicError(msg)

def create_type_error(self, msg: str) -> IncorrectProductError:
"""Given a utf8 message string, create an exception signalling a type error."""
return IncorrectProductError(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention for "calling a constructor" has moved toward declaring the type on (Py)Types (

types = PyTypes(
file_digest=FileDigest,
snapshot=Snapshot,
paths=Paths,
file_content=FileContent,
digest_contents=DigestContents,
path_globs=PathGlobs,
merge_digests=MergeDigests,
add_prefix=AddPrefix,
remove_prefix=RemovePrefix,
create_digest=CreateDigest,
digest_subset=DigestSubset,
download_file=DownloadFile,
platform=Platform,
multi_platform_process=MultiPlatformProcess,
process_result=FallibleProcessResultWithPlatform,
coroutine=CoroutineType,
session_values=SessionValues,
interactive_process_result=InteractiveProcessResult,
engine_aware_parameter=EngineAwareParameter,
)
and
py_class!(class PyTypes |py| {
data types: RefCell<Option<Types>>;
def __new__(
_cls,
file_digest: PyType,
snapshot: PyType,
paths: PyType,
file_content: PyType,
digest_contents: PyType,
path_globs: PyType,
merge_digests: PyType,
add_prefix: PyType,
remove_prefix: PyType,
create_digest: PyType,
digest_subset: PyType,
download_file: PyType,
platform: PyType,
multi_platform_process: PyType,
process_result: PyType,
coroutine: PyType,
session_values: PyType,
interactive_process_result: PyType,
engine_aware_parameter: PyType
) -> CPyResult<Self> {
Self::create_instance(
py,
RefCell::new(Some(Types {
directory_digest: externs::type_for(py.get_type::<externs::fs::PyDigest>()),
file_digest: externs::type_for(file_digest),
snapshot: externs::type_for(snapshot),
paths: externs::type_for(paths),
file_content: externs::type_for(file_content),
digest_contents: externs::type_for(digest_contents),
path_globs: externs::type_for(path_globs),
merge_digests: externs::type_for(merge_digests),
add_prefix: externs::type_for(add_prefix),
remove_prefix: externs::type_for(remove_prefix),
create_digest: externs::type_for(create_digest),
digest_subset: externs::type_for(digest_subset),
download_file: externs::type_for(download_file),
platform: externs::type_for(platform),
multi_platform_process: externs::type_for(multi_platform_process),
process_result: externs::type_for(process_result),
coroutine: externs::type_for(coroutine),
session_values: externs::type_for(session_values),
interactive_process_result: externs::type_for(interactive_process_result),
engine_aware_parameter: externs::type_for(engine_aware_parameter),
})),
)
}
});
) and then calling it via unsafe_call: eg:
externs::unsafe_call(
context.core.types.platform,
&[externs::store_utf8(&platform_name)],
),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to try to make this change!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

src/python/pants/engine/internals/scheduler_test.py Outdated Show resolved Hide resolved
src/rust/engine/src/core.rs Outdated Show resolved Hide resolved
src/rust/engine/src/externs/mod.rs Outdated Show resolved Hide resolved
src/rust/engine/src/nodes.rs Outdated Show resolved Hide resolved
src/rust/engine/src/scheduler.rs Outdated Show resolved Hide resolved
@cosmicexplorer cosmicexplorer force-pushed the propagate-exception-stacktrace branch 2 times, most recently from f74a99c to ff964d3 Compare November 17, 2020 04:31
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

Comment on lines +1006 to +1007
/// Execute the @rule function with its resolved params, converting any Result<Value, Failure> into a
/// Result<Value, Failure> to avoid lots of wrapping and unwrapping.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment might be stale now?

Base automatically changed from master to main March 19, 2021 19:20
@thejcannon
Copy link
Member

@cosmicexplorer I'm gonna close this stale PR. GitHub ensures it is immortalized, so if it ever needs to be picked up again, your changes haven't gone anywhere.

@thejcannon thejcannon closed this Jun 15, 2022
kaos added a commit to kaos/pants that referenced this pull request Jan 2, 2023
kaos added a commit that referenced this pull request Jan 4, 2023
* reviving #10954, superseding #17745, closing #17910 

Co-authored-by: Danny McClanahan <1305167+cosmicexplorer@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants