Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix display of "Snapshotting" in dynamic UI #10797

Merged
merged 4 commits into from
Sep 24, 2020

Conversation

gshuflin
Copy link
Contributor

@gshuflin gshuflin commented Sep 16, 2020

Problem

A recent change to the string representation of the Key type affected the user_facing_name of the Snapshot variant of NodeKey, resulting in the dynamic UI displaying "Snapshotting: PathGlobs" rather than a useful path.

Solution

Irrespective of the string representation of a, Key, the Snapshot variant of NodeKey doesn't need to contain a Key at all; it can contain a PathGlobs object. The code that ran a Snapshot Node was already parsing the Key into a PathGlobs (and handling potential errors in that process), so it was easy enough to move that parsing into the location where we create a Snapshot from a Python Value. Several traits had to be implemented on PathGlobs to maintain the trait implementations that Snapshot already had.

Result

Restores the correct behavior for printing the path(s) in a Snapshot in the dynamic UI.

@gshuflin
Copy link
Contributor Author

https://asciinema.org/a/360203 shows that snapshot paths are being printed in the dynamic UI again.

@Eric-Arellano
Copy link
Contributor

Excellent! There are still some issues, such as the Finding files (FileListing) node, but that doesn't need to block. This is a great improvement.

Screen Shot 2020-09-16 at 5 32 47 PM

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I think I see from here how I would fix the Paths node.

@@ -388,7 +388,7 @@ pub enum SymlinkBehavior {
Oblivious,
}

#[derive(Debug)]
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised to see Eq implemented - I was reading that that's barely implemented.

Out of curiosity, how do you determine when you do and do not do it? I've read about the difference, but am curious if you have any heuristics you rely on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potentially large object: you should probably not implement Eq/Hash on it. The reason Key was used in this position is that it is much, much smaller (just an integer).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, how do you determine when you do and do not do it? I've read about the difference, but am curious if you have any heuristics you rely on.

In this case it was because Snapshot already implemented Eq, and I didn't dig into why that's the case; maybe PartialEq would be enough. The distinction between PartialEq and Eq in Rust basically exists in order to support the floating point number types; IEEE 754 specifies that floating-point numbers must satisfy NaN != NaN, which is inconsistent with the requirement on the Eq trait that a == a for all a, which PartialEq doesn't have. Since these types aren't floating point numbers and don't have that weird quirk of floating point numbers, I don't think the distinction matters for anything we're doing.

@coveralls
Copy link

coveralls commented Sep 17, 2020

Coverage Status

Coverage remained the same at 0.0% when pulling 69957e4 on gshuflin:snapshot_display into 7f7f73d on pantsbuild:master.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that impl Display for Key should actually display the Python repr for the Key.

If there are particular codepaths (such as the one for Snapshot, or other workunit rendering code) that don't want to display the "entire" underlying object, I think that they should use the EngineAware trait to conditionally render Key's/Values based on their types.

A middle ground would be to have impl Display for Key use the EngineAware trait under the hood... I'm less sure that that is a good behavior though, because it feels like the EngineAware trait should be specific to workunit creation.

@@ -388,7 +388,7 @@ pub enum SymlinkBehavior {
Oblivious,
}

#[derive(Debug)]
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potentially large object: you should probably not implement Eq/Hash on it. The reason Key was used in this position is that it is much, much smaller (just an integer).

@gshuflin
Copy link
Contributor Author

I'm pretty sure that impl Display for Key should actually display the Python repr for the Key.

If there are particular codepaths (such as the one for Snapshot, or other workunit rendering code) that don't want to display the "entire" underlying object, I think that they should use the EngineAware trait to conditionally render Key's/Values based on their types.

A middle ground would be to have impl Display for Key use the EngineAware trait under the hood... I'm less sure that that is a good behavior though, because it feels like the EngineAware trait should be specific to workunit creation.

This is what you suggested on #10658, right? I think this might make sense - change the Display impl to use the EngineAware.debug_hint method if it exists, and display the entire representation of a value if it doesn't, rather than just the type name. This means that if we discover some area of pants debug logging where we print a gigantic python value representation, we can fix it by implementing EngineAware.debug_hint for whatever type that is, which will require a commit but is more straightforward than messing with the engine.

That said, I still think it makes sense not to use a Key as the inner type for the Snapshot variant. When we create a Snapshot value in path_globs_to_digest, we have the opaque Python value that will eventually become a Rust PathGlobs object. It makes sense that we should do all the validation of that Python object there (and fail immediately if it's not a valid PathGlobs python object), rather than carrying around an opaque Key within Snapshot and doing the potentially-failing operation of turning it into a PathGlobs later.

PathGlobs also strikes me as likely to not be all that big - it's basically two boolean-equivalent types and a Vec<String>. If we could control the size of that Vec<String>, we would be working with a pretty small type. Maybe we could make it an Arc<Vec<String>> (or if there are good reasons not to want PathGlobs itself to work that way, we could just make Snapshot something like : Snapshot { globs: Arc<Vec<String>>, strict_match_behavior: StrictGlobMatching, conjunction: GlobExpansionConjunction}, and have an infallible function Snapshot -> PathGlobs). What do you think @stuhood @Eric-Arellano ?

@stuhood
Copy link
Member

stuhood commented Sep 21, 2020

A middle ground would be to have impl Display for Key use the EngineAware trait under the hood... I'm less sure that that is a good behavior though, because it feels like the EngineAware trait should be specific to workunit creation.

This is what you suggested on #10658, right? I think this might make sense - change the Display impl to use the EngineAware.debug_hint method if it exists, and display the entire representation of a value if it doesn't, rather than just the type name. This means that if we discover some area of pants debug logging where we print a gigantic python value representation, we can fix it by implementing EngineAware.debug_hint for whatever type that is, which will require a commit but is more straightforward than messing with the engine.

Maybe. At that point you have two ways to change the display implementation of the type: 1) changing __repr__, 2) implementing EngineAware.debug_hint. You'd want to be able to motivate having those be different in some cases.

But as I mentioned in the second half of the thing you quoted: the EngineAware trait feels workunit specific, and I'm not sure that it makes sense to use it to implement Display in all cases. But I don't feel strongly about it.

PathGlobs also strikes me as likely to not be all that big

This change feels like it should be tackled independently of the Display issue, in other PRs that do that for some particular performance or correctness reason.

@gshuflin
Copy link
Contributor Author

PathGlobs also strikes me as likely to not be all that big

This change feels like it should be tackled independently of the Display issue, in other PRs that do that for some particular performance or correctness reason.

Well in this case the motivation to change the interior type of Snapshot was motivated by a correctness reason - because of how we were printing the string representation of a Snapshot, we were no longer showing the file system paths that a given Snapshot value was reading from the disk in the dynamic UI. The actual fix in this PR is changing how user_facing_paths works for Snapshot. Simultaneously I realized that Snapshot(Key) was too general of a type, given that Snapshot only should contain one specific Python type and we can enforce that with the Rust type system. I think this is a separate issue from figuring out exactly where we should use EngineAware methods when we're printing the string value of a Key in general.

[ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
[ci skip-build-wheels]
[ci skip-build-wheels]
@gshuflin gshuflin merged commit 703a509 into pantsbuild:master Sep 24, 2020
@gshuflin gshuflin deleted the snapshot_display branch September 24, 2020 02:39
Eric-Arellano pushed a commit to Eric-Arellano/pants that referenced this pull request Sep 24, 2020
### Problem

A recent change to the string representation of the `Key` type affected the `user_facing_name` of the `Snapshot` variant of `NodeKey`, resulting in the dynamic UI displaying "Snapshotting: PathGlobs" rather than a useful path.

### Solution

Irrespective of the string representation of a, `Key`, the `Snapshot` variant of `NodeKey` doesn't need to contain a `Key` at all; it can contain a `PathGlobs` object. The code that ran a `Snapshot` Node was already parsing the `Key` into a `PathGlobs` (and handling potential errors in that process), so it was easy enough to move that parsing into the location where we create a `Snapshot` from a Python `Value`. Several traits had to be implemented on `PathGlobs` to maintain the trait implementations that `Snapshot` already had.

### Result

Restores the correct behavior for printing the path(s) in a `Snapshot` in the dynamic UI.

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants