-
-
Notifications
You must be signed in to change notification settings - Fork 645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Anonymize sandbox paths instead of stripping completely #20492
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this!
Would reformat src/python/example.py | ||
Would reformat test.py | ||
Would reformat custom_tmpdir.py | ||
Would reformat /<sandbox-1>/src/python/example.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, looking at this I wonder if this may be suboptimal for a few reasons:
- editors that scan tooling output for paths are less likely to understand these as referring to something within the project being edited
- humans may be more confused e.g. someone who's just been told to use this "pants" tool by a team lead won't necessarily know where these non-existing paths are coming from
- any adhoc scripts that parse output may be broken
Some ideas, not mutually exclusive:
- only add the disambiguation if there's more than one sandbox (doesn't fully solve the "can't trust the output" problem, though)
- format it differently, e.g.
(sandbox-1) src/python/example.py
so that the sandbox disambiguation isn't part of the path - a reference to
pants
in the replacement, so it's obvious where it comes from (e.g.pants-sandbox-1
) - Something else?
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<sandbox-1>
was meant to tackle 2, as it's clearly not-a-dir. I think (1) is a no-go from my POV, that's back to the original issue of "my printed env has the wrong paths?!?". (3) is fine, that's a minor edit to the wording.
Other things... I don't have lots of good ideas. One could maybe imagine some blanking characters instead with full paths, or /tmp/****/
-- pick a random blanking char -- but I imagine that's equally confusing. I did consider making full/partial blanking an option (i.e. pants.toml) but that would be some heavy refactoring at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated it to use <pants-sandbox-##>
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try it. I'm still a bit nervous about the potential impact of the path-like text on IDEs and scripts, but the current situation needs to be improved, so let's try and we can always iterate if people find issues in testing pre-releases 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me.
Looking to future PRs: I suspect that we'll need to add support for detecting these paths (to hyperlink them) in IDE plugins for JetBrains IDEs and possibly others if they do something special with paths in tool output.
When you have a chance, please merge main (or rebase onto it) and add some release notes to docs/notes/2.22.x.md. See #20888 for more info. |
2b8dfb1
to
34c3a56
Compare
@huonw Notes added, PTAL before I merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just some minor tweaks (thanks for being one of the early guinea pigs with this new process)
## Bugfixes | ||
|
||
- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. (#20492) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sort of cross-cutting thing often goes in a dedicated "overall" section: https://github.com/pantsbuild/pants/blob/main/docs/notes/2.20.x.md#overall
Maybe you could add one after "new options system" and move it up there? (I personally find it easier to read release notes where the changes are grouped by area rather than by "size"/"type")
Also, I think this may soon (pantsbuild/pantsbuild.org#184) be rendered in places that don't auto-link the GitHub issues:
## Bugfixes | |
- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. (#20492) | |
- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. ([#20492](https://github.com/pantsbuild/pants/pull/20492)) |
(I note that we haven't historically been linking the PRs/issues for everything, although maybe it'd be good to do more consistently.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think this may soon (pantsbuild/pantsbuild.org#184) be rendered in places that don't auto-link the GitHub issues:
Or, maybe the doc site renderer we use (docusaurus) can do auto-linking. I'm not sure, but it seems plausible 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had the same thought about auto linking during doc site rendering.. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, maybe the doc site renderer we use (docusaurus) can do auto-linking. I'm not sure, but it seems plausible
I did a quick look through their docs and source code, and I couldn't find any indication of auto-linking support, so we'd have to write a plugin (and/or contribute it upstream)... so, for now, let's just write the links manually to ensure they work everywhere?
I've filed pantsbuild/pantsbuild.org#203
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've just branched for 2.22, so merging this pull request now will come out in 2.23, please move the release notes updates to docs/notes/2.23.x.md
. Thank you!
We've just branched for 2.23, so merging this pull request now will come out in 2.24, please move the release notes updates to |
This is a partial reversal of #19923 which ended up stripping all mentions the sandbox in such a way that debugging some issues becomes very hard. For example this thread on Slack, which was spawned by this thread. In both cases, the stripping of paths led to undebuggable issues because I couldn't trust the output and I couldn't see if paths were correct.
This PR attempts to strike a middle ground where we anonymize the sandbox paths. In practice, this should have as high cache hit rate as before but should leave some breadcrumbs when multiple sandboxes might be involved, and show that absolute paths are absolute, and so on.
Also converts the regex to a non-greedy variant, as it'd otherwise strip too much in a string like
/tmp/pants-sandbox-.../:/tmp/pants-sandbox/...
, removing full first item. Now it will take as few characters as possible, which should behave better.