Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anonymize sandbox paths instead of stripping completely #20492

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

tgolsson
Copy link
Contributor

@tgolsson tgolsson commented Feb 5, 2024

This is a partial reversal of #19923 which ended up stripping all mentions the sandbox in such a way that debugging some issues becomes very hard. For example this thread on Slack, which was spawned by this thread. In both cases, the stripping of paths led to undebuggable issues because I couldn't trust the output and I couldn't see if paths were correct.

This PR attempts to strike a middle ground where we anonymize the sandbox paths. In practice, this should have as high cache hit rate as before but should leave some breadcrumbs when multiple sandboxes might be involved, and show that absolute paths are absolute, and so on.

Also converts the regex to a non-greedy variant, as it'd otherwise strip too much in a string like /tmp/pants-sandbox-.../:/tmp/pants-sandbox/..., removing full first item. Now it will take as few characters as possible, which should behave better.

@tgolsson tgolsson requested review from kaos and huonw February 5, 2024 21:17
@tgolsson tgolsson added the category:internal CI, fixes for not-yet-released features, etc. label Feb 5, 2024
Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this!

src/python/pants/util/strutil.py Outdated Show resolved Hide resolved
src/python/pants/util/strutil.py Outdated Show resolved Hide resolved
Would reformat src/python/example.py
Would reformat test.py
Would reformat custom_tmpdir.py
Would reformat /<sandbox-1>/src/python/example.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, looking at this I wonder if this may be suboptimal for a few reasons:

  • editors that scan tooling output for paths are less likely to understand these as referring to something within the project being edited
  • humans may be more confused e.g. someone who's just been told to use this "pants" tool by a team lead won't necessarily know where these non-existing paths are coming from
  • any adhoc scripts that parse output may be broken

Some ideas, not mutually exclusive:

  1. only add the disambiguation if there's more than one sandbox (doesn't fully solve the "can't trust the output" problem, though)
  2. format it differently, e.g. (sandbox-1) src/python/example.py so that the sandbox disambiguation isn't part of the path
  3. a reference to pants in the replacement, so it's obvious where it comes from (e.g. pants-sandbox-1)
  4. Something else?

Thoughts?

Copy link
Contributor Author

@tgolsson tgolsson Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<sandbox-1> was meant to tackle 2, as it's clearly not-a-dir. I think (1) is a no-go from my POV, that's back to the original issue of "my printed env has the wrong paths?!?". (3) is fine, that's a minor edit to the wording.

Other things... I don't have lots of good ideas. One could maybe imagine some blanking characters instead with full paths, or /tmp/****/ -- pick a random blanking char -- but I imagine that's equally confusing. I did consider making full/partial blanking an option (i.e. pants.toml) but that would be some heavy refactoring at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it to use <pants-sandbox-##> instead.

Copy link
Member

@kaos kaos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

src/python/pants/util/strutil_test.py Outdated Show resolved Hide resolved
@tgolsson tgolsson requested a review from huonw February 6, 2024 20:18
Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try it. I'm still a bit nervous about the potential impact of the path-like text on IDEs and scripts, but the current situation needs to be improved, so let's try and we can always iterate if people find issues in testing pre-releases 👍

Copy link
Member

@cognifloyd cognifloyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me.

Looking to future PRs: I suspect that we'll need to add support for detecting these paths (to hyperlink them) in IDE plugins for JetBrains IDEs and possibly others if they do something special with paths in tool output.

@huonw
Copy link
Contributor

huonw commented May 10, 2024

When you have a chance, please merge main (or rebase onto it) and add some release notes to docs/notes/2.22.x.md. See #20888 for more info.

@tgolsson tgolsson force-pushed the ts/keep-sandbox-prefix branch from 2b8dfb1 to 34c3a56 Compare May 13, 2024 16:34
@tgolsson
Copy link
Contributor Author

@huonw Notes added, PTAL before I merge?

Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some minor tweaks (thanks for being one of the early guinea pigs with this new process)

Comment on lines +71 to +73
## Bugfixes

- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. (#20492)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of cross-cutting thing often goes in a dedicated "overall" section: https://github.com/pantsbuild/pants/blob/main/docs/notes/2.20.x.md#overall

Maybe you could add one after "new options system" and move it up there? (I personally find it easier to read release notes where the changes are grouped by area rather than by "size"/"type")

Also, I think this may soon (pantsbuild/pantsbuild.org#184) be rendered in places that don't auto-link the GitHub issues:

Suggested change
## Bugfixes
- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. (#20492)
- Sandboxing: Log output from sandboxes will contain anonymized paths instead of completely removing the sandbox prefix. ([#20492](https://github.com/pantsbuild/pants/pull/20492))

(I note that we haven't historically been linking the PRs/issues for everything, although maybe it'd be good to do more consistently.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think this may soon (pantsbuild/pantsbuild.org#184) be rendered in places that don't auto-link the GitHub issues:

Or, maybe the doc site renderer we use (docusaurus) can do auto-linking. I'm not sure, but it seems plausible 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had the same thought about auto linking during doc site rendering.. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, maybe the doc site renderer we use (docusaurus) can do auto-linking. I'm not sure, but it seems plausible

I did a quick look through their docs and source code, and I couldn't find any indication of auto-linking support, so we'd have to write a plugin (and/or contribute it upstream)... so, for now, let's just write the links manually to ensure they work everywhere?

I've filed pantsbuild/pantsbuild.org#203

Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've just branched for 2.22, so merging this pull request now will come out in 2.23, please move the release notes updates to docs/notes/2.23.x.md. Thank you!

@huonw
Copy link
Contributor

huonw commented Sep 11, 2024

We've just branched for 2.23, so merging this pull request now will come out in 2.24, please move the release notes updates to docs/notes/2.24.x.md. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:internal CI, fixes for not-yet-released features, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants