Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use allow-dirty option in cargo package to skip vcs checks #6280

Merged
merged 2 commits into from
Nov 29, 2018

Conversation

zachreizner
Copy link
Contributor

@zachreizner zachreizner commented Nov 7, 2018

If cargo package is run for a crate that is not version controlled in
a way that cargo expects, git2 will overreach upwards and grab a
irrelevant .git repo. This change uses --allow-dirty to imply to cargo package that it should not be checking for version control information, fixing this issue.

@dwijnand
Copy link
Member

dwijnand commented Nov 8, 2018

r=me, but there are general concerns on introducing too many flags to cargo and also there's a soft feature freeze in effect, so I'm not sure if we should land this.

But the change itself makes sense and looks reasonable! 🙂

@zachreizner
Copy link
Contributor Author

This isn't so much adding a feature as it is removing a feature that was introduced in an August commit. The latest cargo introduced this vcs behavior in cargo package and ended up breaking our continuous integration of Rust packages in Chrome OS. It's incredibly important that CI for rust packages works on Chrome OS, so I ask that you may an exception to the soft feature freeze. I would be happy to discuss this further, if need be.

@Eh2406
Copy link
Contributor

Eh2406 commented Nov 8, 2018

@zachreizner, Do you happen to know what PR caused the problem? (You say it is from August so I thought you may know.) Can you make a reproducible example, so that we can track it down? Do you have more information on what problems the current behavior is causing you? Can you add a test to this PR?

@zachreizner
Copy link
Contributor Author

The PR was #5886. I don't have a good example, but I will work on one if that helps.
The problem with the behavior of cargo package today is that it violates the portage ebuild sandbox used to verify that a package (in the linux package manage sense) is only using files inside of its own working directory during the build phase. The way the sandbox works is by checking what files get touched by the build process, rather than outright preventing such invalid access. I know this system sounds rather fragile, but it's important for saving headaches later down the line with non-reproducible builds. Also, every other build system is able to stay within its working directory, and I think cargo should be no different.

@dwijnand
Copy link
Member

dwijnand commented Nov 8, 2018

I think we should be able to fix this by pushing the check_repo_state call under opts.list (which early returns) and under if !opts.allow_dirty. Wouldn't that fix your use case @zachreizner? Can anyone spot any drawbacks?

@dwijnand
Copy link
Member

dwijnand commented Nov 8, 2018

Soft echo that accompanying this the change with a test would be 🔝.

@zachreizner
Copy link
Contributor Author

When you say under if !opts.allow_dirty, do you mean that if --allow-dirty is passed in, no vcs info checking will occur? If so, that seems like it would solve the issue.

@dwijnand
Copy link
Member

dwijnand commented Nov 8, 2018

Want to try a local build with that change to confirm?

@alexcrichton
Copy link
Member

Thanks for the PR @zachreizner! I think we're always up for taking PRs that fix bugs!

I would personally be inclined, though, to fix this perhaps not with an option to cargo publish but rather with better inference behavior on behalf of Cargo. This seems like a flag that would only be passed in pretty niche use cases, so it'd be best if we didn't have to add a new flag at all!

For example I think we could improve our inference and say something like:

  • Take the current package being published and its workspace root. The manifests of these two packages have a common path as an ancestor.
  • Walk up the filesystem looking for a git repository to at most that common ancestor, but look no further.

I think that would help prevent reaching too high in your case perhaps? (most crates are their own workspace so they wouldn't look much farther than that directory)

@zachreizner
Copy link
Contributor Author

I apologize, but my analysis of the problem our build system was facing is slightly incorrect. The issue is actually that the ~/trunk/src/platform2 directory, which is where many of our Rust and non-Rust packages are placed, has a .git directory that looks like this:

~/trunk/src/platform2 $ ls -lh .git
-rw-r--r-- 1 zachr eng 1.2K Nov  8 16:45 COMMIT_EDITMSG
lrwxrwxrwx 1 zachr eng   48 Oct 30  2015 config -> ../../../.repo/projects/src/platform2.git/config
lrwxrwxrwx 1 zachr eng   67 Oct 30  2015 description -> ../../../.repo/project-objects/chromiumos/platform2.git/description
-rw-r--r-- 1 zachr eng  128 Oct  3 16:08 FETCH_HEAD
-rw-r--r-- 1 zachr eng   32 Nov  7 11:29 HEAD
lrwxrwxrwx 1 zachr eng   61 Oct 30  2015 hooks -> ../../../.repo/project-objects/chromiumos/platform2.git/hooks
-rw-r--r-- 1 zachr eng 573K Nov  8 16:45 index
lrwxrwxrwx 1 zachr eng   60 Oct 30  2015 info -> ../../../.repo/project-objects/chromiumos/platform2.git/info
lrwxrwxrwx 1 zachr eng   46 Oct 30  2015 logs -> ../../../.repo/projects/src/platform2.git/logs
-rw-r--r-- 1 zachr eng    0 Nov  8 16:45 MERGE_RR
lrwxrwxrwx 1 zachr eng   63 Oct 30  2015 objects -> ../../../.repo/project-objects/chromiumos/platform2.git/objects
-rw-r--r-- 1 zachr eng   41 Nov  7 11:29 ORIG_HEAD
lrwxrwxrwx 1 zachr eng   53 Oct 30  2015 packed-refs -> ../../../.repo/projects/src/platform2.git/packed-refs
lrwxrwxrwx 1 zachr eng   46 Oct 30  2015 refs -> ../../../.repo/projects/src/platform2.git/refs
lrwxrwxrwx 1 zachr eng   64 Oct 30  2015 rr-cache -> ../../../.repo/project-objects/chromiumos/platform2.git/rr-cache
lrwxrwxrwx 1 zachr eng   49 Oct 30  2015 shallow -> ../../../.repo/projects/src/platform2.git/shallow
lrwxrwxrwx 1 zachr eng   59 Oct 30  2015 svn -> ../../../.repo/project-objects/chromiumos/platform2.git/svn

All of those symlinks are outside the sandbox, which is what triggers our problem. Therefore anything that preempts searching inside of the .git directory prevents the sandbox violation. My original PR with --no-vcs does this, as would @dwijnand's suggestion to have allow_dirty preempt any vcs_info checking. @alexcrichton's suggestion to only search as high as the workspace and package's common ancestor would also prevent searching this .git directory, I think. Am I correct in assuming that the simple case for this hypothetical ceiling directory would be the same directory as where the package's Cargo.toml is?

For example, we have a package in ~/trunk/src/platform2/vm_tools/9s with a Cargo.toml file at ~/trunk/src/platform2/vm_tools/9s/Cargo.toml. Assuming no explicit workspace, the ceiling for searching for a .git directory is simply ~/trunk/src/platform2/vm_tools/9s. Does that sound reasonable?

@Eh2406
Copy link
Contributor

Eh2406 commented Nov 9, 2018

If someone is publishing with allow_dirty and a dirty tree then what good is the cargo_vcs_info? So +1 for reusing the allow_dirty flag.

@dekellum
Copy link
Contributor

dekellum commented Nov 9, 2018

Is there an error message available from cargo for the original issue, here? Is it just that the found parent repo is in fact dirty? I don't follow how #5886 (I authored) would have changed that behavior?

@Eh2406: specifically, if check_repo_state finds a repo that is dirty, then cargo_vcs_info is not generated.

@alexcrichton
Copy link
Member

Ok thanks for the explanation @zachreizner, makes sense that my previous suggestion wouldn't work. I think we can't say, though, that a git directory is only searched for in the Cargo.toml directory because crates are often located throughout a git repository (and aren't always in a workspace).

At this point the best solution may be to update cargo and/or git2 to respect env vars like GIT_CEILING_DIRECTORIES, which I believe is already a standard-ish env var in with git, and in this system I imagine y'all could configureGIT_CEILING_DIRECTORIES to be beneath the git root here.

@zachreizner
Copy link
Contributor Author

To summarize we have the following proposals, as I understand them:

  • Add --no-vcs as an explicit flag (my original PR),
  • Reuse --allow-dirty to imply that VCS info should not be checked (@Eh2406's suggestion I think)
  • Respect GIT_CEILING_DIRECTORIES inside of git2 and/or cargo (@alexcrichton's most recent suggestion).
  • Use the common ancestor of Cargo.toml and the workspace directory as an implied GIT_CEILING_DIRECTORIES. (@alexcrichton's first suggestion).

I believe all of these will solve the issue I'm having, and so I've ordered them from easiest to hardest. Which of these solution would the cargo team find most willing to merge?

@dwijnand
Copy link
Member

Looking back I'm not sure my --allow-dirty suggestion is sane (Alex suggestion alternatives is a subtle sign of this 🙂). So probably best look into GIT_CEILING_DIRECTORIES support in git2.

@alexcrichton
Copy link
Member

Sorry I should also elaborate more as well. I would personally prefer to not add a new feature (flag) for this since it seems like a pretty niche feature. I'm trying to brainstorm other ideas we can adapt existing concepts and/or existing ideas to avoid adding a new feature/flag.

I'd personally be ok with the adaptation of --allow-dirty as well, it seems slightly odd to me but so do all the various routes here

@dwijnand
Copy link
Member

dwijnand commented Nov 13, 2018

it seems slightly odd to me but so do all the various routes here

Given GIT_CEILING_DIRECTORIES is in the official git docs it seems like the less ad-hoc solution.

@zachreizner
Copy link
Contributor Author

After considering the options, I realized that GIT_CEILING_DIRECTORIES has a corner case that would still lead to sandbox violations through the .git directory's symlinks if the source directory being packaged has the .git directory. The docs for GIT_CEILING_DIRECTORIES says " It will not exclude the current working directory or a GIT_DIR." The git_repository_discover function has a similar warning.

Because of this realization, I decided to implement the --allow-dirty approach that disables vcs_info gathering. This avoid adding a flag to cargo, and I think that behavior makes sense.

If `cargo package` is run for a crate that is not version controlled in
a way that cargo expects, git2 will overreach upwards and grab a
irrelevant .git repo. This change adds `--no-vcs` as an option to
`cargo package` to prevent this.
To avoid introducing another flag, this change uses the `--allow-dirty`
flag to skip checking for vcs information. This is logical because a
user that passes that flag is indicating to `cargo package` that they
do not care about the state of vcs at that time.
@dtolnay
Copy link
Member

dtolnay commented Nov 28, 2018

LGTM. Invoking with --allow-dirty means that the state of the vcs is not relevant to what is being packaged, so it makes sense not to look at the vcs state.

Copy link
Member

@dwijnand dwijnand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too. Want to update the PR description, then we can merge this.

@zachreizner zachreizner changed the title add --no-vcs option to cargo package use allow-dirty option in cargo package to skip vcs checks Nov 28, 2018
@dtolnay
Copy link
Member

dtolnay commented Nov 29, 2018

Title has been updated. Do we want an update added to #6280 (comment) as well?

@dwijnand
Copy link
Member

Thanks @zachreizner!

@bors: r+

@bors
Copy link
Contributor

bors commented Nov 29, 2018

📌 Commit fd5fb6e has been approved by dwijnand

@bors
Copy link
Contributor

bors commented Nov 29, 2018

⌛ Testing commit fd5fb6e with merge e70bccf...

bors added a commit that referenced this pull request Nov 29, 2018
use allow-dirty option in `cargo package` to skip vcs checks

If `cargo package` is run for a crate that is not version controlled in
a way that cargo expects, git2 will overreach upwards and grab a
irrelevant .git repo. This change uses `--allow-dirty` to imply to `cargo package` that it should not be checking for version control information, fixing this issue.
@bors
Copy link
Contributor

bors commented Nov 29, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: dwijnand
Pushing e70bccf to master...

@bors bors merged commit fd5fb6e into rust-lang:master Nov 29, 2018
bors added a commit to rust-lang/rust that referenced this pull request Dec 4, 2018
Update cargo, rls

26 commits in b3d0b2e545b61d4cd08096911724b7d49d213f73..5e85ba14aaa20f8133863373404cb0af69eeef2c
2018-11-15 19:13:04 +0000 to 2018-12-02 14:37:25 +0000
- ConflictStoreTrie: Faster filtered search (rust-lang/cargo#6366)
- Remove `cmake` as a requirement (rust-lang/cargo#6368)
- progress: display "Downloading 1 crate" instead of "Downloading 1 crates" (rust-lang/cargo#6369)
- Use expect over unwrap, for panic-in-panic aborts (rust-lang/cargo#6364)
- Switch to pretty_env_logger, under --features pretty-env-logger (rust-lang/cargo#6362)
- use allow-dirty option in `cargo package` to skip vcs checks (rust-lang/cargo#6280)
- remove clones made redundant by Intern PackageId (rust-lang/cargo#6352)
- docs: correct profile usage of `cargo test --release` (rust-lang/cargo#6345)
- Improve doc for `cargo install` (rust-lang/cargo#6354)
- Intern PackageId (rust-lang/cargo#6332)
- Clean only release artifacts if --release option is set (rust-lang/cargo#6349)
- remove clones made redundant by Intern SourceId (rust-lang/cargo#6347)
- Intern SourceId (rust-lang/cargo#6342)
- Tweak Layout to allow for non json file targets with internal "." (rust-lang/cargo#6255)
- Correct Target Directory command-line option (rust-lang/cargo#6343)
- Persistent data structures by im-rs (rust-lang/cargo#6336)
- Move command prelude into main library (rust-lang/cargo#6335)
- Distinguish custom build invocations (rust-lang/cargo#6331)
- Allow crate_type=bin examples to run (rust-lang/cargo#6330)
- Make verify-project honour unstable features (rust-lang/cargo#6326)
- Make autodiscovery disable inferred targets (rust-lang/cargo#6329)
- Add `c` alias for `check` (rust-lang/cargo#6218)
- Allow user aliases to override built-in aliases (rust-lang/cargo#6259)
- Fix renaming directory project using build scripts with cross-compiling. (rust-lang/cargo#6328)
- Fix add_plugin_deps-related tests. (rust-lang/cargo#6327)
- Add a glossary. (rust-lang/cargo#6321)
@ehuss ehuss added this to the 1.32.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants