Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vendor: implement --versioned-dirs #7631

Merged
merged 5 commits into from
Dec 16, 2019
Merged

Conversation

jsgf
Copy link
Contributor

@jsgf jsgf commented Nov 26, 2019

Implement --explicit-version from standalone cargo-vendor. This helps with vendoring
performance as it avoids redundantly deleting and re-copying already vendored packages.

For example, re-vendoring cargo's dependencies it makes a big difference in wallclock
time. For initial vendoring it makes no difference, but re-vendoring (ie, when most or all dependencies haven't changed) without explicit versions is actually slightly slower
(5.8s -> 6s), but with explicit versions it goes from 5.8s -> 1.6s.

Timings:

Without explicit versions, initial vendor
real 0m5.810s
user 0m0.924s
sys 0m2.491s

Re-vendor:
real 0m6.083s
user 0m0.937s
sys 0m2.654s

With explicit versions, initial vendor:
real 0m5.810s
user 0m0.937s
sys 0m2.461s

Re-vendor:
real 0m1.567s
user 0m0.578s
sys 0m0.967s

Its interesting to look at the syscall summary:

Without explicit versions:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.17    1.104699          18     59432      1065 openat
 19.86    0.871574          21     41156     13825 unlink
 13.64    0.598739           2    210510           lstat
  9.02    0.395948          29     13208           copy_file_range
  8.00    0.351242          11     30245           read
  6.36    0.279005           3     72487      4476 statx
  5.35    0.235027           6     37219           write
  4.02    0.176267           3     58368           close

with explicit versions:

 29.38    0.419068          15     27798     13825 unlink
 25.52    0.364021           1    209586           lstat
 20.67    0.294788          16     17967      1032 openat
 10.42    0.148586           4     35646           write
  3.53    0.050350           3     13825           chmod
  3.14    0.044786           2     16701      1622 statx
  2.19    0.031171           1     16936           close
  1.86    0.026538          24      1078           rmdir

Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.

@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 26, 2019
@jsgf jsgf force-pushed the explicit-version branch 2 times, most recently from 54e299c to d866352 Compare November 26, 2019 00:57
@alexcrichton
Copy link
Member

Thanks for the PR! I'm a little surprised though that a motivator for this change is performance, vendoring is typically a pretty rare operation, right? This was originally added to cargo vendor specifically for the directory structure rather than performance

Since this is a proposal for a new stable feature, would it be possible to add tests for this as well, exercising this to make sure it works right?

@alexcrichton
Copy link
Member

I think it'd also be good to write up some documentation to explain this feature and why it might be used.

@jsgf
Copy link
Contributor Author

jsgf commented Nov 26, 2019

Yeah, performance is starting to be a blocker with 900+ packages. It probably wouldn't be for normal use, but since I'm developing a tool for vendored package management using cargo vendor as an underpinning, I'm noticing it pretty strongly. I'm on PTO right now so I haven't checked to see how much this PR improves it, but I was surprised at how much difference it make to vendoring cargo's deps.

The main motivations are:

  • Keep directory names stable. The current behaviour causes spurious deltas in source control for no-op changes (both for the vendored crates, and paths that reference them).
  • Consistent naming. It's actually quite annoying that the most recent version can't be easily determined by eye when looking in the directory.
  • Update performance, particularly for small changes. cargo vendor should be O(change) rather than O(total size of vendored set). I suspect most vendored package directories are suffixless (70% for cargo), so its currently closer to the latter than the former.

I'll add some tests and docs.

@jsgf jsgf force-pushed the explicit-version branch 2 times, most recently from 20c7cc4 to a39a151 Compare November 27, 2019 08:36
@jsgf
Copy link
Contributor Author

jsgf commented Nov 27, 2019

Test and doc added. LMK if you think there's more to be tested/documented.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks! I'll post an fcp merge to see what the team thinks, since this'll be a new stable feature

.sp
\fB\-\-explicit\-version\fP
.RS 4
Normally versions are only added to disamiguate multiple versions of the same package.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"disambiguate"

@alexcrichton alexcrichton added the T-cargo Team: Cargo label Dec 2, 2019
@alexcrichton
Copy link
Member

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Dec 2, 2019

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels Dec 2, 2019
@jsgf jsgf force-pushed the explicit-version branch from 07d406a to 42df95c Compare December 2, 2019 23:48
src/etc/man/cargo-vendor.1 Show resolved Hide resolved
src/bin/cargo/commands/vendor.rs Show resolved Hide resolved
@jsgf jsgf force-pushed the explicit-version branch from 42df95c to 0c63364 Compare December 3, 2019 07:40
@jsgf jsgf changed the title vendor: implement --explicit-version vendor: implement --versioned-dirs Dec 4, 2019
@jsgf
Copy link
Contributor Author

jsgf commented Dec 4, 2019

I changed the option to --versioned-dirs

src/etc/man/cargo-vendor.1 Outdated Show resolved Hide resolved
@rfcbot rfcbot added final-comment-period FCP — a period for last comments before action is taken and removed proposed-final-comment-period An FCP proposal has started, but not yet signed off. labels Dec 5, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Dec 5, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

jsgf and others added 4 commits December 6, 2019 00:53
Implement --explicit-version from standalone cargo-vendor. This helps with
vendoring performance as it avoids redundantly deleting and re-copying
already vendored packages.

For example, when re-vendoring cargo's dependencies it makes a big
improvement on wallclock time. For initial vendoring it makes no
difference, but re-vendoring (ie, when most or all dependencies haven't
changed) without explicit versions is actually slightly slower (5.8s ->
6s), but with explicit versions it goes from 5.8s -> 1.6s.

Timings:

Without explicit versions, initial vendor
real	0m5.810s
user	0m0.924s
sys	0m2.491s

Re-vendor:
real	0m6.083s
user	0m0.937s
sys	0m2.654s

With explicit versions, initial vendor:
real	0m5.810s
user	0m0.937s
sys	0m2.461s

Re-vendor:
real	0m1.567s
user	0m0.578s
sys	0m0.967s

The summaries of syscalls executed shows why:

Revendoring without explicit versions:
```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.17    1.104699          18     59432      1065 openat
 19.86    0.871574          21     41156     13825 unlink
 13.64    0.598739           2    210510           lstat
  9.02    0.395948          29     13208           copy_file_range
  8.00    0.351242          11     30245           read
  6.36    0.279005           3     72487      4476 statx
  5.35    0.235027           6     37219           write
  4.02    0.176267           3     58368           close
```
with explicit versions:
```
 29.38    0.419068          15     27798     13825 unlink
 25.52    0.364021           1    209586           lstat
 20.67    0.294788          16     17967      1032 openat
 10.42    0.148586           4     35646           write
  3.53    0.050350           3     13825           chmod
  3.14    0.044786           2     16701      1622 statx
  2.19    0.031171           1     16936           close
  1.86    0.026538          24      1078           rmdir
```

Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.
@jsgf jsgf force-pushed the explicit-version branch from 5ac129d to fd80795 Compare December 6, 2019 08:53
@jsgf
Copy link
Contributor Author

jsgf commented Dec 10, 2019

@joshtriplett @ehuss Documentation updated, and I think all comments addressed.

@jsgf jsgf requested a review from joshtriplett December 10, 2019 02:55
@rfcbot
Copy link
Collaborator

rfcbot commented Dec 15, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added finished-final-comment-period FCP complete and removed final-comment-period FCP — a period for last comments before action is taken labels Dec 15, 2019
@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Dec 16, 2019

📌 Commit 2214cf1 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2019
@bors
Copy link
Contributor

bors commented Dec 16, 2019

⌛ Testing commit 2214cf1 with merge 414ec70...

bors added a commit that referenced this pull request Dec 16, 2019
vendor: implement --versioned-dirs

Implement `--explicit-version` from standalone cargo-vendor. This helps with vendoring
performance as it avoids redundantly deleting and re-copying already vendored packages.

For example, re-vendoring cargo's dependencies it makes a big difference in wallclock
time. For initial vendoring it makes no difference, but re-vendoring (ie, when most or all dependencies haven't changed) without explicit versions is actually slightly slower
(5.8s -> 6s), but with explicit versions it goes from 5.8s -> 1.6s.

Timings:

Without explicit versions, initial vendor
real	0m5.810s
user	0m0.924s
sys	0m2.491s

Re-vendor:
real	0m6.083s
user	0m0.937s
sys	0m2.654s

With explicit versions, initial vendor:
real	0m5.810s
user	0m0.937s
sys	0m2.461s

Re-vendor:
real	0m1.567s
user	0m0.578s
sys	0m0.967s

Its interesting to look at the syscall summary:

Without explicit versions:
```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.17    1.104699          18     59432      1065 openat
 19.86    0.871574          21     41156     13825 unlink
 13.64    0.598739           2    210510           lstat
  9.02    0.395948          29     13208           copy_file_range
  8.00    0.351242          11     30245           read
  6.36    0.279005           3     72487      4476 statx
  5.35    0.235027           6     37219           write
  4.02    0.176267           3     58368           close
```
with explicit versions:
```
 29.38    0.419068          15     27798     13825 unlink
 25.52    0.364021           1    209586           lstat
 20.67    0.294788          16     17967      1032 openat
 10.42    0.148586           4     35646           write
  3.53    0.050350           3     13825           chmod
  3.14    0.044786           2     16701      1622 statx
  2.19    0.031171           1     16936           close
  1.86    0.026538          24      1078           rmdir
```

Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.
@bors
Copy link
Contributor

bors commented Dec 16, 2019

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing 414ec70 to master...

@bors bors merged commit 2214cf1 into rust-lang:master Dec 16, 2019
@jsgf jsgf deleted the explicit-version branch December 16, 2019 21:03
@jsgf jsgf restored the explicit-version branch December 16, 2019 21:04
@ehuss ehuss added this to the 1.42.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge FCP with intent to merge finished-final-comment-period FCP complete S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-cargo Team: Cargo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants