Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vec::into_boxed_slice assumes len() == capacity() #29847

Closed
Swatinem opened this issue Nov 15, 2015 · 12 comments
Closed

Vec::into_boxed_slice assumes len() == capacity() #29847

Swatinem opened this issue Nov 15, 2015 · 12 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Swatinem
Copy link
Contributor

Vec::into_boxed_slice calls shrink_to_fit internally.
That method explicitly states in the docs that it may leave some excess capacity, which the current implementation however does not.

I have a PR coming soon that will actually make use of alloc::usable_size to actually use excess capacity.
But that fails, because into_boxed_slice calls out to RawVec::into_box which actually assumes len() == capacity() and thus leads to segfaults.

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

This is correct behaviour as far as I'm concerned. It's relying on implementation details that we aren't externally promising.

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

I think the correct resolution to this issue is to guarantee that when using the global allocator, shrink_to_fit and with_capacity are precise for Vec. This is significantly more useful to users of Vec than acquiring whatever loose capacity (particularly since they usually don't need or want it when using these interfaces).

@Gankra Gankra added I-nominated T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 15, 2015
@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

I'm nominating this issue for discussion at the next libs meeting. I think we should hammer out our allocator vagueness story. In particular, our docs are being highly conservative by claiming that Vec::with_capacity, Vec::shrink_to_fit, and Vec::reserve_exact may produce a larger capacity than requested. To my knowledge, this is in anticipation of:

  • interesting optimizations
  • local allocators

The original reservations here were, I believe, established by strcat long long ago, before I even started working on the project. strcat obviously understands allocators a lot, so I'm assuming there was some important insight here, and not just a vague reservation. That said I might be mistaken. If anyone knows the background here, I'd really appreciate chiming in.

CC @rust-lang/libs @pcwalton @pnkfelix

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

I'm also curious if people consider it more important for Vec to behave in a concrete way (perfect capacity) or use as much bonus space as possible. Personally I think the former is more valuable. Particularly since determining the bonus capacity isn't a trivial operation (so this will bloat up all calls to grow a Vec).

@Swatinem
Copy link
Contributor Author

Just for reference, mozillas nsTArray rounds to the new power-of-two for small and the next MiB for large allocations: https://dxr.mozilla.org/mozilla-central/source/xpcom/glue/nsTArray-inl.h#148-166

@eddyb
Copy link
Member

eddyb commented Nov 15, 2015

Sounds like the bug here is that RawVec::into_box doesn't take the length.
AFAIK it's an entirely unstable and mostly internal API, so the fix should be easy.

@bluss
Copy link
Member

bluss commented Nov 15, 2015

@gankro I'm sure strcat was accounting for usable_size as well, it's certainly been mentioned before

@Swatinem
Copy link
Contributor Author

There is also the counterpart to RawVec::into_box to consider here: Box<[T]>::into_vec. Would it be safe to just use the same usable_size calculation when converting back from Box<[T]> to Vec?

Also, alloc::deallocate takes the old_size of the allocation as well, but it explicitly states:

The old_size parameter may be any value in range_inclusive(requested_size, usable_size).

@briansmith
Copy link
Contributor

I'm also curious if people consider it more important for Vec to behave in a concrete way (perfect capacity) or use as much bonus space as possible.

I vote for "bonus space". Otherwise, it would be necessary to provide an API to calculate a capacity that doesn't waste the "bonus space." This eliminates unnecessary reallocations. I see no concrete advantage to having capacity() return exactly what was requested.

@ranma42
Copy link
Contributor

ranma42 commented Nov 19, 2015

I assume that the documentation for deallocate is based on the one from jemalloc. We might want to make the requirements somewhat different (for example, when deallocating we might require the same size used when allocating) in order to also accomodate other allocators.

I think a good compromise might be to have with_capacity, reserve_exact and shrink_to_fit use "exact" amounts of memory, while the other allocation methods would use the usual allocation policy+excess capacity. shrink_to_fit would then guarantee len() == capacity() upon return; the other two methods would guarantee that the capacity is exactly as requested. I think these methods should be sufficient for most use cases where accurate control over the size of the allocation is needed.

@Gankra
Copy link
Contributor

Gankra commented Nov 19, 2015

I've created #29931 to discuss the broader problem of whether we should take advantage of the slack space or not. This issue now just tracks the fact that we're relying on implementation details in a fragile way.

(discussion was getting blended up too much over the two issues)

@alexcrichton
Copy link
Member

I'm actually gonna go ahead and close this in favor of #29931 for now as this is fine as an implementation detail today and it'll basically just maybe come up depending on how #29931 is resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants