Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Wording regarding guarantees about Vec's unused capacity is unclear #37746

Closed
antrik opened this issue Nov 13, 2016 · 4 comments
Closed
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@antrik
Copy link
Contributor

antrik commented Nov 13, 2016

The "Guarantees" section in the documentation for Vec has some unclear wording, especially in the second-to-last paragraph:

Vec will not specifically overwrite any data that is removed from it, but also won't specifically preserve it. Its uninitialized memory is scratch space that it may use however it wants. It will generally just do whatever is most efficient or otherwise easy to implement.

In my reading, that means we are explicitly not allowed to store anything (e.g. via FFI calls) in the Vec's unused (but allocated) capacity, beyond its current valid length (size) -- but some others are disagreeing with my interpretation.

Specifically, the question is whether doing something like this is invalid:

let buf_size: usize = 666;
unsafe {
    let buf: Vec<u8> = Vec::with_capacity(buf_size);
    let data_size = libc::read(fd, buf.as_mut_ptr() as *mut c_void, buf_size as size_t) as usize;
    assert(data_size >= 0);
    buf.set_len(data_size);
}

As opposed to this variant, which should be valid even in my reading:

let buf_size: usize = 666;
unsafe {
    let buf: Vec<u8> = Vec::with_capacity(buf_size);
    buf.set_len(buf_size);
    let data_size = libc::read(fd, buf.as_mut_ptr() as *mut c_void, buf_size as size_t) as usize;
    assert(data_size >= 0);
    buf.set_len(data_size);
}
@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Nov 13, 2016
@sfackler
Copy link
Member

That approach is fairly pervasive in unsafe code, so it seems like something we'd realistically never be able to break. I'm not totally sure what we would end up using that space for, either. I think we can minimally ensure that this exact approach is allowed - stating that writing to unused capacity and then adjusting the length is valid.

@sfackler
Copy link
Member

cc @rust-lang/libs

@alexcrichton
Copy link
Member

@sfackler what you mentioned sounds good to me, seems nice that we shouldn't loosen the restrictions here too much but it's also the case that we'd never break this in any case.

@antrik
Copy link
Contributor Author

antrik commented Nov 16, 2016

@sfackler well, I can come up with various hypothetical uses -- such as temporary storage during operations moving parts of the Vec<> for example.

The ideas admittedly are becoming more contrived if only looking at reserve() and set_len(), without any other vector operations happening in between... Debugging or other bookkeeping structures could be affected even here; but seem unlikely candidates, as the spare capacity can essentially go down to 0 at any time, which would be rather unhelpful for any kind of bookkeeping for the Vec<> itself I guess. However, I could still imagine guard patterns for detecting buffers overflows for example. (This doesn't seem terribly useful for purely safe Rust -- but it could actually be quite useful precisely when FFI or other unsafe uses come into play...)

Thinking further, my understanding is that the reserved capacity doesn't really guarantee the specified amount of memory to actually be usable; but rather only that it can be made available during vector growth without moving. (I.e. essentially just reserving part of the address space.) That doesn't really mean anything for the current implementation, working on top of malloc() -- but a different implementation more closely integrated with the system allocator might very well use the reserved yet unused capacity for different things, such as free space tracking for example -- or possibly not even map pages into that part of the address space at all until the Vec<> actually grows...

So what I'm saying is that giving any guarantees about availability of the spare capacity definitely does constrain the design space -- though whether that's considered a real concern is a different question of course :-)

@steveklabnik steveklabnik removed the A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools label Mar 10, 2017
@steveklabnik steveklabnik added A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools and removed A-docs labels Mar 23, 2017
steveklabnik added a commit to steveklabnik/rust that referenced this issue Apr 25, 2017
arielb1 pushed a commit to arielb1/rust that referenced this issue Apr 25, 2017
Fix up vec guarnatee around capacity

Fixes rust-lang#37746

r? @rust-lang/libs
frewsxcv added a commit to frewsxcv/rust that referenced this issue Apr 26, 2017
Fix up vec guarnatee around capacity

Fixes rust-lang#37746

r? @rust-lang/libs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants