Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segment padding uninitialized memory problems #1032

Closed
divergentdave opened this issue Apr 13, 2020 · 0 comments · Fixed by #1040
Closed

Segment padding uninitialized memory problems #1032

divergentdave opened this issue Apr 13, 2020 · 0 comments · Fixed by #1040

Comments

@divergentdave
Copy link
Collaborator

This is a follow-up from #937 (comment). To recapitulate, IoBufs::write_to_log() does no padding if there is less than 32 bytes available at the end of a segment, and then uses the full length of the segment when writing, which can result in a read from uninitialized memory.

In trying to write a patch for this, I ran into further conundrums. While I haven't tried to reproduce this yet, it seems that IoBufs::write_to_log() might not pad the entirety of the segment when it does pad. First, we have:

let pad_len = capacity - bytes_to_write - MAX_MSG_HEADER_LEN;
...
let padding_bytes = vec![MessageKind::Corrupted.into(); pad_len];

The Cap message header is written from offset bytes_to_write to bytes_to_write + header_bytes.len(). Then, the zeros are written from offset bytes_to_write + header_bytes.len() to bytes_to_write + header_bytes.len() + pad_len. Cancelling terms, the range of bytes that is written ends at offset capacity + header_bytes.len() - MAX_MSG_HEADER_LEN. We know that header_bytes.len() will be less than MAX_MSG_HEADER_LEN, because the len field will have a zero in its top byte, thus there will be uninitialized bytes at the end of the buffer. Moving through the function, if should_pad is true, then maxed was true, and thus total_len is set to capacity, and the uninitialized bytes will be read from and written to disk.

Correcting this issue looks like it would require solving a Catch-22, because making the Cap message extend to the end of the segment would require knowing the length of the Cap message's header, but that would in turn require knowing how long we wanted the message to be, etc. Perhaps this could be solved by using capacity + header_bytes.len() - MAX_MSG_HEADER_LEN as the message's length in its header, checksumming that many bytes in the message, but then zeroing out more bytes until the end of the segment.

A parallel concern is whether the read paths properly handle the red zone of each segment. I noticed that LogIter checks if cur_lsn is within MAX_MSG_HEADER_LEN of the end of a segment, and moves to the next segment. It appears that IoBufs::start(), one of the few other callers of read_message(), does not check for this case. Is such logic required here?

At this point, I'm feeling a little lost in the weeds, would appreciate input on how to tighten up segment writing/reading here.

divergentdave added a commit to divergentdave/sled that referenced this issue Apr 21, 2020
This writes bytes past the end of the last message in a segment (Cap or
otherwise) to fully initialize the buffer. Initializing these bytes
avoids UB from passing a partially initialized &[u8] to pwrite_all().
Fixes spacejam#1032.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant