Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for lengths in ULE and revise safety docs #1121

Merged
merged 5 commits into from
Oct 4, 2021

Conversation

sffc
Copy link
Member

@sffc sffc commented Sep 28, 2021

I realized that with the new byte equality (#1103), we can no longer ignore extra bytes at the end of a byte slice. I also revised the safety docs to make them more actionable by adding a checklist.

@sffc sffc requested a review from a team as a code owner September 28, 2021 07:29
utils/zerovec/src/ule/mod.rs Outdated Show resolved Hide resolved
zbraniecki
zbraniecki previously approved these changes Sep 28, 2021
Self::validate_byte_slice(bytes).map_err(|e| UleError::InvalidBytes(e))?;
if bytes.len() % mem::size_of::<Self>() != 0 {
return Err(UleError::InvalidLength);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (n-b): In my patches I put this test in validate_byte_slice. This addition means that you don't want the check of length being divisible by size be in validate_byte_slice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would imagine that this is the job of validate_byte_slice. We could debug_assert here, but I think it's better if the validation is in one place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easy to check the invariant in the default impl, and relying on validate_byte_slice is prone to error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a separate error type with the check in parse_byte_slice because:

  1. It's not possible to verify that validate_byte_slice performs this check (debug_assert only works if there is a code path)
  2. We can perform the check automatically, so it seems like we should
  3. It's easy to miss the check in code review

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but "it's not possible to verify X" is the standard state of all unsafe code. My experience with Rust unsafe is that it's far better to consolidate the unsafe requirements in one well-documented place. This is basically splitting out the responsibilities and making it trickier to reason about: now validate_byte_slice is in charge of validating the slice only in some ways.

"We can perform this check so we should" is why I'm suggesting debug asserts, we should absolutely be sprinkling the code with more of these.

I think it's table stakes for Rust unsafe code review for folks to be verifying that the function is defined according to the checklist, if reviewers aren't doing that that's a much bigger problem.

Perhaps we should more strongly require comments that ensure the checklists are satisfied. We require safety explainer comments, but in the case of unsafe traits, we could require checklists and require that these checklists be proven for each impl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'm convinced. I changed it to a bullet point in the checklist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that there is another problem here. What error type should PlainOldULE and and CharULE return when the lengths are not correct? In my previous draft, I had them returning UleError (as the default), but now that I've gotten rid of that, I don't know what error they should return. In an effort to not block this otherwise productive PR with documentation changes, I filed a follow-up issue for this: #1144

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new safety docs, but I'm not convinced about moving the length check outside of validate_byte_slice(). Can we document it more strongly that it needs to be a part of validation?

utils/zerovec/src/ule/mod.rs Outdated Show resolved Hide resolved
utils/zerovec/src/ule/mod.rs Outdated Show resolved Hide resolved
Self::validate_byte_slice(bytes).map_err(|e| UleError::InvalidBytes(e))?;
if bytes.len() % mem::size_of::<Self>() != 0 {
return Err(UleError::InvalidLength);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would imagine that this is the job of validate_byte_slice. We could debug_assert here, but I think it's better if the validation is in one place.

utils/zerovec/src/ule/mod.rs Outdated Show resolved Hide resolved
Check for lengths in ULE and revise safety docs
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • Cargo.lock is different
  • utils/zerovec/src/ule/mod.rs is different
  • utils/zerovec/src/ule/plain.rs is no longer changed in the branch
  • utils/zerovec/src/ule/string.rs is no longer changed in the branch
  • utils/zerovec/src/ule/vec.rs is no longer changed in the branch
  • utils/zerovec/src/varzerovec/mod.rs is no longer changed in the branch
  • utils/zerovec/src/zerovec/mod.rs is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@sffc sffc requested a review from Manishearth October 3, 2021 06:54
@sffc
Copy link
Member Author

sffc commented Oct 3, 2021

I believe I addressed everyone's feedback. I changed back to a single error type. PTAL and feel free to merge.

@sffc sffc mentioned this pull request Oct 4, 2021
@Manishearth Manishearth merged commit 2528fe1 into unicode-org:main Oct 4, 2021
robertbastian pushed a commit to robertbastian/icu4x that referenced this pull request Oct 4, 2021
@sffc sffc deleted the zv-length branch January 11, 2022 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants