Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement CompactString::remove() #191

Merged
merged 2 commits into from
Jul 26, 2022
Merged

feat: Implement CompactString::remove() #191

merged 2 commits into from
Jul 26, 2022

Conversation

ParkMyCar
Copy link
Owner

@ParkMyCar ParkMyCar commented Jul 25, 2022

  • add unit tests
  • add doc tests
  • add prop tests
  • add to fuzzing harness

Fixes #141

#[inline]
pub fn remove(&mut self, idx: usize) -> char {
// get the char
let ch = match self[idx..].chars().next() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use char_indices() instead of chars() to omit the len_utf8() call.

Copy link
Contributor

@Kijewski Kijewski Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use let s = self.as_mut_str(); and operate on s throughout the function, so the pointers don't have to be calculated multiple times, and miri understands what you are doing.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how char_indices() would help here? Wouldn't that give us the index of the char we want to remove, which should be the same as the idx argument?

Is there a method we can use with &mut str to do the copying? Maybe you meant let ptr = self.as_mut_ptr()? This prevents the need to calculate the pointer, and fixes Miri :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like that:

pub fn remove(&mut self, idx: usize) -> char {
    let s = self.as_mut_str();
    let old_len = s.len();

    let substr = &mut s[idx..];
    let (ch_len, ch) = substr.char_indices()
        .next()
        .expect("cannot remove a char from the end of a string");

    // shift everything back one character
    unsafe {
        core::ptr::copy(
            substr.as_ptr().add(ch_len),
            substr.as_mut_ptr(),
            old_len - (idx + ch_len),
        );
        self.set_len(old_len - ch_len);
    }
    ch
}

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see, thanks for providing the example :)

Here ch_len isn't the length of the char returned, I believe it would actually always be 0 because it's the first character of substr.

We run Miri with -Zmiri-strict-provenance and getting two pointers at once seems to violate the stacked borrow rules. Although I like your idea of operating on substr, I refactored the impl to work similar to what you've described.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're absolutely right. I confused myself how char_indices() works. :)

@ParkMyCar ParkMyCar force-pushed the feat/remove branch 2 times, most recently from a1ee196 to cd875af Compare July 26, 2022 15:38
* add unit tests
* add doc tests
* add prop tests
* add to fuzzing harness
@@ -204,6 +206,19 @@ impl Action<'_> {
assert_eq!(control.as_str(), compact.as_str());
assert_eq!(compact.capacity(), compact_capacity);
}
Remove(val) => {
let idx = to_index(control, val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_index() can also return s.len(). This should have caused some crashes in the fuzzer.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out! It looks like as it exists today to_index() calls .cycle() before it .chain([s.len()]), which was causing s.len() to never get emitted, since .cycle() never ends. I updated to_index() to fix this and added tests to make sure it's doing what we want

@ParkMyCar ParkMyCar merged commit cfba5be into main Jul 26, 2022
@ParkMyCar ParkMyCar deleted the feat/remove branch July 26, 2022 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Implement the remove(...) API
2 participants