Skip to content

Use less CString in the examples of CStr. #136187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 26, 2025
Merged

Conversation

hkBst
Copy link
Member

@hkBst hkBst commented Jan 28, 2025

Fixes #83999

@rustbot
Copy link
Collaborator

rustbot commented Jan 28, 2025

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 28, 2025
@hkBst hkBst changed the title test Do not use CString in the examples of CStr. Jan 28, 2025
@rust-log-analyzer

This comment has been minimized.

@hkBst hkBst marked this pull request as ready for review January 28, 2025 18:08
@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two minor changes that are fine but the gist of the issue was simply incorrect when it comes to most cases.

/// // Do not do this:
/// let ptr = CString::new("Hello").expect("CString::new failed").as_ptr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we called as_ptr() on CString, but it autoderefs to CStr when looking up where to call. The current example does in fact illustrate usage of CStr::as_ptr, and the issue was mistaken. The review from #84000 seems to apply in full to this example.

Comment on lines +58 to +66
/// use std::ffi::CStr;
/// use std::os::raw::c_char;
///
/// fn work(data: &CStr) {
/// # /* Extern functions are awkward in doc comments - fake it instead
/// extern "C" { fn work_with(data: *const c_char); }
/// # */ unsafe extern "C" fn work_with(s: *const c_char) {}
///
/// unsafe extern "C" fn work_with(s: *const c_char) {}
/// unsafe { work_with(data.as_ptr()) }
/// }
///
/// let s = CString::new("data data data data").expect("CString::new failed");
/// let s = c"Hello world!";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as is this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're here, do you have any idea about "/* Extern functions are awkward in doc comments - fake it instead"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a comment about how linking functions gets weird in rustdoc tests.

Comment on lines 479 to 499
/// let c_str = std::ffi::CStr::from_bytes_with_nul(
/// &"Hello world!"
/// .to_uppercase()
/// .chars()
/// .map(|c| c.try_into().unwrap())
/// .chain(std::iter::once(0))
/// .collect::<Vec<u8>>(),
/// )
/// .unwrap();
/// // above CStr is now bound and will not be dropped immediately
/// let ptr = c_str.as_ptr();
///
/// assert_eq!(
/// (0..)
/// .map_while(|i| unsafe {
/// let p = ptr.add(i);
/// (*p != 0).then_some(*p as u8 as char)
/// })
/// .collect::<String>(),
/// "Hello world!".to_uppercase()
/// );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as #84000 said, the code pattern we want to warn against is specifically given and replacing the example with a more contrived one does not improve it. I don't think we want to accept this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CString::new("Hello").expect("CString::new failed") is also contrived, since there is no point in doing that with a literal string when you can just use c"Hello", IIANM, but perhaps this tries too hard to avoid using CString...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like this:

use std::ffi::{CStr, CString};

fn main() {
    // creates a dangling pointer to a CString temporary!
    let ptr = CString::new("Hi!".to_uppercase()).unwrap().as_ptr();

    // fails, showing ptr is now garbage
    assert_eq!(unsafe { CStr::from_ptr(ptr) }, c"HI!");
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sometimes it won't actually show garbage depending on the optimization level

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's true. And since this is UB anything might happen really. Current proposal says this:

    /// let ptr = CString::new("Hi!".to_uppercase()).unwrap().as_ptr();
    /// // temporary CString is dropped here...
    ///
    /// // because of UB, anything might happen...,
    /// // but most likely this fails, showing ptr is now garbage,
    /// assert_eq!(unsafe { CStr::from_ptr(ptr) }, c"HI!");

but I can add a note about "mostly depending on optimizations" or some such...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was sort of hoping you'd ask at some point what changes would be desirable? It's awkward for me to just ask that question and answer it on my own, after all.

I guess I was not asking exactly that, because I thought I had a pretty good idea of what was desirable here, although I did try to recognize flaws "perhaps this tries too hard to avoid using CString..." and ask for clarification both based on a suggestion "How about something like this: ..." and open-ended "I'm open to suggestions". While not exactly the same, I'm not quite sure why you did not feel that this provided you the same opening.

But most of all I would rather things not feel like an argument where it doesn't feel like they need to, because I feel like things do actually need to become an argument way too often already. I'm certainly not a stranger to them, but an argument is still an enormously costly event for everyone involved.

I completely agree. I'd much rather have a discussion than an argument, which is what I thought we were having (a discussion), until it seemed like you thought you were in an argument. I'm really not sure what happened there, and I'd really like to know if you care to explain. But I'm also happy to wipe the slate; to forgive and forget. Anything to get back to a productive discussion, because the cost of an argument isn't worth it.

Of course, clicking the "Close pull request" button is a fairly far step for many reviewers. A lot would rather simply leave you there, languishing in their queue forever. It's not because they enjoy you twisting in the wind, but it can take a lot of energy to give you the decency of a "No", especially if they feel they have to provide a reason for it and want to, say, try to avoid hurting anyone's feelings unnecessarily?

I'm certainly aware that it can be difficult to say "no" while avoiding hurt feelings, and that this can complicate the job of reviewing PRs (to say the least). I certainly have some other open PRs where it is difficult to find the path of progress, although in all cases so far, I've found that I've been able to make improvements and learn something, even if the "no" seemed wrong at first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my current thinking is that instead of an assert a dbg or println would have the same benefit, without the downside of being wrong. Please let me know what you think, even if I'm on the wrong track entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be fine, actually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yeah, I'm willing to write this off as a brief confusion, either way, I just had to express this was all a bit perplexing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perplexing on both sides :D, but I'm happy we're making progress again.

I've pushed a new version without assert that would fail, instead now using dbg (since CStr doesn't impl Display anyway). I've also beefed up the explanation/warning of UB. I hope you like it, but if you still think the original is better, that's fine too.

@hkBst hkBst requested a review from workingjubilee February 5, 2025 13:29
@hkBst hkBst changed the title Do not use CString in the examples of CStr. Use less CString in the examples of CStr. Feb 5, 2025
Comment on lines 452 to 453
/// // nothing about its behavior is guaranteed, although,
/// // it will likely resemble the code as written
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only part I still have issue with: the main problem with UB is that people misunderstand this detail. The compiler is given large latitude in terms of what it decides a program means when undefined behavior is involved, so it's more of a "even though"... especially "even if you get a certain described result in one example, you can't rely on this consistently operating a certain way when combined with any other code".

A somewhat infamous miscompilation result that can happen is that by reaching UB, LLVM may notice and delete code, including the parts that would initiate a transfer of control... so the result is fall-through to whatever machine code was ordered next in the binary!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I suppose I should ask, what did you have in mind with this phrasing change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rereading my old version it did not seem to make it clear enough that the UB could affect the whole program, instead of just the line that tries to print the dangling CStr. That's why I moved the comment about UB making it do anything from the printing line to the top. Together with:

    /// // most likely the program behaved as expected so far
    /// // and this just shows `ptr` is now garbage
    /// dbg!(unsafe { CStr::from_ptr(ptr) });

it is now clearer that the whole program may be affected.

So it looks like we are on the same page again (namely people misunderstanding the possible impact of UB, and I was one of them when I wrote the previous text), but from your question I must conclude that the writing is not as clear as I had hoped. Any ideas for how to make it better?

Copy link
Member

@workingjubilee workingjubilee Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is linted on right now: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=89d0c8ea75267098b5600d7b69376b0e

warning: a dangling pointer will be produced because the temporary `CString` will be dropped
 --> src/main.rs:5:55
  |
5 | let ptr = CString::new("Hi!".to_uppercase()).unwrap().as_ptr();
  |           ------------------------------------------- ^^^^^^ this pointer will immediately be invalid
  |           |
  |           this `CString` is deallocated at the end of the statement, bind it to a variable to extend its lifetime
  |
  = note: pointers do not have a lifetime; when calling `as_ptr` the `CString` will be deallocated at the end of the statement because nothing is referencing it as far as the type system is concerned
  = help: you must make sure that the variable you bind the `CString` to lives at least as long as the pointer returned by the call to `as_ptr`
  = help: in particular, if this pointer is returned from the current function, binding the `CString` inside the function will not suffice
  = help: for more information, see <https://doc.rust-lang.org/reference/destructors.html>
  = note: `#[warn(dangling_pointers_from_temporaries)]` on by default

Maybe it is more helpful to reference the lint and explain it further?

Otherwise, to rephrase the current lines, I would consider saying something like, "this entire program experiences UB, even if it behaved as expected up to this point. the following may show "HI!", but usually it's junk instead".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general and I would stress "even if" instead of "although"... it's always a "yes and" with UB. "yes it can actually perform correctly right now, at this point, and then infest your computer with malware later in the program's runtime, even though the UB is here".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another attempt, with skull emojis... Apologies if this is too much...

Copy link
Member

@workingjubilee workingjubilee Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think just the prose as-is good now, but I'm happy with the emojis if you like them. If someone finds them intolerable I'm sure we'll hear about it. :^)

Copy link
Member Author

@hkBst hkBst Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it is too much, but I'll leave it like this for now.

@workingjubilee
Copy link
Member

@bors delegate+

@bors
Copy link
Collaborator

bors commented Feb 22, 2025

✌️ @hkBst, you can now approve this pull request!

If @workingjubilee told you to "r=me" after making some further change, please make that change, then do @bors r=@workingjubilee

@hkBst
Copy link
Member Author

hkBst commented Feb 25, 2025

@bors r=@workingjubilee

@bors
Copy link
Collaborator

bors commented Feb 25, 2025

📌 Commit fc02cfd has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 25, 2025
@bors bors merged commit f3a445b into rust-lang:master Feb 26, 2025
6 checks passed
@rustbot rustbot added this to the 1.87.0 milestone Feb 26, 2025
@hkBst hkBst deleted the patch-27 branch February 27, 2025 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docs for CStr.as_ptr() are describing CString
6 participants