RFC: Rename *T to *const T #68

Merged (6 commits) on Jun 24, 2014

alexcrichton
Member

No description provided.

@alexcrichton
Member Author

cc rust-lang/rust#7362

@vadimcn
Contributor

vadimcn commented May 6, 2014

-1: I think that internal language consistency is more important than compatibility with C.
If people are misusing &*, maybe a lint pass is in order?

@dobkeratops

is this helpful (or even essential?) for aliasing rules? i think i was told you don't need 'restrict' because safe code does that job (non-aliased mutable pointers).

i think it would be confusing to flip mut->const, but then again, unsafe code is going to require special attention anyway. ( i had originally asked if you needed the opposite of 'restrict' or something like that, but this might be a solution )

@thestinger

@dobkeratops: This isn't related to restrict. It's true that &mut has the same implications as restrict, but it's not possible to use it in the same places as a raw pointer because it needs a valid lifetime and is non-nullable.

@dobkeratops

ok so either way the rule would be unsafe pointers are assumed to potentially alias , i guess (be it mut/non mut or non const/const).
I think I was told that you wouldn't need restrict because you could just return an &mut from unsafe (or make an &mut in unsafe?)

@thestinger

In order to return &mut T, you need a valid lifetime to tag it with and you need to know that it's non-null. It's not (currently) correct to assume that Option<&mut T> has the same ABI as *mut T, and even then you need a lifetime to tag it with (we don't have a raw / nil lifetime).
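
A minimal sketch of the two requirements named above (a lifetime to tag the reference with, and a null check), assuming a current Rust toolchain in which `*T` is spelled `*const T`. The helper name `as_opt_mut` is made up for illustration; `as_mut` is the standard raw-pointer method. The final assertion relies on the layout of `Option<&mut T>` as documented in later Rust releases, which postdates this thread.

```rust
use std::mem::size_of;

/// Turns a possibly-null raw pointer into an optional unique reference.
///
/// # Safety
/// `p` must be null, or point to a valid `i32` that nothing else accesses
/// for the caller-chosen lifetime `'a`.
unsafe fn as_opt_mut<'a>(p: *mut i32) -> Option<&'a mut i32> {
    // `as_mut` performs the null check; the lifetime is picked by the caller.
    unsafe { p.as_mut() }
}

fn main() {
    let mut x = 10;
    let p: *mut i32 = &mut x;
    if let Some(r) = unsafe { as_opt_mut(p) } {
        *r += 1;
    }
    assert_eq!(x, 11);

    // A null raw pointer has no `&mut` counterpart.
    assert!(unsafe { as_opt_mut(std::ptr::null_mut()) }.is_none());

    // On current toolchains the optional reference is pointer-sized.
    assert_eq!(size_of::<Option<&mut i32>>(), size_of::<*mut i32>());
}
```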

@huonw
Member

huonw commented May 6, 2014

I would prefer to have &T <-> *const T and &mut T <-> *mut T to make it absolutely crystal clear what corresponds to what (just dropping the *T type entirely, preferably with an error "you need to choose either *const T or *mut T").

> maybe a lint pass is in order?

I don't think it is possible to lint extern definitions in a sane way.

The type `&mut T` will automatically coerce to `*T` in the normal locations that
coercion occurs today. It will also be possible to explicitly cast with an `as`
expression. Additionally, the `&T` type will automatically coerce to `*const T`.
Note that `&mut T` will not automatically coerce to `*const T`.
Member

Is there some motivation for this? &mut T coerces to &T.

Member Author

Mostly just to remain the same as today's semantics with *T and *mut T. It doesn't seem like much could go wrong, but too many coercions may push us in the direction of implicit coercions between these two unsafe pointers, which I'd be a little worried about.

I don't have a huge preference one way or another on this (I'm fine amending it into the RFC)

I think &mut should coerce to *const, and both conversions should be considered to consume the &mut if they don't already.

Member Author

Consumption may not work as intended, I believe you can continue to create &mut T pointers via &mut *expr (reborrowing).

Member

> Consumption may not work as intended, I believe you can continue to create &mut T pointers via &mut *expr (reborrowing).

If so, it still requires explicit thought & some actual code, rather than it happening automatically/implicitly.
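
For reference, a small sketch of the coercions debated in this review thread, written against a current Rust toolchain rather than the 2014 compiler. As far as I know, present-day rustc accepts all three reference-to-raw-pointer directions shown here, and the coercion reborrows the `&mut` instead of consuming it. The function names `read`, `bump` and `demo` are placeholders.

```rust
/// # Safety
/// `p` must point to a valid, initialized `i32`.
unsafe fn read(p: *const i32) -> i32 {
    unsafe { *p }
}

/// # Safety
/// `p` must point to a valid `i32` that nothing else is accessing.
unsafe fn bump(p: *mut i32) {
    unsafe { *p += 1 }
}

fn demo() -> i32 {
    let mut x = 40;
    let r: &mut i32 = &mut x;
    unsafe {
        bump(r); // &mut i32 -> *mut i32, implicit
        bump(r); // still usable: the coercion reborrowed `r`, it did not consume it
        let via_mut = read(r); // &mut i32 -> *const i32
        let shared: &i32 = r;
        let via_shared = read(shared); // &i32 -> *const i32
        assert_eq!(via_mut, via_shared);
        via_shared
    }
}

fn main() {
    assert_eq!(demo(), 42);
}
```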

@telotortium

If it turns out that trying to reconcile Rust and C semantics on unsafe
pointers is too hard, perhaps you could add attributes to the pointer types to
indicate const-volatile qualifiers. For instance, the declaration of memcpy
could be:

#[no_mangle] extern fn memcpy(dst: #[cvqual(restrict)] *c_void,
                              src: #[cvqual(const, restrict)] *c_void,
                              n: size_t) -> *c_void;

Of course, the compiler would have to learn how to handle the cvqual
arguments (any of const, volatile, and restrict). Also, there's some
other issues:

  • The equivalent C declaration of memcpy is

    void * memcpy(void *restrict dst, const void *restrict src, size_t n);

    Why does restrict come after * here? Is there a semantic
    implication?

  • What safety in non-FFI code does *T give over *mut T? Would it be
    possible to get rid of *mut T altogether?
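
As a point of comparison, here is a hedged sketch of how the `const` and `restrict` parts of the `memcpy` signature map onto present-day Rust without any attribute syntax. It uses the standard `std::ptr::copy_nonoverlapping(src: *const T, dst: *mut T, count: usize)`; the wrapper name `copy_bytes` is hypothetical. On the first question above: in C, `restrict` follows the `*` because it qualifies the pointer itself rather than the pointed-to type.

```rust
use std::ptr;

/// Copies `src` into the front of `dst`, in the manner of C's `memcpy`.
fn copy_bytes(dst: &mut [u8], src: &[u8]) {
    assert!(dst.len() >= src.len(), "destination too small");
    // SAFETY: both slices are valid for `src.len()` bytes, and a `&mut [u8]`
    // cannot overlap a live `&[u8]`, which is the promise `restrict` makes in C.
    unsafe {
        // src is a *const u8 (read-only side), dst is a *mut u8 (written side).
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
}

fn main() {
    let mut dst = [0u8; 4];
    copy_bytes(&mut dst, &[1, 2, 3]);
    assert_eq!(dst, [1, 2, 3, 0]);
}
```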

@bharrisau

"you need to choose either *const T or *mut T"

This feels a better way to go than changing what *T currently means.

@anasazi

anasazi commented May 6, 2014

I think I prefer the *const T & *mut T idea.
The current situation (* T & *mut T) is confusing coming from C, but the proposed alternative (*const T & * T) may just move that confusion to those coming from Rust since it's flipped from all the other pointers.
Removing * T altogether and annotating both cases seems like the clearest option.

@SiegeLord

I don't see the evidence for the 'confusion' aspect. The vast majority of FFI code is (or should be) generated by rust-bindgen, which gets the *T/*mut T right out of the box. The fact that it isn't in Rust's bundled FFI files is a bug with the relevant libraries (the *T/*mut T misuse is merely a consequence of writing that code wrong... there may be many other mistakes that are not caught). In the future, we'll hopefully move to a procedural macro to bind C headers automatically, preventing any FFI code from being explicitly written. Overall, I think it's a solution for a problem (if it exists) that will soon be gone. I don't think FFI should be mentioned or considered in this proposal.

As for the rest of Rust, I think the symmetry with all the other references explains how it's meant to be used. Notably, there's no consideration for the confusion between C++ references and our borrowed pointers which also have a flipped mutability.

@huonw
Member

huonw commented May 6, 2014

@SiegeLord that approach is actually mentioned under "Alternatives". :)

@SiegeLord

I missed that, although I disagree that it's an 'alternative'. rust-bindgen and a procedural macro/compiled hook will be in wide use regardless of whether or not this RFC is accepted. Hand-writing FFI is error prone and fragile, and changing how the pointers look will not change that.

@bstrie
Contributor

bstrie commented May 6, 2014

Without commenting on the RFC directly, @SiegeLord , I don't know how realistic it is to expect that we will ever arrive at a point where tools will obviate the need to write FFI code. Even today, there are plenty of people willingly avoiding rust-bindgen in favor of rolling their own wrappers by hand.

If this RFC is accepted, count me among those who would prefer to have both *const T and *mut T for maximum explicitness.

@anasazi

anasazi commented May 6, 2014

@bstrie my thoughts exactly. It'd be great if we could automate it all away, but I think that's unlikely.

@thestinger

We do need to get to the point where hard-wiring an ABI by hand is not required. Libraries often change the ABI between versions and it's not uncommon for an ABI to vary between platforms or different configurations of the library. At the moment, these changes will lead to silent memory corruption issues or worse.

Since the remaining uses of raw pointers are native to Rust rather than for C wrappers, I think @SiegeLord's argument is a good one.

@SiegeLord

> Even today, there are plenty of people willingly avoiding rust-bindgen in favor of rolling their own wrappers by hand.

Assuming you've meant bindings (stuff that goes inside extern "C" blocks) and not wrappers (anything higher level which cannot be generated without reading the documentation), I'd like to know their reasons (aside from the clang dependency). While rust-bindgen output is not perfect (yet), it makes an excellent starting point for FFI creation. These days it's so good that the only things I change are formatting (because I don't yet generate bindings as part of the build process) and introducing size_t and c_bool. All those are due entirely to my laziness and not a fault of rust-bindgen. Either way, I specifically don't have to worry about getting *T vs *mut T right.

@huonw
Member

huonw commented May 7, 2014

BTW, rust-lang/rust#2124 is the issue for autogenerated bindings.

@bstrie
Contributor

bstrie commented May 7, 2014

@SiegeLord , I'm not the one you need to convince. :) I use bindgen, I love bindgen. It seems to be mostly graphics people who don't trust bindgen to not muck things up, but I'm not going to pretend that I have tons of evidence here.

@thestinger , I thought you were campaigning for this change a while ago, before the RFC process existed. Have you reversed your position? Because if you don't think it's an issue, then I'll trust your judgment.

@thestinger

@bstrie: I supported changing this before, but I'm less sure now. I don't really like the idea of making changes in unsafe Rust aimed at bringing it closer to C. I think most of the problem is that the existing FFI bindings in the Rust repository are quite ancient and spread bad habits. Since the rest of Rust uses immutable by default and opt-in mutability, I do think it would be better to stay consistent with raw pointers... but I don't know if we're going to be able to get people to stop abusing poor *T.

I think part of the problem is that &mut only coerces to *mut T, and functions not performing mutation are the most common. This leads people into using *T for a mutable pointer because most of the time it's not being mutated. I'd like to have *mut T -> *T coercion just as C++ does, and just as we have with &mut T -> &T. It's not dangerous because the aliasing guarantees on &mut T don't apply to *mut T.
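
A short sketch of the raw-pointer coercion asked for above, as it behaves on a current toolchain (where the read-only pointer is spelled `*const T`): a `*mut T` is accepted wherever a `*const T` is expected, while going back requires an explicit cast. The names `sum2` and `demo` are placeholders.

```rust
/// # Safety
/// `p` must point to at least two valid, initialized `i32` values.
unsafe fn sum2(p: *const i32) -> i32 {
    unsafe { *p + *p.add(1) }
}

fn demo() -> (i32, i32) {
    let mut data = [1, 2];
    let pm: *mut i32 = data.as_mut_ptr();
    let total = unsafe {
        *pm.add(1) = 41;
        let total = sum2(pm); // *mut i32 -> *const i32, implicit
        let pc: *const i32 = pm; // same coercion in a `let`
        let back = pc as *mut i32; // the reverse direction needs an explicit cast
        *back = 0;
        total
    };
    (total, data[0])
}

fn main() {
    assert_eq!(demo(), (42, 0));
}
```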

@bill-myers

What about replacing the *T syntax with Ptr<T> just like ~T is being replaced with Box<T>?

@thestinger

@bill-myers: What would be the name for the mutable one?

@nathanaschbacher

@bill-myers I actually sorta like that.

Maybe there's some consistency to be gained by following an axiom. Sorta the Smalltalk/Ruby way where "everything is an object". In Rust it could be "everything is an explicit named type" which is essentially true already, but just not expressed consistently. Some things are explicit like Rc<T> and Box<T> and other things are still sugar like *T

Part of me likes the sugar, because it's short-hand, but so much of the sugar has been removed/changed (@, ~, possibly now &mut) that my brain requires holding all the sugar plus all the things that don't have sugar and their more verbose implementations.

The most painful part of development for me is context switching. Following one convention, axiom consistently would make that easier and less all over the map like C++ is.

@thestinger, MutPtr<T> ?
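
To make the `Ptr<T>` / `MutPtr<T>` suggestion concrete, here is a rough sketch of what such library types could look like. Everything here is hypothetical: neither type exists in the standard library, the method set is invented, and `#[repr(transparent)]` (used so the wrapper has the same layout as the raw pointer) was not available when this thread was written.

```rust
use std::mem::size_of;

/// Hypothetical library-level spelling of a read-only raw pointer.
#[repr(transparent)]
pub struct Ptr<T>(*const T);

/// Hypothetical library-level spelling of a mutable raw pointer.
#[repr(transparent)]
pub struct MutPtr<T>(*mut T);

impl<T> Ptr<T> {
    pub fn new(raw: *const T) -> Self {
        Ptr(raw)
    }

    pub fn is_null(&self) -> bool {
        self.0.is_null()
    }

    /// # Safety
    /// The pointer must be non-null and point to a valid, initialized `T`.
    pub unsafe fn read(&self) -> T
    where
        T: Copy,
    {
        unsafe { *self.0 }
    }
}

impl<T> MutPtr<T> {
    pub fn new(raw: *mut T) -> Self {
        MutPtr(raw)
    }

    /// Drops write access, mirroring a mutable-to-read-only conversion.
    pub fn as_const(&self) -> Ptr<T> {
        Ptr(self.0 as *const T)
    }

    /// # Safety
    /// The pointer must be non-null, valid for writes, and not otherwise in use.
    pub unsafe fn write(&self, value: T) {
        unsafe { self.0.write(value) }
    }
}

fn main() {
    let mut x = 1;
    let m = MutPtr::new(&mut x as *mut i32);
    unsafe { m.write(7) };
    assert_eq!(unsafe { m.as_const().read() }, 7);
    // The wrapper adds no size over the raw pointer.
    assert_eq!(size_of::<Ptr<i32>>(), size_of::<*const i32>());
}
```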

@alexcrichton
Member Author

I have updated this with *mut and *const. The only change this RFC is now proposing is to rename *T to *const T

@bharrisau

How does this all relate to the issues around "&mut should be &only"? Are we really saying *const is constant? Or is this another unique pointer/aliasing thing? (I haven't bothered to understand it 100%, so sorry if I've written it in a confusing way)

@pnkfelix
Member

@bharrisau I would say that this is an aliasing thing, though maybe not in the way you expected.

We are not saying *const is constant. A const-qualified pointer in C is not necessarily constant: it merely cannot be assigned to, but it can still be aliased by mutable pointers, and thus is not constant in general.

See e.g. http://yarchive.net/comp/const.html

This is a crucial distinction between a const-qualified pointer T const* in C versus a &T in Rust: in Rust, the &T is not only unassignable, but you also are guaranteed that no other code has write-access to it (i.e. there are no writable aliases; only fellow readers).

To my mind, the advantage of using *const T in Rust is to make it clear that one needs to apply the rules for const-qualification from C when reasoning about the pointers.
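
The distinction can be sketched in a few lines of current Rust (the names `reader`, `writer` and `demo` are illustrative only): a `*const` pointer forbids writes through itself, yet the value it points at can still change underneath it via an aliasing `*mut` pointer.

```rust
fn demo() -> (i32, i32) {
    let mut cell = 1;
    let writer: *mut i32 = &mut cell;
    let reader: *const i32 = writer; // aliases the same location
    unsafe {
        let before = *reader;
        // *reader = 2; // rejected by the compiler: cannot assign through `*const`
        *writer = 2; // but the aliasing `*mut` may still write
        let after = *reader; // the "const" view observes the change
        (before, after)
    }
}

fn main() {
    assert_eq!(demo(), (1, 2));
}
```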

@LeoTestard
Contributor

I'm a bit worried about this RFC.
While I can see the benefits of having this, I don't think it's worth it. First this requires us to add a keyword, and complexity in the language, and breaks the coherence with the rest of the language, where we always supported implicit immutability.
The advantage of doing this is to get closer to the semantics of C. The semantics of C regarding mutability and aliasing are complex, and we will never get close enough to prevent all FFI bugs except by reimplementing a C compiler. Furthermore, I'm not sure compatibility with C is worth integrating some flaws of C into Rust, adding complexity and losing coherence.

@alexcrichton
Member Author

I am wary of dealing with Ptr<T> or some related structure because it looks very odd in FFI definitions. It would be one of the only pointer types in rust passed by value through a struct in FFI, which seems off. It is something, however, that we may be able to get used to in the future.

I currently only know of *restrict T which may need to be added (*volatile T isn't necessary I believe). I would expect that if we went with a wrapper struct of Ptr<T> then we would not go for a composition of Restrict<Unsafe<T>> but rather Restrict<T> on its own (or something like that).

You're right though, foo as &T as Unsafe<T> does seem quite odd, but we could possibly remove casts or add a to_ptr() method to &T so the as syntax wouldn't be necessary.

All-in-all, I personally feel like first-class language types are still necessary. These are the absolute core building blocks of all other primitives, which seems like they should have first-class support (even though they shouldn't be widely used).

@arcto

arcto commented Jun 6, 2014

It has to be more important that Rust is consistent than similar to C. The idea of being a systems programming language also means dealing with hardware by itself, not relying on a 40+ year old language. FFI to C is only one use case for raw pointers.

@darnuria

darnuria commented Jun 6, 2014

+1 @LeoTestard. In my humble experience of Rust, it's not a good idea to design Rust with C in mind. C is well known for its complexity...

Also the current syntax is really good for reading code, it's clear and efficient.

And to respond to @pnkfelix

> To my mind, the advantage of using *const T in Rust is to make it clear that one needs to apply the rules for const-qualification from C when reasoning about the pointers.

Yes but it add all of the overhead in C about const value, const pointer, const pointer of pointer...
Also, if we do this we will have to do the same for restrict pointers, volatile pointers, and so on.

@pnkfelix
Member

pnkfelix commented Jun 6, 2014

Sigh.

It's not out of love for C that I contend that *const T and *mut T is a better choice of syntax here. It's simply that I think some difference in syntax is warranted, given that these pointers obey a different set of rules than that of &T and &mut T.

For example, if I thought that *readonly T and *mut T would fly, then I would suggest that instead.

@darnuria I do not know what this:

> Yes but it add all of the overhead in C about const value, const pointer, const pointer of pointer...

is trying to say. Are you referring to syntactic overhead? Or are you talking about the fact that in C the declarations const int x and int const* x and int* const* x are distinct things? Can you provide a more specific example of what it is you fear, rather than a list of vague references to C keywords?

@darnuria

darnuria commented Jun 7, 2014

@pnkfelix: Sorry after reflection I was not clear.

> Yes but it add all of the overhead in C about const value, const pointer, const pointer of pointer...

By this statement I expressed the fact that C had a very different idea behind the const keyword.

// 1. Immutability of a value, as in:
const char ch = 'c';
// 2. Immutability of the data behind a pointer, as in:
const char *str = "Toto";
// 3. Or a pointer to a pointer to const char:
const char **ptr_to_str = &str;
// etc...

In this case 1. and 2. are OK because they are not difficult to reason about.

But in case 3. it is not clear how to reason about it...

Other point:
A totally separate concern is that the proportion of immutable code in Rust projects is very high (I don't have the percentage), which makes adding another keyword for it costly.

Disclaimer:
I speak as a Rust user, not as a language developer, so I may be missing some use cases and could be wrong. :)

The current difference in Rust unsafe pointers types with C pointers types is
proving to be too error prone to realistically enable these optimizations at a
future date. By renaming Rust's unsafe pointers to closely match their C
brethren, the likelihood for errneously transcribing a signature is diminished.

Typo: s/errneously/erroneously/

@pnkfelix
Member

pnkfelix commented Jun 7, 2014

@darnuria Your example translates directly to types one can describe already today in Rust, like so:

fn main() {
    #![allow(unused_mut)] // disable warnings to allow faithful transcription.
    let     ch         :       u8 = 'c' as u8;
    let mut str1       :      *u8 = "Toto".as_bytes().as_ptr();
    let mut ptr_to_str : *mut *u8 = &mut str1 as *mut _;

    unsafe {
        printout(str1, ptr_to_str);
        // **ptr_to_str = ch; // This would error with "cannot assign to ..."
        // *str1 = ch;        // and this would as well.
        let mut but = ['U' as u8, 'h' as u8, 'O' as u8, 'h' as u8];
        *ptr_to_str = but.as_ptr();
        printout(str1, ptr_to_str);
        but[0] = ch;        // Accepted; `*str1` changes (`str1` does not, of
        but[1] = '?' as u8; // course).`str1` is just like a `*const char` in C.
        printout(str1, ptr_to_str);
    }

    fn printout(str1 : *u8, ptr_to_str: *mut *u8) {
        unsafe {
            println!("Hello str1[0, 1] {} str1: {:014} ptr_to_str: {}",
                     [*str1 as char, *str1.offset(1) as char].as_slice(),
                     str1,
                     ptr_to_str);
        }
    }
}

What does this print? Depending on what addresses it allocates, it prints something like this:

% rustc /tmp/foo.rs && ./foo
Hello str1[0, 1] [T, o] str1: 0x00010bfde040 ptr_to_str: 0x7fff53cba590
Hello str1[0, 1] [U, h] str1: 0x7fff53cba564 ptr_to_str: 0x7fff53cba590
Hello str1[0, 1] [c, ?] str1: 0x7fff53cba564 ptr_to_str: 0x7fff53cba590

If you think it is hard to reason about the type of ptr_to_str in the C code, why do you think it is any easier to do so for Rust?

My claim is that reasoning about the two cases is the same. We already support the notion of a const pointer as provided by C/C++. The change being suggested in this RFC has nothing to do whether that notion is supported. It is just about the syntax used to provide it.
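
For anyone trying this on a current toolchain, the following is a hedged re-transcription of the program above using the `*const` spelling this RFC proposes. It is a sketch, not the original code: `first_two` is a made-up helper, and the buffer is written through a raw `*mut u8` (an adaptation, so that all accesses go through raw pointers) instead of through the array binding.

```rust
/// Reads the first two bytes behind `p` as chars.
///
/// # Safety
/// `p` must point to at least two valid, initialized bytes.
unsafe fn first_two(p: *const u8) -> [char; 2] {
    unsafe { [*p as char, *p.add(1) as char] }
}

fn main() {
    let ch: u8 = b'c';
    let mut str1: *const u8 = "Toto".as_ptr();
    let ptr_to_str: *mut *const u8 = &mut str1;

    let mut but = [b'U', b'h', b'O', b'h'];
    let but_ptr: *mut u8 = but.as_mut_ptr();

    unsafe {
        assert_eq!(first_two(*ptr_to_str), ['T', 'o']);
        // **ptr_to_str = ch; // still rejected: the pointee is behind `*const`
        *ptr_to_str = but_ptr as *const u8; // re-pointing `str1` is allowed
        assert_eq!(first_two(*ptr_to_str), ['U', 'h']);

        // Writing through the aliasing `*mut u8` changes what `str1` reads.
        *but_ptr = ch;
        *but_ptr.add(1) = b'?';
        assert_eq!(first_two(*ptr_to_str), ['c', '?']);
    }
}
```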

@glaebhoerl
Contributor

I would suggest that if the best option on the merits is not obvious, backwards compatibility can be a tie breaker. In that respect, Ptr/MutPtr/etc. are far less burdensome to commit to supporting forever than dedicated syntax would be. If we start off without dedicated syntax, and later on decide that we really want it after all, we can add it back at any point. The reverse is not true.

@alexcrichton changed the title from "RFC: Remove *mut T, add *const T" to "RFC: Rename *T to *const T" on Jun 10, 2014
@alexcrichton
Member Author

ping @nikomatsakis, I updated with our discussion on Friday about when it is safe to coerce. I also added a new unresolved question about applying temporary lifetimes to coercions.


When coercing from `&'a mut T` to `*mut T`, Rust will guarantee that the memory
will stay valid during `'a` and that the memory will *not be accessed* during
`'a`. Additionally, Rust will *consume* the `&'a mut T` during the coercion. It
Contributor

Caveat: I am not sure if we currently consume during such a coercion. we should test.

Member Author

It looks like we don't, this code is accepted today:

extern {
    fn bar(a: *mut int, b: *mut int);
}

unsafe fn foo(a: &mut int) {
    bar(a, a)
}
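
The same observation can be reproduced on a current toolchain with a small self-contained variant. Here `bar` is a stand-in defined in Rust rather than an `extern` function, and it only compares addresses, so no unsafe code is needed; `int` is replaced by `i32`.

```rust
/// Stand-in for the thread's `extern` function; it only compares addresses.
fn bar(a: *mut i32, b: *mut i32) -> bool {
    a == b
}

fn foo(a: &mut i32) -> bool {
    // Both arguments coerce from the same `&mut i32`, so the first
    // coercion cannot have consumed it.
    let same = bar(a, a);
    // `a` remains usable afterwards as well.
    *a += 1;
    same
}

fn main() {
    let mut v = 1;
    assert!(foo(&mut v));
    assert_eq!(v, 2);
}
```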

@brson merged commit 2d71209 into rust-lang:master on Jun 24, 2014
@brson
Contributor

brson commented Jun 24, 2014

Accepted as RFC 32.

bors added a commit to rust-lang/rust that referenced this pull request Jun 24, 2014
This does not yet change the compiler and libraries from `*T` to `*const T` as
it will require a snapshot to do so.

cc #7362

---

Note that the corresponding RFC, rust-lang/rfcs#68, has not yet been accepted. It was [discussed at the last meeting](https://github.com/rust-lang/rust/wiki/Meeting-weekly-2014-06-10#rfc-pr-68-unsafe-pointers-rename-t-to-const-t) and decided to be accepted, however. I figured I'd get started on the preliminary work for the RFC that will be required regardless.
steveklabnik pushed a commit to rust-lang/rust.vim that referenced this pull request Jan 29, 2015
withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017
clarification about Stream::into_future()
tilde-engineering pushed a commit to tildeio/helix that referenced this pull request Apr 8, 2017
@Centril added the A-syntax (Syntax related proposals & ideas) and A-raw-pointers (Proposals relating to raw pointers) labels on Nov 23, 2018