Consider adding CString/*const i8 literals #400

Closed
rust-highfive opened this issue Oct 15, 2014 · 21 comments

@rust-highfive

Issue by canndrew
Wednesday Oct 15, 2014 at 15:59 GMT

For earlier discussion, see rust-lang/rust#18065

This issue was labelled with: B-RFC, I-enhancement in the Rust repository


There's already b"foo" syntax for byte-string ([u8]) literals, and I've seen talk of adding s"foo" for String literals (perhaps through a syntax extension). If we're going down that path, we could also add c"foo" syntax, which could produce either a CString or a *const i8. This would be useful for interacting with foreign C code. Currently, to call a C function with a string constant, I'm using:

foo(b"my string\0".as_ptr() as *const i8)

But that's a little cumbersome (and error-prone if I accidentally leave the NUL off).

@ben0x539

You can get close with a macro using concat!:

macro_rules! c_str {
    ($s:expr) => { {
        concat!($s, "\0").as_ptr() as *const i8
    } }
}

foo(c_str!("my string"));

Doesn't look too bad imo. A limitation is that you can't initialize statics like that right now. Maybe there's some cast trickery to make that work, but I haven't been able to figure it out.
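A minimal, self-contained sketch of how the macro above might be exercised. The function `foo` here is a hypothetical stand-in for a foreign C function (it is not from the thread), and `c_char` from `std::os::raw` is used instead of a hard-coded `i8` on the assumption that the code should also build on targets where C's `char` is unsigned.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Append a NUL at compile time and return a pointer to the first byte.
macro_rules! c_str {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

// Hypothetical stand-in for a C function: counts the bytes before the NUL.
// Safety: `s` must point to a valid NUL-terminated string.
unsafe fn foo(s: *const c_char) -> usize {
    unsafe { CStr::from_ptr(s) }.to_bytes().len()
}

fn main() {
    let len = unsafe { foo(c_str!("my string")) };
    println!("{}", len);
}
```

Because the terminator is added by the macro, the caller cannot forget it, which is the error the original issue worries about.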

@ghost

ghost commented Oct 16, 2014

Why not null-terminate the literals by default? It wouldn't affect Rust code at all.

@sfackler
Member

We used to null-terminate strings. It didn't really work, since any slice of a string that doesn't include the end will not be null-terminated.
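The slicing problem can be illustrated with a short sketch. The helper name `is_c_ready` is an assumption made for this example; it relies on `CStr::from_bytes_with_nul`, which accepts a byte slice only if it ends in exactly one NUL.

```rust
use std::ffi::CStr;

// Returns true only if `bytes` ends with its single NUL byte,
// i.e. it could be handed to C without copying.
fn is_c_ready(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}

fn main() {
    let whole: &[u8] = b"hello world\0";
    // A slice that stops before the end carries no terminator of its own.
    let prefix = &whole[..5];
    println!("whole: {}, prefix: {}", is_c_ready(whole), is_c_ready(prefix));
}
```

A slice is only a pointer plus a length, so an implicit terminator on the backing literal says nothing about where a sub-slice ends.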

@comex

comex commented Oct 23, 2014

That macro would be nice to have in the standard library.

@joshtriplett
Member

Rather than adding more magic syntax, why not add a trait similar to Haskell's IsString (used with overloaded string literals)? Then, "string" would mean fromString("string").
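A rough sketch of what such a trait could look like in Rust. Everything here is hypothetical: the trait `FromStringLiteral`, its method, and the helper `lit` are invented for illustration and are not part of any RFC or of the standard library. Unlike a real literal, the `CString` impl allocates and checks for interior NULs at run time.

```rust
use std::ffi::CString;

// Hypothetical analogue of Haskell's IsString.
trait FromStringLiteral {
    fn from_string_literal(s: &'static str) -> Self;
}

impl FromStringLiteral for String {
    fn from_string_literal(s: &'static str) -> Self {
        s.to_owned()
    }
}

impl FromStringLiteral for CString {
    fn from_string_literal(s: &'static str) -> Self {
        // CString::new appends the NUL and rejects interior NUL bytes.
        CString::new(s).expect("literal must not contain an interior NUL")
    }
}

// Stand-in for what the compiler would insert around each literal.
fn lit<T: FromStringLiteral>(s: &'static str) -> T {
    T::from_string_literal(s)
}

fn main() {
    let owned: String = lit("my string");
    let c: CString = lit("my string");
    println!("{} {:?}", owned, c);
}
```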

@ftxqxd
Contributor

ftxqxd commented Oct 28, 2014

@joshtriplett #143 proposed adding overloaded string literals, but was postponed.

@Parakleta

I've created a compiler plugin which achieves this, piggy-backing on binary strings (which must be used in place of regular strings). It's my first attempt at a plugin, and I'm pretty much a beginner at Rust too, so I may be doing something hideous, but it works for me. It requires libc for the type of c_char. This is currently targeted to the v1.3.0 nightly.

Using the plugin I have confirmed through LLVM-IR that the code compiles to the most obvious traditional C implementation.

The usage syntax is just

#![feature(libc, plugin)]
#![plugin(c_str)]
extern crate libc;
fn main()
{
    unsafe { libc::puts(c!(b"Hello, World!")); }
}

It would be ideal if this could just be available, as suggested, as a c-prefixed string (i.e. c"Hello, World!") instead, but this is a good enough workaround for me for now.

@nagisa
Member

nagisa commented Oct 23, 2015

You do not even need a compiler plugin; you can do something along the lines of:

macro_rules! c {
    ($str: expr) => { 
        concat!($str, "\0") as *const u8
    }
}

EDIT: As always, I didn’t read the backlog; there was already a similar macro up there at #400 (comment).

@Parakleta

There's a missing .as_ptr() after the concat in your macro, but with that added the macro fails with "error: cannot concatenate a binary literal". Binary literals matter for C because they allow you to use bytes in the 0x80..0xFF range, which you cannot do with string literals.
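To make the point about high bytes concrete, here is a small sketch. The function name `latin1_cafe` and the Latin-1 example string are assumptions chosen for illustration. A byte-string literal accepts `\xE9`, whereas in an ordinary string literal `\x` escapes are limited to 0x7F.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// "café" encoded as Latin-1: 0xE9 is not valid on its own in a UTF-8 str,
// but it is fine in a byte-string literal.
fn latin1_cafe() -> &'static CStr {
    CStr::from_bytes_with_nul(b"caf\xE9\0").expect("literal ends with exactly one NUL")
}

fn main() {
    let s = latin1_cafe();
    // This pointer is what would be handed to a C function.
    let p: *const c_char = s.as_ptr();
    println!("{:?} at {:?}", s.to_bytes(), p);
}
```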

@nagisa
Member

nagisa commented Oct 23, 2015

That’s a fair point which would be resolved by #1187.

Regarding "There's a missing .as_ptr() after the concat in your macro": note that I did say "something along the lines of".

@Parakleta

Any solution would be fine, but since this issue has languished for nearly 12 months I thought I'd share what I'd been using as a workaround for other people tripped up by it.

@Parakleta

It turns out my attempted implementation was naive, and I cannot work out whether the problem is hard because I don't know what I'm doing or because there just isn't any way to do it. I cannot find any way to cast a &[u8] to a &[i8] without using transmute, but transmute is not const, so it cannot be used to define static bindings. If I try to statically bind a *const i8 instead of the &[i8], it complains that the trait Sync is not implemented for *const (and additionally that I cannot call as_ptr()).

Setting aside null termination, which is apparently already quite solvable: how do I get a statically bound reference (either &'static or *const) to an array or slice of type libc::c_char in the case that libc::c_char is i8? Or is the only option to always store and handle str or &[u8] types, and do the conversion at the last moment with an .as_ptr() as *const libc::c_char as it is handed to a C function?

@Kimundi
Member

Kimundi commented Oct 26, 2015

This seems to work:

const P: *const i8 = &b"test\0"[0] as *const u8 as *const i8;

@Parakleta

@Kimundi what you have suggested compiles but results in undefined behaviour. It stores just the first character of the string and takes the address of that; the rest is discarded.

@ref1537 = internal unnamed_addr constant i8 116

@Parakleta

Hrmmm... const P: *const i8 = b"test\0" as *const u8 as *const i8; does work, however.
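A sketch of how the cast above might be used for both a const and a static. The wrapper type `StaticCStr` is hypothetical and exists only to work around the "Sync is not implemented" error mentioned earlier; `c_char` is used instead of `i8` on the assumption that the code should be portable to targets where C's `char` is unsigned.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// The whole array is kept; only the pointer type is changed.
const P: *const c_char = b"test\0" as *const u8 as *const c_char;

// Hypothetical wrapper: raw pointers are not Sync, so a bare
// `static S: *const c_char` is rejected by the compiler.
struct StaticCStr(*const c_char);

// Sound here only because the pointer targets immutable 'static data.
unsafe impl Sync for StaticCStr {}

impl StaticCStr {
    fn as_cstr(&self) -> &'static CStr {
        // Assumes the wrapper is only ever built from NUL-terminated literals.
        unsafe { CStr::from_ptr(self.0) }
    }
}

static GREETING: StaticCStr = StaticCStr(b"hello\0" as *const u8 as *const c_char);

fn main() {
    let from_const = unsafe { CStr::from_ptr(P) };
    println!("{:?} {:?}", from_const, GREETING.as_cstr());
}
```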

@nox
Contributor

nox commented Jan 20, 2017

The Erlang NIF loading machinery cannot let Rust allocate a properly NUL-terminated string beforehand, except by leaking memory when the NIF is unloaded. This all comes from the lack of NUL-terminated string literals. erlang/otp#1294 (comment)

@strega-nil

@canndrew
Can this be closed? It's easily implementable as a macro in today's Rust.

macro_rules! cstr {
  ($s:expr) => (
    concat!($s, "\0") as *const str as *const [c_char] as *const c_char
  );
}
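A self-contained sketch of the macro above used in a const item, which is the case the earlier `.as_ptr()`-based macro was reported not to handle at the time. It assumes `c_char` is imported from `std::os::raw` at the call site; the names `APP_NAME` and `app_name` are invented for this example.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Pure pointer casts, no method calls, so the expansion is usable in consts.
macro_rules! cstr {
    ($s:expr) => {
        concat!($s, "\0") as *const str as *const [c_char] as *const c_char
    };
}

// Hypothetical constant built entirely at compile time.
const APP_NAME: *const c_char = cstr!("demo");

fn app_name() -> &'static CStr {
    // Sound because APP_NAME points at a NUL-terminated 'static literal.
    unsafe { CStr::from_ptr(APP_NAME) }
}

fn main() {
    println!("{:?}", app_name());
}
```

Note that this approach still only covers UTF-8 string literals, since concat! does not accept byte-string literals.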

@canndrew
Contributor

canndrew commented Jan 21, 2017

I guess so. I'd long forgotten about this ancient bug.

Edit: I can't close it though

@nox
Contributor

nox commented Nov 22, 2017

b"my string\0" has the additional constraint that it cannot contain non-ASCII bytes, and we still cannot concat byte literals AFAIK.

@nagisa
Member

nagisa commented Nov 22, 2017 via email

@nox
Contributor

nox commented Nov 22, 2017

Oh, thanks @nagisa, I had never realised that. Wow.
