Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unnecessary uses of sentinel slices #37

Closed
natecraddock opened this issue Jan 2, 2024 · 6 comments
Closed

Fix unnecessary uses of sentinel slices #37

natecraddock opened this issue Jan 2, 2024 · 6 comments

Comments

@natecraddock
Copy link
Owner

For values returned from Lua, using sentinel slices (specifically [:0]const u8) makes sense. It accurately reflects the data returned, and can be easily used in Zig. For values passed into Lua, this often does not make sense. Most "strings" in Zig aren't null terminated, and it can require an extra allocation to pass a string to Lua this way.

It appears I got a bit excited and used [:0]const u8 far too often for function parameters. These should be audited and changed to []const u8 wherever possible.

@nurpax
Copy link
Contributor

nurpax commented Jan 9, 2024

Noting some examples here:

    /// Checks whether the function argument `arg` is a slice of bytes and returns the slice
    /// See https://www.lua.org/manual/5.1/manual.html#luaL_checklstring
    pub fn checkBytes(lua: *Lua, arg: i32) [:0]const u8 {
        var length: usize = 0;
        const str = c.luaL_checklstring(lua.state, arg, &length);
        // luaL_checklstring never returns null (throws lua error)
        return str[0..length :0];
    }

I guess there's no reason for this to be [:0]? In fact, is this a bug? What if the lua string contains a \0? Will any Zig functions operating on this prematurely truncate the string at first occurrence of 0?

This probably too since we know the length of the string?

    pub fn toString(lua: *Lua, index: i32) ![*:0]const u8 {
        var length: usize = undefined;
        if (c.lua_tolstring(lua.state, index, &length)) |str| return str;
        return error.Fail;
    }

And using c.lua_pushlstring below would avoid [:0]. Well, on a second read I notice pushBytes is the non-sentinel version for this. I wonder if a better naming would be pushString for []const u8 and pushStringZ for [:0]const u8? This is similar to the naming convention used in the Zig std library, e.g., alloc.dupeZ(...) which is the sentinel version of alloc.dupe(...)

    /// Pushes a zero-terminated string onto the stack
    /// Lua makes a copy of the string so `str` may be freed immediately after return
    /// See https://www.lua.org/manual/5.1/manual.html#lua_pushstring
    pub fn pushString(lua: *Lua, str: [:0]const u8) void {
        c.lua_pushstring(lua.state, str.ptr);
    }

@natecraddock
Copy link
Owner Author

Thanks for pointing out these specific issues! I want to prioritize #33 before working on this. This will make any future changes so much easier to make

I wonder if a better naming would be pushString for []const u8 and pushStringZ for [:0]const u8? This is similar to the naming convention used in the Zig std library, e.g., alloc.dupeZ(...) which is the sentinel version of alloc.dupe(...)

I never really loved the bytes/string naming I used, I like this a lot more

@nurpax
Copy link
Contributor

nurpax commented Jan 10, 2024

I think there may be no really hard and fast rule for these. For example, in a lot of case a string literal can be directly passed from Zig down to Lua code, in which case [*:0] or [:0] is the right choice as it just passes the string down without touching it.

@natecraddock
Copy link
Owner Author

After taking a look at this again after a few months, I struggled to find more than only a handful of functions where [*:0] or [:0] were misplaced. I made all of the updates I could think of in #72 including renaming to use pushString and pushStringZ and similar

Re checkBytes

I guess there's no reason for this to be [:0]? In fact, is this a bug? What if the lua string contains a \0? Will any Zig functions operating on this prematurely truncate the string at first occurrence of 0?

I believe this is a non-issue. Returning [:0] signifies that the string is null-terminated. It is possible that there are 0 bytes somewhere in the middle of the string too (per the Lua docs), but in Zig we can just use the .len of the slice to be safe.

But using a :0 means that it is also usually safe to use .ptr on the slice, because all strings returned from Lua will be null terminated. The only time this could go wrong is if there is a 0 somewhere in the string before the true end.

@nurpax
Copy link
Contributor

nurpax commented Mar 20, 2024

I took a look at your change and it looks all great!

Regarding checkBytes: I think you're right. Also, I see that you've removed checkBytes since it's already covered by all those *String functions -- this also looks like the right thing to do.

@natecraddock
Copy link
Owner Author

Thanks for taking a look! I'll go ahead and close this then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants