Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smaller memory footprint for BoundedArray #16299

Merged
merged 4 commits into from
Jul 3, 2023

Conversation

whatisaphone
Copy link
Contributor

This PR changes BoundedArray's len field from a usize to a minimum-sized integer. If you're keeping a million of these in memory, their size can add up.

BoundedArray(u8, 3) used to take 16 bytes, but now (with len stored as a u2) it only takes 4 bytes.

Notes:

  • len is the only type that changed. I left all the usize parameters alone, to not break existing code. A new test was added, and the old tests were not changed.
    • The test is testing memory layout, which varies across targets, so I only let it run on x86_64. Not sure if that's the right approach?
  • @intCast is sprinkled in. For each one, either the value had already checked before that point, or it's in an "assume" method.
    • appendNTimesAssumeCapacity is an "assume" method, but it had an assert inside. Am I right to think it's odd to mix assumes with asserts? Either way I left the assert as it was.

This PR message is now longer than the diff, oops. I'll stop rambling. Happy reviewing!

Co-authored-by: zooster <r00ster91@proton.me>
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these += @intCast(b) are actually 2 safety checks. This can produce better code if you do this instead:

x = @intCast(x + b);

@andrewrk andrewrk merged commit cb5a6be into ziglang:master Jul 3, 2023
@andrewrk
Copy link
Member

andrewrk commented Jul 3, 2023

Thank you! This is a nice enhancement.

@matklad
Copy link
Contributor

matklad commented Aug 15, 2023

@andrewrk this tripped a pretty bad bug for us in TigerBeetle:

tigerbeetle/tigerbeetle#1121

Essentially, we were doing size_of_something * (array.len + 1) and, with a tight integer bound, that + 1 now can overflow. We have tests to catch this in one instance, but I think there are more, and some are even sleepy bugs, where the size of the array in question is a comptime config value, so might not be tripped in the configuration we are testing with, but might trigger with some other configs. In fact, we were exceptionally lucky (or unlicky) here, as the length of array turned out to be 4 + 1 + 2 which is 2^3 - 1.

I don't think this is a problem with this change per se, as it all is very much reasonable. But it does seem that integer overflow semantics might need some extra work here.

@matklad
Copy link
Contributor

matklad commented Aug 15, 2023

Although I am wondering if it perhaps makes sense to do

len_repr: Len

pub inline fn len(self: BoundedArray) usize {
    return self.len_repr
}

in the meantime, so that the call-sites are more aware that something funky with integrer ranges might be happening here?

jedisct1 added a commit to jedisct1/zig-bounded-array that referenced this pull request Nov 22, 2023
Changing the representation of `len` made it unsafe to add slice
lengths. That silently introduced integer overflows in `hpke`.

Change was introduced in ziglang/zig#16299
andrewrk added a commit that referenced this pull request Aug 24, 2024
This reverts commit cb5a6be.

I deeply apologize for the churn.

This change is problematic given that we do not have ranged integers
(yet? see #3806).

In the meantime, this type needs to be `usize`, matching the length and
index types for all std lib data structures.

Users who want to save memory should not use heap-allocated BoundedArray
values, since it is inherently memory-inefficient. Use a different
memory layout instead.

If #3806 is accepted and implemented, the length value can become an
integer with the appropriate range, without the footgun. If that
proposal is not accepted, len type will remain a usize.
@andrewrk
Copy link
Member

Reverted in 85747b2.

@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. release notes This PR should be mentioned in the release notes. standard library This issue involves writing Zig code for the standard library. labels Aug 24, 2024
richerfu pushed a commit to richerfu/zig that referenced this pull request Oct 28, 2024
This reverts commit cb5a6be.

I deeply apologize for the churn.

This change is problematic given that we do not have ranged integers
(yet? see ziglang#3806).

In the meantime, this type needs to be `usize`, matching the length and
index types for all std lib data structures.

Users who want to save memory should not use heap-allocated BoundedArray
values, since it is inherently memory-inefficient. Use a different
memory layout instead.

If ziglang#3806 is accepted and implemented, the length value can become an
integer with the appropriate range, without the footgun. If that
proposal is not accepted, len type will remain a usize.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. release notes This PR should be mentioned in the release notes. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants