Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new allocator interface #5064

Merged
merged 2 commits into from
Jun 27, 2020
Merged

new allocator interface #5064

merged 2 commits into from
Jun 27, 2020

Conversation

marler8997
Copy link
Contributor

@marler8997 marler8997 commented Apr 16, 2020

Related Issue: #4431 (Improve Allocator interface so client can avoid needless copying)

Replaces the current allocator interface with the following:

    /// Attempt to allocate at least `len` bytes aligned to `ptr_align`.
    ///
    /// If `len_align` is `0`, then the length returned MUST be exactly `len` bytes,
    /// otherwise, the length must be aligned to `len_align`.
    ///
    /// `len` must be greater than or equal to `len_align` and must be aligned by `len_align`.
    allocFn: fn (self: *Allocator, len: usize, ptr_align: u29, len_align: u29) Error![]u8,

    /// Attempt to expand or shrink memory in place. `buf.len` must equal the most recent
    /// length returned by `allocFn` or `resizeFn`.
    ///
    /// Passing a `new_len` of 0 frees and invalidates the buffer such that it can no
    /// longer be passed to `resizeFn`.
    ///
    /// error.OutOfMemory can only be returned if `new_len` is greater than `buf.len`.
    /// If `buf` cannot be expanded to accomodate `new_len`, then the allocation MUST be
    /// unmodified and error.OutOfMemory MUST be returned.
    ///
    /// If `len_align` is `0`, then the length returned MUST be exactly `len` bytes,
    /// otherwise, the length must be aligned to `len_align`.
    ///
    /// `new_len` must be greater than or equal to `len_align` and must be aligned by `len_align`.
    resizeFn: fn (self: *Allocator, buf: []u8, new_len: usize, len_align: u29) Error!usize,

This new interface makes 2 big changes. The first is that it provides a way for an allocator to report the full capacity of any allocation. This allows clients to take advantage of the entire memory space of any allocation.

The second change with the new interface is that it allows the caller to have more control over resizing and when/how to relocate memory. The previous interface had a reallocFn which would attempt to resize an allocation and would move and copy the entire buffer to a larger space when needed. The new interface allows clients to determine when memory should be moved and what data to preserve when doing so. The realloc implementation is now common between all allocators using a combination of function calls from the new interface.

@marler8997 marler8997 force-pushed the newAllocator branch 4 times, most recently from a81d7f6 to f65aa38 Compare April 16, 2020 07:48
@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Apr 16, 2020
@marler8997 marler8997 force-pushed the newAllocator branch 3 times, most recently from e48a530 to edf9ab7 Compare April 17, 2020 07:03
@fengb
Copy link
Contributor

fengb commented Apr 17, 2020

If you can get this to work, I'm happy to redo my work for #4739

This change will make everything cleaner.

@marler8997
Copy link
Contributor Author

@fengb yes I think I'll have it working soon.

@marler8997 marler8997 force-pushed the newAllocator branch 10 times, most recently from 2b9cd79 to 0183e67 Compare April 17, 2020 23:11
@marler8997 marler8997 marked this pull request as ready for review April 18, 2020 00:11
lib/std/mem.zig Outdated Show resolved Hide resolved
lib/std/mem.zig Outdated Show resolved Hide resolved
@marler8997 marler8997 force-pushed the newAllocator branch 11 times, most recently from a9a2831 to c250aea Compare June 26, 2020 04:47
@marler8997 marler8997 marked this pull request as ready for review June 26, 2020 04:47
@marler8997 marler8997 force-pushed the newAllocator branch 2 times, most recently from 83a47ed to e1b7dfe Compare June 26, 2020 14:27
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work @marler8997. Thank you for your patience on what turned out to be a big project. I think this is very close to merge-ready.

@fengb and/or @kubkon would you mind providing a review on the changes to WasmPageAllocator?

lib/std/heap.zig Show resolved Hide resolved
lib/std/heap.zig Outdated Show resolved Hide resolved
lib/std/heap.zig Outdated Show resolved Hide resolved
}
const full_len = init: {
if (comptime supports_malloc_size) {
const s = malloc_size(ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you double check that this does not cause false positives with external tools such as valgrind? I'm concerned that valgrind would detect utilization of these extra bytes as writes to invalid memory addresses.

Copy link
Contributor Author

@marler8997 marler8997 Jun 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR; looks like valgrind shims malloc_usable_size to return the exact size originally requested from malloc.

Ok I verified valgrind handles this (at least on my linux box). Here's the program:

// zig build-exe leak.zig -lc
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.c_allocator;

    var i : usize = 1;
    while (i <= 100) : (i += 1) {
        const m = try allocator.callAllocFn(i, 1, 1);
        defer _ = allocator.callResizeFn(m, 0, 0) catch unreachable;
        std.debug.warn("alloc {} returned {}\n", .{i, m.len});
        //@memset(m.ptr, 0xbb, m.len + 1); // use this to cause an error
        @memset(m.ptr, 0xbb, m.len);
    }
}

Here we use the C allocator to make allocations, and utilize the full size (will call malloc_usable_size under the hood). When I run the binary by itself, I can see it usually returns more memory than I requested. However, when I run it with valgrind, it always returns the exact size requested. This is telling me that valgrind is shimming malloc_usable_size and causing it to always return the size originally requested in malloc. Kinda interesting.

I also verified that writing 1 byte over the size will cause valgrind to detect memory corruption (uncomment the bad @memset call).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing this out! This makes me feel comfortable with using malloc_usable_size.

lib/std/heap.zig Outdated Show resolved Hide resolved
lib/std/mem.zig Outdated Show resolved Hide resolved
@@ -2190,6 +2377,13 @@ test "alignForward" {
testing.expect(alignForward(17, 8) == 24);
}

pub fn alignBackwardAnyAlign(i: usize, alignment: usize) usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add doc comments to explain why this is needed rather than just using alignBackward?

Looks like it removes the restriction " /// The alignment must be a power of 2 and greater than 0. " ?

Is this necessary at the callsites? Or would it be appropriate to add that restriction to the callsites as well, and have them call alignBackward directly?

Copy link
Contributor Author

@marler8997 marler8997 Jun 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because allocating arrays require passing the element size for len_align which don't have to be powers of 2. When I was using alignBackwards originally, there were cases where array element sizes were not a power of 2 (i.e. I saw 56). I will add a doc comment explaining how the function differs from alignBackward.

lib/std/heap.zig Show resolved Hide resolved
@@ -11,7 +11,7 @@ pub var allocator_instance = LeakCountAllocator.init(&base_allocator_instance.al
pub const failing_allocator = &failing_allocator_instance.allocator;
pub var failing_allocator_instance = FailingAllocator.init(&base_allocator_instance.allocator, 0);

pub var base_allocator_instance = std.heap.ThreadSafeFixedBufferAllocator.init(allocator_mem[0..]);
pub var base_allocator_instance = std.mem.sanityWrap(std.heap.ThreadSafeFixedBufferAllocator.init(allocator_mem[0..]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comments for the allocator say

/// Detects and asserts if the std.mem.Allocator interface is violated

Which looks to me like it's for testing implementations of the Allocator interface, rather than testing the usage of the Allocator high level API functions. In this case I don't think it makes sense to put the wrapper here, since for testing purposes we are not trying to test if the LeakCountAllocator code is correct, we are trying to test if the user code that is using the Allocator provided by the LeakCountAllocator is correct.

Copy link
Contributor Author

@marler8997 marler8997 Jun 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValidationAllocator validates both the usage of the allocator and the implementation.

i.e. for alloc, ValidationAllocator ensures len > 0, ptr_align is valid, len is aligned by len_align and len >= len_align. Once it returns, it validates that the len complies with the requested len_align and that the pointer is aligned by ptr_align.

Will update ddoc comment with:

/// Detects and asserts if the std.mem.Allocator interface is violated by the caller
/// or the allocator.

lib/std/heap.zig Outdated Show resolved Hide resolved
@marler8997 marler8997 force-pushed the newAllocator branch 4 times, most recently from 4c200cf to 6e93c4e Compare June 27, 2020 06:34
const old_ptr = @ptrCast(*c_void, old_mem.ptr);
const buf = c.realloc(old_ptr, new_size) orelse return old_mem[0..new_size];
return @ptrCast([*]u8, buf)[0..new_size];
fn cResize(self: *Allocator, buf: []u8, new_len: usize, len_align: u29) Allocator.Error!usize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this get moved to the std.c module as e.g. std.c.allocator?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. std.c is for libc functions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it for things reliant on libc? (as this is)

const buf = c.realloc(old_ptr, new_size) orelse return error.OutOfMemory;
return @ptrCast([*]u8, buf)[0..new_size];
fn cAlloc(self: *Allocator, len: usize, ptr_align: u29, len_align: u29) Allocator.Error![]u8 {
assert(ptr_align <= @alignOf(c_longdouble));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use aligned_alloc (C11+) or posix_memalign (older POSIX interface)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate PR. we tried that before and it depends on libc versions. now that we have the ability to check, it's time to revisit, but that's out of scope of this changeset.

@marnix
Copy link

marnix commented Sep 3, 2020

For the record, this also closed #4431.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants