Skip to content

Conversation

@jercaianu
Copy link
Contributor

Implemented AlignedBlockList, an allocator which holds a list of other allocators and allows for constant time deallocations, using aligned allocations.

Internally, this holds a doubly linked list of other allocators. Each node contains an allocator, followed by its payload. Each node will be allocated by the parent allocator at addresses multiple of 'theAlignment'.
Whenever a deallocation happens, the buffer candidate for deallocation will be rounded to the nearest multiple of 'theAlignment', in order to find in constant time the allocator from which the buffer originated.

This will play a very important part in creating a fast and safe allocator.

Thanks,
Alex

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @jercaianu! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#6410"

@jercaianu
Copy link
Contributor Author

ping @andralex @edi33416

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick initial feedback - mostly on the style. This is not a review.
I think you could save a lot of code if you would use a custom, local atomicOp - maybe this can even be done with alias this'ing the respective integral type?

@@ -0,0 +1,629 @@
// Written in the D programming language.
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide a description here.

@@ -0,0 +1,629 @@
// Written in the D programming language.
/**
Source: $(PHOBOSSRC std/experimental/allocator/building_blocks/_aligned_block_list.d)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preventing Ddoc's auto-hightlighting is no longer required - dlang/dlang.org#2307

static if (isShared)
SpinLock lock = SpinLock(SpinLock.Contention.brief);

private void moveToFront(AlignedBlockNode *tmp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document

}
}

AlignedBlockNode *root;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DStyle

enum maxNodes = 50;

static if (isShared)
SpinLock lock = SpinLock(SpinLock.Contention.brief);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent (DStyle)


void fun()
{
auto rnd = Random();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Use a fixed seed and a fixed engine, everything else might lead to random coverage changes.

a.allocatorForSize!512.parent = &pageAlloc;
a.allocatorForSize!256.parent = &pageAlloc;

auto rnd = Random();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seed + fix the generator

Although allocations are in theory served in linear searching time, `deallocate` calls take
$(BIGOH 1) time, by using aligned allocations. `ParentAllocator` must implement `alignedAllocate`
and it must be able to allocate `theAlignment` bytes at the same alignment. Each aligned allocation
done by `ParentAllocator` will contain metadata for an `Allocator`, followed by its payload.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a ddoc Params section.

size_t goodAllocSize(const size_t n)
{
Allocator a = null;
return a.goodAllocSize(n);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this segfault?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If goodAllocSize does not use internal allocator state (and it shouldn't) it won't segfault.
Currently, this would work with all allocators in experimental

static if (isShared)
{
atomicOp!"+="(tmp.bytesUsed, n);
lock.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never unlock this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit above I have a scope(exit) lock.unlock()

import core.internal.spinlock : SpinLock;

private:
struct AlignedBlockNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static struct?

@jercaianu
Copy link
Contributor Author

@wilzbach Thanks for comments

@jercaianu jercaianu force-pushed the alignedBlockList branch 4 times, most recently from 090e429 to 2f9396d Compare April 3, 2018 22:02
Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import std.std doesn't work.

{
import std.experimental.allocator.building_blocks.region : SharedRegion;
import std.experimental.allocator.building_blocks.ascending_page_allocator : SharedAscendingPageAllocator;
import std.std.experimental.allocator.building_blocks.null_allocator : NullAllocator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std/experimental/allocator/building_blocks/aligned_block_list.d(512): Error: module null_allocator is in file 'std/std/experimental/allocator/building_blocks/null_allocator.d' which cannot be read
import path[0] = ../druntime/import
posix.mak:350: recipe for target 'generated/linux/debug/32/unittest/std/experimental/allocator/building_blocks/aligned_block_list.o' failed
make[1]: *** [generated/linux/debug/32/unittest/std/experimental/allocator/building_blocks/aligned_block_list.o] Error 1

-> You have a std.std

@jercaianu jercaianu force-pushed the alignedBlockList branch 2 times, most recently from 5fa6a6b to 478ecd5 Compare April 4, 2018 11:31
int numNodes;

// If the numNodes exceeds this limit, we will start deallocating nodes
enum maxNodes = 50;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious as to where this number came from. Was it just a gut feeling? Did you make a decision to bias towards "probably not too high" or "probably not too low"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on what the caller passes, maxNodes should be different.
The deallocation of the blocks provided by ParentAllocator happen lazily inside allocate.
I didn't want to delete nodes too often, but also I didn't want to have high memory usage so I hardcoded something for my tests.
I'll investigate here a bit more to see if I can get rid of this, or maybe pass it as a parameter somehow.
Thanks!

auto next = tmp.next;
if (tmp.prev) tmp.prev.next = tmp.next;
if (tmp.next) tmp.next.prev = tmp.prev;
assert(parent.deallocate((cast(void*) tmp)[0 .. alignment]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In release mode assertions are omitted and this code won't be run.

Example: https://run.dlang.io/is/m8gzG3

import std.stdio;
auto foo(ref int x)
{
    return x &= 1;
}
void main(string[] args)
{
    int x = 255;
    assert(foo(x));
    writeln(x);
}

This prints 1 when run with no compiler arguments but prints 255 with -release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch!

static if (stateSize!ParentAllocator) ParentAllocator parent;
else alias parent = ParentAllocator.instance;

enum ulong alignment = theAlignment;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this could be correct if theAlignment is the size of each node and memory is allocated at an offset from the start of the node.

Copy link
Contributor Author

@jercaianu jercaianu Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the metadata for each node and the payload are adjacent in memory, I considered the alignment to be for the whole block (metadata + payload).
Inside the payload, an Allocator can manage the allocations with its own alignment.

Edit: nevermind I see the problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the same time, I think it's not great for the caller to expect each allocation to have an alignment, but we silently add some padding before.
I think it's better for the alignment to be (metadata + payload), but clearly specify that and expose the length of the metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because other allocators in std.experiment.allocator use alignment to specifically mean the alignment of memory returned by allocate, use a different name here and everything will be fine. Otherwise, templated code that uses design-by-introspection can leap to a wrong conclusion -- the compiler doesn't read the DDOC, so a comment will not prevent that!

done by `ParentAllocator` will contain metadata for an `Allocator`, followed by its payload.
Params:
Allocator = the allocator which is used to manage each node
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation does an insufficient job of explaining what needs to be true of the Allocator!

Copy link
Contributor Author

@jercaianu jercaianu Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, I haven't explained here too much about the Allocator.
Allocator can be anything, as long as it has a constructor which receives ubyte[] and its ParentAllocator is the NullAllocator.

For the moment, these are the only constraints which come to mind.
I'll update the docs
Thanks!

@n8sh
Copy link
Member

n8sh commented Apr 4, 2018

It isn't clear to me that the shared version is correct. Does using a spinlock prevent stale reads of the root pointer and prevent stale reads of AlignedBlockNode's next/prev pointers?

@jercaianu
Copy link
Contributor Author

@n8sh
Thanks for the review!
The list only is changed when the lock is taken. Only under the lock new nodes are added/removed and root updated.
To increase performance, allocate happens when the lock is not taken. Each node has keepAlive, which basically acts as a refcount to make sure no node is removed while it is used for allocations, outside the lock.

Besides the unittests, I stress tested this and I haven't found any races.

else alias parent = ParentAllocator.instance;

enum ulong alignment = theAlignment;
enum ulong alignment = ParentAllocator.alignment;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If alignment means the alignment of the memory returned by allocate, it would appear this should be Allocator.alignment instead of ParentAllocator.alignment.

Also maybe it should have a guard like:

static if (hasStaticallyKnownAlignment!Allocator)
    enum alignment = Allocator.alignment;

The existence of std.experimental.allocator.common : hasStaticallyKnownAlignment implies that not all allocators have statically known alignments, but this may be obsolete or not applicable to the specific allocators required here.

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please mind @wilzbach 's and my nits

import std.experimental.allocator.building_blocks.null_allocator;

// Common function implementation for thread local and shared AlignedBlockList
private mixin template AlignedBlockListImpl(bool isShared)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool is OK for a private artifact

AlignedBlockNode* root;

// Number of active nodes
int numNodes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't size_t be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t is size of objects in memory. For counts we can use that of course, but just incidentally.

int numNodes;

// If the numNodes exceeds this limit, we will start deallocating nodes
enum maxNodes = 64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum uint


enum ulong alignment = Allocator.alignment;

static if (hasMember!(ParentAllocator, "owns"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

document

@dlang-bot dlang-bot merged commit c5129be into dlang:master Apr 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants