Implement the pooling instance allocator. #2518
Conversation
The aarch64 tests run on qemu, which has this lovely bit of implementation in its syscall handling (link): madvise() is simply ignored.
At some point we'd ideally run our tests on real hardware, but for now I suspect the best option would just be to disable the unit test and anything that depends on the zeroing semantics. (Or maybe a feature flag to explicitly memset/bzero instead? That's bad if the memory was only sparsely committed, though...)
We already have a different flag for running in qemu to reduce virtual memory overhead, since QEMU behaves differently than native processes in that respect, so perhaps that could also be where we configure "hey, wasmtime, madvise doesn't work as expected".
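For context, the semantics in question: after madvise(MADV_DONTNEED) on anonymous memory, the next read of the affected pages should observe zeroes. A minimal check using the `libc` crate (a sketch; the final assertion holds natively but fails under qemu's ignore-the-syscall emulation described above):

```rust
use std::ptr;

fn main() {
    unsafe {
        let len = 4096;
        // Map one anonymous page and dirty it.
        let p = libc::mmap(
            ptr::null_mut(),
            len,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
            -1,
            0,
        );
        assert_ne!(p, libc::MAP_FAILED);
        let p = p as *mut u8;
        *p = 0xAA;

        // On native Linux this tells the kernel to drop the page, so the
        // next access faults in a fresh zero page...
        libc::madvise(p as *mut _, len, libc::MADV_DONTNEED);

        // ...but qemu's user-mode emulation ignores the madvise, so this
        // assertion fails there: the 0xAA byte is still visible.
        assert_eq!(*p, 0);
    }
}
```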
This change makes the storage of `Table` more internally consistent. Elements are stored as raw pointers for both static and dynamic table storage. Explicitly storing elements as pointers removes the assumptions the pooling allocator was making about the size and default representation of the elements. However, care must be taken to properly clone externrefs for table operations.
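Roughly, the storage described here has the following shape (type and method names are illustrative, not the actual wasmtime internals):

```rust
use std::mem;

/// Illustrative table storage: both variants hold raw element pointers,
/// so the pooling allocator needs no knowledge of the element type.
enum TableElements {
    /// Dynamic storage owned by the table itself.
    Dynamic(Vec<*mut u8>),
    /// Static storage carved out of a preallocated pool slot.
    Static(&'static mut [*mut u8]),
}

impl TableElements {
    fn as_mut_slice(&mut self) -> &mut [*mut u8] {
        match self {
            TableElements::Dynamic(v) => v.as_mut_slice(),
            TableElements::Static(s) => &mut **s,
        }
    }

    /// Writing a slot is a raw pointer swap; for externref tables the
    /// caller must clone (retain) `new` and drop the returned old
    /// reference, because the slots carry no ownership information.
    fn set(&mut self, index: usize, new: *mut u8) -> *mut u8 {
        mem::replace(&mut self.as_mut_slice()[index], new)
    }
}
```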
This commit fails translation of modules that have a segment offset that, when added to the data length, overflows.
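The check amounts to a checked addition at translation time; a minimal sketch (names are illustrative):

```rust
/// Reject a data segment whose end address cannot be represented:
/// `offset + len` must not overflow the index type.
fn validate_segment(offset: u64, len: u64) -> Result<(), String> {
    offset
        .checked_add(len)
        .map(|_| ())
        .ok_or_else(|| {
            format!("data segment offset {offset} plus length {len} overflows")
        })
}
```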
This commit adds a "pooling" variant to the wast tests that uses the pooling instance allocation strategy. This should help with the test coverage of the pooling instance allocator.
This commit extracts out a common pattern of finding a passive element or data segment into a `find_passive_segment` method.
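The extracted helper plausibly has a shape like this (a sketch over a map of passive segments; the real method's signature and error type may differ):

```rust
use std::collections::HashMap;

/// Look up a passive element/data segment by index, turning a missing
/// entry into an error instead of an out-of-bounds panic. This is the
/// common pattern shared by the table.init/memory.init-style paths.
fn find_passive_segment<'a, T>(
    segments: &'a HashMap<u32, T>,
    index: u32,
    kind: &str,
) -> Result<&'a T, String> {
    segments
        .get(&index)
        .ok_or_else(|| format!("{kind} segment {index} not found"))
}
```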
@alexcrichton I think all of the feedback has now been addressed. FYI: the commit you stopped your review at was "fix bad merge". This is also running the wast tests with the pooling allocator (+uffd on linux). My TODO list following this PR:
Everything here's looking great to me, thanks again for being patient with me :)
As one final follow-up question as well, in the future (not as part of this PR, since it's fine as-is), do you think it would be possible to use malloc/free to manage the instance/table pools? The memory pool makes sense to manage with mmap directly since we're doing such funky things with permissions (and it's all page-aligned anyway), and the stack pool makes sense since it's all page-aligned too. For instances/tables, however, it seems like there will inevitably be some degree of fragmentation because we round up to page sizes, but I don't think that's strictly necessary (especially now that uffd isn't watching those areas), so perhaps we could use malloc/free?
I've got a few more questions/follow-ups below, but otherwise r=me.
* Improve comments.
* Drop old table element *after* updating the table.
* Extract out the same `cfg_if!` to a single constant.
Regarding the page alignment for instances and tables: we don't need it for instances in the pooling allocator, but we do need it for tables, because the pooling allocator "decommits" table memory by page and you don't want any page to contain elements from multiple tables. The reason instances are page-aligned is an artifact from when the pooling allocator had one giant mmap'd region and the memories/tables of the instance came next, so they needed page alignment back then. I'll fix the instance pool to not page-align the instances and instead align them to the instance's natural alignment.

Regarding malloc/free for instances and tables, I think there's no inherent reason why they can't be used, but part of the intention of this design was to preallocate both (a desirable thing to do for a service).
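To make the table constraint concrete: a table slot's byte size gets rounded up to the page size, so a page-granular decommit can never touch a neighboring table. A rough sketch (assuming 4 KiB pages; `table_slot_size` and `decommit` are illustrative names, not wasmtime's):

```rust
const PAGE_SIZE: usize = 4096; // assume 4 KiB pages for the sketch

/// Size of one table slot in the pool: room for the maximum element
/// count, rounded up so no page is shared between two tables.
fn table_slot_size(max_elements: usize) -> usize {
    let bytes = max_elements * std::mem::size_of::<*mut u8>();
    (bytes + PAGE_SIZE - 1) & !(PAGE_SIZE - 1)
}

/// "Decommit" a table slot when its instance is deallocated: the pages
/// are returned to the OS and read back as zeroes on next use. This is
/// exactly the madvise semantics that qemu fails to emulate.
unsafe fn decommit(base: *mut u8, len: usize) {
    assert_eq!(base as usize % PAGE_SIZE, 0);
    libc::madvise(base as *mut _, len, libc::MADV_DONTNEED);
}
```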
This commit moves the tracking for faulted guard pages in a linear memory into `Memory`.
This commit updates the error enums used in instantiation errors to encapsulate an `anyhow::Error` rather than a string.
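In spirit, the change gives the error variants structured payloads rather than flattened strings (a sketch; the variant names are illustrative, not the exact wasmtime enum):

```rust
use anyhow::Error;

/// Illustrative shape of an instantiation error after this commit: the
/// interesting variants wrap an `anyhow::Error`, so callers keep the
/// full error chain instead of a stringified summary.
#[derive(Debug)]
enum InstantiationError {
    /// Insufficient resources available in the (pooling) allocator.
    Resource(Error),
    /// A trap raised while running initializers or a start function.
    Trap(Error),
}

impl std::fmt::Display for InstantiationError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            InstantiationError::Resource(e) => write!(f, "resource error: {e:#}"),
            InstantiationError::Trap(e) => write!(f, "trap: {e:#}"),
        }
    }
}
```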
We currently skip some tests when running our qemu-based tests for aarch64 and s390x. Qemu has broken madvise(MADV_DONTNEED) semantics -- specifically, it just ignores madvise() [1]. We could continue to whack-a-mole the tests whenever we create new functionality that relies on madvise() semantics, but ideally we'd just have emulation that properly emulates!

The earlier discussions on the qemu mailing list [2] had a proposed patch for this, but (i) the patch doesn't seem to apply cleanly anymore (it's 3.5 years old) and (ii) it's pretty complex due to the need to handle qemu's ability to emulate differing page sizes on host and guest. It turns out that we only really need this for CI when host and guest have the same page size (4KiB), so we *could* just pass the madvise()s through. I wouldn't expect such a patch to ever land upstream in qemu, but I think it satisfies our needs. So this PR modifies our CI setup to patch qemu before building it locally with a little one-off patch.

[1] #2518 (comment)
[2] https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05416.html
This PR implements the pooling instance allocator. The allocation strategy can be set with `Config::with_allocation_strategy`. The pooling strategy uses the pooling instance allocator to preallocate a contiguous region of memory for instantiating modules that adhere to various limits. The intention of the pooling instance allocator is to reserve the host address space needed for instantiating modules ahead of time.

This PR also implements the `uffd` feature in Wasmtime that enables handling page faults in user space; this can help reduce kernel lock contention and thus increase throughput when many threads are continually allocating and deallocating instances.

See the related RFC.
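For illustration, enabling the strategy might look like the following. This is a sketch against the API as it appears in this PR; `InstanceAllocationStrategy::Pooling`, `PoolingAllocationStrategy`, `ModuleLimits`, and `InstanceLimits` are taken from the PR era and may not match later wasmtime releases:

```rust
use anyhow::Result;
use wasmtime::*;

fn main() -> Result<()> {
    let mut config = Config::new();
    // Reserve the instance pool's address space up front; instantiation
    // then claims a free slot instead of mmap'ing fresh memory each time.
    // NOTE: the strategy/limit names below follow this PR's API sketch
    // and may differ in released versions of wasmtime.
    config.with_allocation_strategy(InstanceAllocationStrategy::Pooling {
        strategy: PoolingAllocationStrategy::NextAvailable,
        module_limits: ModuleLimits::default(),
        instance_limits: InstanceLimits::default(),
    })?;
    let engine = Engine::new(&config);
    // ...compile and instantiate modules against `engine` as usual;
    // modules exceeding the configured limits fail to instantiate.
    drop(engine);
    Ok(())
}
```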