maybe remove per-vat (xsnap) memory limit? #5953
Labels: enhancement (New feature or request), SwingSet (package: SwingSet), vaults_triage (DO NOT USE), xsnap (the XS execution tool)
What is the Problem Being Solved?
In today's discussion about DoS attacks/defenses, one concern was an attacker provoking a vat into allocating too much memory (perhaps slowly), and triggering our 2GiB memory limit:
https://github.com/agoric-labs/xsnap-pub/blob/3e671651ed6d8a58491b500c9e7359494ff7da7d/xsnap/sources/xsnapPlatform.c#L290-L291
When xsnap sees the engine try to `malloc` past this limit, it exits with `XS_NOT_ENOUGH_MEMORY_EXIT`, aka `ExitCode.E_NOT_ENOUGH_MEMORY`. The vat manager reports this as a delivery error, which tells the kernel to terminate the vat (in-consensus).
In general, any hard resource limit that kills the process (under the theory that the vat did it deliberately) will convert a provoked-growth bug into a kill-the-vat authority. An attacker who manages to send a few extra arguments with each message (perhaps unused positional arguments, which JS functions ignore unless special schema enforcement is added) might cause the vat to consume extra memory to hold them. Over time, that might grow the memory footprint to the point that it trips the ceiling. If the worker is killed at that point, we'll have given the attacker more authority than we intended.
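
To make the provoked-growth concern concrete, here is a minimal sketch in plain JS (not liveslots or real contract code; `makeRegistry` and its methods are hypothetical names). It shows that a method declaring two parameters accepts a third without complaint, and that storing an inbound object also retains properties the code never reads:

```javascript
// Hypothetical userspace object; names are illustrative only.
const makeRegistry = () => {
  const entries = [];
  return {
    // Declares two parameters. JS silently ignores extra positional arguments.
    register(name, details) {
      // Storing the inbound object retains every property on it,
      // including ones this code never looks at.
      entries.push({ name, details });
      return entries.length;
    },
    // Rough proxy for retained size: length of the re-serialized state.
    retainedChars: () => JSON.stringify(entries).length,
  };
};

const registry = makeRegistry();
const padding = 'x'.repeat(10000);

registry.register('alice', { email: 'a@example.com' });
const before = registry.retainedChars();

// The sender adds an unexpected property and an extra positional argument.
// Neither is rejected; the property stays reachable from `entries`.
registry.register('bob', { email: 'b@example.com', junk: padding }, padding);
const after = registry.retainedChars();

console.log(`retained growth from one message: ${after - before} chars`);
```

In this sketch the extra positional argument is dropped once the call returns, while the extra property lives as long as the stored entry does. Schema enforcement at the method boundary would be needed to reject either one.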
Some pathways we're examining for that consumption:
retained `resolve`/`reject` functions (and thus effectively the Promise itself) until the kernel does a `dispatch.notify` to resolve it (userspace might do a `.then` or `await` on it, so liveslots must assume userspace is watching until resolved)
Presence
We have a size limit at the perimeter of the kernel (specifically the maximum cosmos signed-transaction size). That imposes a limit on the largest message that can make it into the bridge device, or the mailbox device. Framing, comms-protocol serialization, and other transformations are applied or removed, leading to some sort of limit (maybe larger, maybe smaller) on the properties of the capdata that arrives in `dispatch.deliver` or `dispatch.notify`. Then liveslots and `marshal` turn this into JS `Object`s and `Promise`s and `Array`s, etc, whose RAM usage is some function of the capdata sizes. If we could compute this overall transformation function, we could pick a signed-cosmos-txn limit X, and then state that the worst-case RAM usage that can be provoked by any single message is Y.
If Y is larger than the per-vat xsnap allocation limit, a single message could kill the vat during deserialization. If userspace and/or liveslots holds on to data over time, a series of sub-X-sized cosmos messages could kill the vat.
We don't (yet) know what that `Y = f(X)` function is. We expect it's polylinear. And it's probably more accurately expressed as `a * body.length + b * slots.map(filterObjectIDs).length + c * slots.map(filterPromiseIDs).length`. And the slope is probably at least 10, since each `[],` in a body like `[],[],[],` takes only 3 bytes to make an empty Array (which is probably 4-8 32-bit words in RAM), and likewise for `{},{},{},`. The cosmos limit is currently at least several MB, maybe 10MB (to accommodate contract bundles). When we finish #4564 and #4811 and friends, txns of this size will still be allowed, but the component messages inside will be limited according to type, and the messages which go to swingset will be limited to much smaller values (maybe 1MB, maybe more like 10kB).
So we aren't confident that the perimeter limit is tight enough. And we might not be able to prevent liveslots from retaining some data. And userspace might not check for surprising extra properties on inbound objects that it stores. So the attack can be spread over multiple messages. The net result is that we might not be able to prevent attackers from inflating (perhaps slowly) the XS memory footprint of an innocent vat to the point that it passes a fixed limit.
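
As a rough illustration of the `Y = f(X)` shape described above, here is a sketch of the estimate as a function of one capdata record. The coefficients are placeholders rather than measurements, and the slot classification assumes vat-side slot strings start with `o` for objects and `p` for promises:

```javascript
// Hypothetical per-unit RAM costs in bytes. Placeholders, not measured values.
const assumedCosts = { perBodyUnit: 10, perObjectSlot: 100, perPromiseSlot: 200 };

// Assumption: slot strings starting with 'o' name objects, 'p' name promises.
const estimateRamBytes = (capdata, costs = assumedCosts) => {
  const objectSlots = capdata.slots.filter(s => s.startsWith('o')).length;
  const promiseSlots = capdata.slots.filter(s => s.startsWith('p')).length;
  return (
    // body.length counts UTF-16 code units, which equals bytes for ASCII bodies
    costs.perBodyUnit * capdata.body.length +
    costs.perObjectSlot * objectSlots +
    costs.perPromiseSlot * promiseSlots
  );
};

// A body made of 1001 empty arrays: roughly 3 characters per Array created.
const body = `[${'[],'.repeat(1000)}[]]`;
const estimate = estimateRamBytes({ body, slots: ['o-1', 'p-2'] });

console.log(`${body.length} body chars -> estimated ${estimate} bytes of RAM`);
```

With real coefficients measured from xsnap, the same function could be evaluated at the perimeter limit X to get a worst-case Y per message.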
But, if the vat is not intentionally using that extra data, then the Unix `xsnap` process which hosts the vat won't be touching it. The details are tricky, because the GC `mark()` phase must walk some portion of all objects, but for some forms of data (especially large strings), the application won't touch those pages. And the standard unix virtual memory system is pretty good at not spending RAM on unused pages.
So our thought is "thrash before crash". If we allow the xsnap process to grow large, but most of the pages are unused, the host OS will push those pages out to swap, preserving the RAM for only the active set. If the host has enough swap space, we can accommodate several inflated vats.
Description of the Design
I'm thinking:
change `xsnapPlatform.c` `fxCreateMachinePlatform()` to increase `the->allocationLimit` to 20 GiB or 100 GiB.
or set the limit in `xsnap-worker.c` after the `xsCreateMachine()` call, in response to an `argv` parameter
tell validators they should provision a meaningful amount of swap
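
If the limit does become an `argv` parameter, the JS side that launches the worker would need to turn a configured size into a byte count. A minimal sketch of that step follows; the helper names, the accepted `MiB`/`GiB` syntax, and the idea of passing the value as a single decimal argument are all assumptions, not an existing xsnap interface:

```javascript
const MiB = 1024 ** 2;
const GiB = 1024 ** 3;

// Hypothetical helper: turn a setting such as '20GiB' into a byte count.
const parseAllocationLimit = text => {
  const match = /^(\d+)(MiB|GiB)$/.exec(text);
  if (!match) {
    throw new Error(`unrecognized size: ${text}`);
  }
  const [, digits, unit] = match;
  const bytes = Number(digits) * (unit === 'GiB' ? GiB : MiB);
  if (!Number.isSafeInteger(bytes) || bytes <= 0) {
    throw new Error(`size out of range: ${text}`);
  }
  return bytes;
};

// Hypothetical: the position and format of the worker argument are assumptions.
const allocationLimitArgs = limitText => [String(parseAllocationLimit(limitText))];

console.log(allocationLimitArgs('20GiB'));
```

One thing this makes visible: 20 GiB is 21474836480 bytes, which does not fit in 32 bits, so it is worth checking that every place the limit is stored or compared is wide enough.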
Security Considerations
A hard memory limit chooses one side of a tradeoff:
Since we don't have third-party contracts in MN-1, any buggy/malicious contracts are our own fault, and halting the whole chain is better than an in-consensus termination of some important vat.
Test Plan
A unit test which creates an xsnap worker configured with various `allocationLimit`s, evaluates code to build a string just below that limit (successfully), then allocates a little bit more (which should cause the worker to exit with `ExitCode.E_NOT_ENOUGH_MEMORY`).
We already have a test to exceed the limit, but it works by allocating more and more until the limit is reached. We need to test that the configured limit is respected.
To avoid undue stress on the CI host (or developers' laptops), the test shouldn't exercise limits much larger than 2GB.
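
The shape of that test can be sketched against a stand-in worker. Everything here is hypothetical: `makeFakeWorker` only counts requested bytes instead of allocating them, and the exit code is a symbolic placeholder for the real `ExitCode.E_NOT_ENOUGH_MEMORY`. A real test would drive an actual xsnap worker and would need a margin that accounts for the engine's baseline memory use.

```javascript
// Symbolic stand-in for the real exit code value.
const E_NOT_ENOUGH_MEMORY = 'E_NOT_ENOUGH_MEMORY';

// Hypothetical fake worker: tracks requested bytes rather than allocating.
const makeFakeWorker = allocationLimit => {
  let used = 0;
  let exitCode;
  return {
    allocate(bytes) {
      if (exitCode !== undefined) {
        throw new Error('worker already exited');
      }
      if (used + bytes > allocationLimit) {
        exitCode = E_NOT_ENOUGH_MEMORY; // models the worker exiting
        return false;
      }
      used += bytes;
      return true;
    },
    getExitCode: () => exitCode,
  };
};

// The check from the test plan, parameterized over the configured limit.
const checkLimitRespected = (makeWorker, limit, margin = 1024) => {
  const worker = makeWorker(limit);
  const belowOk = worker.allocate(limit - margin); // just below: must succeed
  const aboveOk = worker.allocate(2 * margin); // a little more: must fail
  return belowOk && !aboveOk && worker.getExitCode() === E_NOT_ENOUGH_MEMORY;
};

const MiB = 1024 ** 2;
// Keep the configured limits small to avoid stressing the host.
const results = [64 * MiB, 256 * MiB, 1024 * MiB].map(limit =>
  checkLimitRespected(makeFakeWorker, limit),
);

console.log(results);
```

Running the same check at several limits is what distinguishes "the configured limit is respected" from "some limit is eventually hit".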