Add initial support for WebAssembly in user-defined functions (UDF) #9108

psarna · 2021-07-28T14:22:37Z

This series adds very basic support for WebAssembly-based user-defined functions.

This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation.

Example usage:

CREATE FUNCTION ks.fibonacci (str text)
    RETURNS NULL ON NULL INPUT
    RETURNS boolean
    LANGUAGE xwasm
    AS ' (module
  (func $fibonacci (param $n i32) (result i32)
    (if
      (i32.lt_s (local.get $n) (i32.const 2))
      (return (local.get $n))
    )
    (i32.add
      (call $fibonacci (i32.sub (local.get $n) (i32.const 1)))
      (call $fibonacci (i32.sub (local.get $n) (i32.const 2)))
    )
  )
  (export "fibonacci" (func $fibonacci))
) '

Note that the language is currently called "xwasm" as in "experimental wasm", because its interface is still subject to change in the future.

dorlaor · 2021-07-28T16:20:41Z

Very exciting with not much code needed! Wow

…

On Wed, Jul 28, 2021 at 7:22 AM Piotr Sarna ***@***.***> wrote: This series adds very basic support for WebAssembly-based user-defined functions. It's only published as draft due to multiple reasons: - only a few basic types are implemented so far (string, int, bool) - Fedora 33's (and probably 34's) v8-devel package contains a version of the library which seemingly does not support creating a single-threaded v8 platform, and instead creates a thread pool underneath, which is out of the question for production - occasionally the execution of wasm-based UDF's fails with Uncaught RangeError: Maximum call stack size exceeded; I found denoland/rusty_v8#486 <denoland/rusty_v8#486> which may or may not be relevant, although it also failed in a single-shard Scylla, so perhaps it's something else; I tried out multiple stack-size values and too low values make this error come out every time, but even very large values (64MiB and beyond) do not prevent the error, which makes me think it's some kind of a race after all, especially that the functions used in tests are not particularly large - many more xfailing tests should be added to the cql-pytest-based suite - it's a low priority project, so its development pace is going to be slow This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation ------------------------------ You can view, comment on, or merge this pull request online at: #9108 Commit Summary - cql3: generalize user-defined functions for more languages - service: add v8 engine - wasm: add initial WebAssembly runtime implementation - treewide: add initial WebAssembly support to UDF - cql-pytest: add wasm-based tests for user-defined functions File Changes - *M* configure.py <https://github.com/scylladb/scylla/pull/9108/files#diff-4d5f3192809ec1b9add6b33007e0c50031ad9a0a2f3f55a481b506468824db2c> (6) - *M* cql3/functions/user_function.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-a62e487ba8c6138647787ee95c9804404e3bf848b817895c5bb8501d4e41ba74> (15) - *M* cql3/functions/user_function.hh <https://github.com/scylladb/scylla/pull/9108/files#diff-5a0593faed1341ce16308a4b5293ef254fd34a37bfd7aa86a818adc327cfb811> (32) - *M* cql3/statements/create_function_statement.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-2fd9642298844a7bcf239fc0a7962b6f13d1918340a1a8cf74804f73c4e1209f> (22) - *M* db/schema_tables.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-2cc928e139ea64f5257e11df02cd2819944fc93210107633315ee2d0edd05c27> (26) - *M* install-dependencies.sh <https://github.com/scylladb/scylla/pull/9108/files#diff-4852d2a5a482159af418383e38099c1321b69c6272981405ed32d629a19de73d> (2) - *M* main.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-bf1eacd22947b4daf9f4c2639427b8593d489f093eb1acfbba3e4cc1c9b0288b> (13) - *A* service/v8_platform.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-8bfc1053920f0e573f4e7ee61a91ca483c147c54df17de6981f2afa981b27a4e> (39) - *A* service/v8_platform.hh <https://github.com/scylladb/scylla/pull/9108/files#diff-ccf66231493bc4d6ca0c52209e303ef57b4f938a2fbd6c7bd1ed7c218a23d25a> (36) - *A* test/cql-pytest/test_wasm.py <https://github.com/scylladb/scylla/pull/9108/files#diff-ba1d88f7ba5b62defa4c35ac8202b8aca9937c2b6196d8099027dcb15ed4d533> (78) - *A* wasm.cc <https://github.com/scylladb/scylla/pull/9108/files#diff-1bb7990fd2fa76e45520a9f308e8c2e932aeb48ee1be5e59d6fa99d6e030a7c9> (338) - *A* wasm.hh <https://github.com/scylladb/scylla/pull/9108/files#diff-aa94895055b55b0e011c84b8ef67450e78152e7ea73d0beac3859d69a5cd00f6> (69) Patch Links: - https://github.com/scylladb/scylla/pull/9108.patch - https://github.com/scylladb/scylla/pull/9108.diff — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#9108>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AANHURIXTOJNX56GTPTILK3T2AHDXANCNFSM5BEPCQLA> .

avikivity · 2021-07-28T16:52:05Z

I think we should look for an alternative to v8, it's a monster.

avikivity · 2021-07-28T16:54:03Z

wasm language should have as source wasm machine code (or maybe assembly code). The user can compile the code on their own and ship just the binary. If we compile JavaScript, the language would be JavaScript.

avikivity · 2021-07-28T16:56:55Z

wasm.cc

+    v8::Local<v8::Value> result;
+    if (!function->Call(local_ctx, local_ctx->Global(), argv.size(), argv.data()).ToLocal(&result)) {
+        throw make_wasm_exception("Running wasm function failed", isolate, try_catch);
+    }


I see you thoughtfully return a future, but we need to make the run preemptible somehow. Lua supports this by counting bytecode instructions and calling a callback on overflow.

I've been thinking about that, and I have a follow-up question - did this Lua trick prove useful in tests, or was it done as a matter of good practice? I'm asking because when it comes to UDF, simple transformations performed on a single row shouldn't usually be large enough to produce stalls - I'd imagine that it's usually refactoring a string or performing some simple math operations on the input, not actually computing things with an O(n^3) algorithm. The reason I ask is that Lua's trick also forced the call to be wrapped in a Seastar thread, which hurts its scalability - which can be alleviated a little e.g. by using a smaller stack, but it's still overhead. Perhaps the decision whether to mark the transformation preemptible could be left for users' consideration? I.e. by default the function call is not preemptible, which means it's either fast enough or the user doesn't care about occasional latency spikes (which would be common for analytical workloads, and I suspect UDF's are more useful for analytics than online queries), and the user can explicitly mark the function as heavy, which will make Scylla actually spawn a thread for it and preempt when needed.

Also, perhaps Seastar's cooperational model can be pushed down to UDF's, at least in case of javascript - it's probably possible to create a scylla_maybe_yield() function wrapper inside the v8 context (or any other engine), and calling this wrapper from inside a UDF would simply jump back to host and call thread::maybe_yield(). That would save us from counting bytecode and all. Finally, I suspect that these javascript engines won't become friends with our stall detector anyway, given that they often try just-in-time compilation and other tricks. But perhaps with enough configuration they can become more predictable.

avikivity · 2021-07-28T16:57:59Z

Very nice! But I have a feeling we'll need to look for more runtime candidates.

dorlaor · 2021-07-28T17:01:28Z

We can make progress with the api and the aggregation execution on the Scylla side while we use v8 as a temporary solution that works. This way we have a working solution in the interim (still in experimental of course)

…

On Wed, Jul 28, 2021 at 9:58 AM Avi Kivity ***@***.***> wrote: Very nice! But I have a feeling we'll need to look for more runtime candidates. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#9108 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AANHURPQFAF6GO4VRHBGVVLT2AZKFANCNFSM5BEPCQLA> .

psarna · 2021-07-29T08:27:20Z

wasm language should have as source wasm machine code (or maybe assembly code). The user can compile the code on their own and ship just the binary. If we compile JavaScript, the language would be JavaScript.

Makes sense. Nothing would prevent people from passing TypeScript scripts or anything else that the underneath engine is capable of compiling, but it doesn't really hurt. Also, I'm not an expert in all these *script languages, probably one is a superset of the other and nobody distinguishes them anyway.

psarna · 2021-07-29T08:27:56Z

Very nice! But I have a feeling we'll need to look for more runtime candidates.

A random list to consider based on 15 minute research:
https://github.com/cesanta/mjs (previously called v7)
https://duktape.org/
https://github.com/espruino/Espruino
https://firefox-source-docs.mozilla.org/js/index.html

psarna · 2021-07-29T08:38:10Z

Also, semi-related: https://github.com/ChaiScript/ChaiScript . A different language, but written with C++ integration in mind.

avikivity · 2021-07-29T08:52:57Z

https://github.com/bytecodealliance/wasmtime

avikivity · 2021-07-29T08:59:15Z

pub fn consume_fuel(&mut self, enable: bool) -> &mut Self
Configures whether execution of WebAssembly will “consume fuel” to either halt or yield execution as desired.

This option is similar in purpose to Config::interruptable where you can prevent infinitely-executing WebAssembly code. The difference is that this option allows deterministic execution of WebAssembly code by instrumenting generated code consume fuel as it executes. When fuel runs out the behavior is defined by configuration within a Store, and by default a trap is raised.

Note that a Store starts with no fuel, so if you enable this option you’ll have to be sure to pour some fuel into Store before executing some code.

By default this option is false.

Exactly what we need

psarna · 2021-07-29T09:00:22Z

Indeed, also, the webassembly text format looks workable for passing via CQL: https://github.com/bytecodealliance/wasmtime-cpp/blob/main/examples/fuel.wat

psarna · 2021-07-29T09:40:12Z

I'm out of fuel for the 20% project this week, but I'll try to create a wasmtime-based proof of concept next week.

avikivity · 2021-07-29T18:27:18Z

I see wasmtime isn't packaged in Fedora, so we'll have to curl it in install-dependencies.sh like node_exporter.

psarna · 2021-07-29T18:34:47Z

Yup, I already downloaded the binary package and tried it out, seems to work fine. The interface is much more friendly than v8, and the fuel thing solves the preemption issue. I'll try to push something workable for at least a couple of basic types early next week.

psarna · 2021-07-29T19:55:12Z

edit: I couldn't resist: a very limited proof-of-concept based on wasmtime is posted instead of the original v8-based implementation. It's very far from being complete, but still shows much better potential than the original engine.

psarna · 2021-07-30T09:35:00Z

Hm, unless I'm mistaken, it actually looks like the fuel mechanism can only be used to abort a function that takes too long, but not actually restore the bytecode execution. That's not extremely bad since we'd at least have an upper bound of allowed instructions, but it would limit the usefulness of wasm backend.

psarna · 2021-07-30T10:57:34Z

It looks like fuel is only partially supported in wasmtime-cpp bindings. Here's an excerpt from the original Rust docs (https://docs.wasmtime.dev/api/wasmtime/struct.Store.html#method.out_of_fuel_trap):

pub fn out_of_fuel_trap(&mut self)
Configures a Store to generate a Trap whenever it runs out of fuel.

When a Store is configured to consume fuel with Config::consume_fuel this method will configure what happens when fuel runs out. Specifically a WebAssembly trap will be raised and the current execution of WebAssembly will be aborted.

This is the default behavior for running out of fuel.

[src]
pub fn out_of_fuel_async_yield(
    &mut self,
    injection_count: u64,
    fuel_to_inject: u64
)
Configures a Store to yield execution of async WebAssembly code periodically.

When a Store is configured to consume fuel with Config::consume_fuel this method will configure what happens when fuel runs out. Specifically executing WebAssembly will be suspended and control will be yielded back to the caller. This is only suitable with use of a store associated with an async config because only then are futures used and yields are possible.

The purpose of this behavior is to ensure that futures which represent execution of WebAssembly do not execute too long inside their Future::poll method. This allows for some form of cooperative multitasking where WebAssembly will voluntarily yield control periodically (based on fuel consumption) back to the running thread.

Note that futures returned by this crate will automatically flag themselves to get re-polled if a yield happens. This means that WebAssembly will continue to execute, just after giving the host an opportunity to do something else.

The fuel_to_inject parameter indicates how much fuel should be automatically re-injected after fuel runs out. This is how much fuel will be consumed between yields of an async future.

The injection_count parameter indicates how many times this fuel will be injected. Multiplying the two parameters is the total amount of fuel this store is allowed before wasm traps.

... and it looks like only the trap-based mechanism, which, quote, aborts the current execution of WebAssembly, is implemented in wasmtime-cpp bindings so far.

psarna · 2021-08-02T18:48:57Z

Also, as for types other than int32s, int64s, floats and doubles - there used to be an implementation of "interface types", which allowed integrating strings and other custom types quite seamlessly, but they got deprecated and the new implementation is not shipped yet. Refs:
bytecodealliance/wasmtime#1271
bytecodealliance/wasmtime#677

That would make it a little more complicated to introduce support for all CQL types. I suppose we could define that all accepted wasm functions treat CQL types as a tuple of two ints - one of which indicates the address of the serialized value, and the other its length. Then, we just don't care about serialization/deserialization, because we would push the responsibility to the user-defined function body. Doesn't sound very convenient, but languages which compile to WebAssembly (e.g. Rust) usually have CQL serialization implemented for them already anyway.

avikivity · 2021-08-02T19:17:08Z

Well, we can implement the missing bindings, and start with the aborting variant. Assuming 1 insn == 1 ns, 1M fuel is 1ms, plenty for a simple UDF/UDA.

avikivity · 2021-08-02T19:17:22Z

And we can start with numeric types too.

psarna · 2021-08-03T10:07:57Z

v2:

added support for i32, i64, f32, f64 wasm types
got rid of all unsafe .unwrap() calls in favor of checking and propagating proper exceptions
added more test cases

Still missing:

support for downloading libwasmtime.a compiled for the proper architecture should be added. Wasmtime publishes precompiled libraries for aarch64 and x86-64, which pretty much covers our supported architectures, but conditional compilation should be added anyway, in case it's not possible to download this dependency.
Ref: https://github.com/bytecodealliance/wasmtime/releases/

To sum up, it's still a draft, because there's no automatic way of downloading a dependency and the compilation unconditionally assumes that libwasmtime.a should be linked to Scylla, but the series is much more ready to review than it was in v1.

avikivity · 2021-08-03T10:14:56Z

support for downloading libwasmtime.a compiled for the proper architecture should be added. Wasmtime publishes precompiled libraries for aarch64 and x86-64, which pretty much covers our supported architectures, but conditional compilation should be added anyway, in case it's not possible to download this dependency.
Ref: https://github.com/bytecodealliance/wasmtime/releases/

To sum up, it's still a draft, because there's no automatic way of downloading a dependency and the compilation unconditionally assumes that libwasmtime.a should be linked to Scylla, but the series is much more ready to review than it was in v1.

Please patch install-dependencies.sh with the download+install part. Send it as a separate patch and I will regenerate the toolchain when committing it. Note we generate the toolchain for s390x (though we don't use it), so it should be prepared to the binary not existing.

avikivity · 2021-08-03T10:18:04Z

wasmtime.hh

+ *
+ * This project is a C++ API for
+ * [Wasmtime](https://github.com/bytecodealliance/wasmtime). Support for the
+ * C++ API is exclusively built on the [C API of


Maybe better to do this as part of install-dependencies.sh and install it in /usr/local/include (I guess the library should be in /usr/local/lib64 too).

sure, I'll do that

avikivity · 2021-08-03T10:18:51Z

wasm.cc

@@ -0,0 +1,186 @@
+/*
+ * Copyright (C) 2019-present ScyllaDB


It's taken two years, but totally worth it.

avikivity · 2021-08-03T10:20:10Z

wasm.cc

+//       or drop the global in favor of a sharded service
+static wasmtime::Engine& wasm_engine() {
+    static thread_local wasmtime::Engine engine = wasmtime::Engine(make_config());
+    return engine;


thread_local is not that bad (also not very good, but at least doesn't stand in the way of cluster-in-a-box).

Still, better to have a sharded service so it can be configured with non-default config, if we ever need it.

Yup, I decided to leave it like that and quote the thread_local usage from our schema_registry.cc as my main line of defense. This engine is such a thin layer that creating a sharded service out of it and propagating it all over the source code seems a little complicated without any immediate gain - especially that we compile in two separate layers - once when the function is created, and then when it arrives to another node during schema migration.

There are some configuration options which could be useful for exposure - e.g. max wasm stack size, but on the other hand they're fairly low level and we should probably come up with sensible defaults instead.

In this iteration I decided to accept this little debt, since if anybody needs custom configuration later, I'll be the one who implements the proper sharded service anyway, so at least I only indebt myself.

avikivity · 2021-09-06T13:31:48Z

Alternatively, instead of always accepting null outputs, it would be tempting to decide that RETURNS NULL ON NULL INPUT also implies that non-null output is not allowed, but it remains allowed for CALLED ON NULL INPUT. That would allow us to achieve two things:

Returning values will be simplified for RETURNS NULL ON NULL INPUT functions, in particular, it will avoid allocations

If needed, UDF programmers can always change the function declaration to CALLED ON NULL INPUT in order to return nulls.

Another option: extend the syntax so you can write RETURNS int NOT NULL.

It makes even more sense if we think about user-defined aggregates, which are the expected target use case for UDFs: if the state transition function is allowed to return null, then it's likely that it should also accept null input, since the previous state might be null.

I think NULL still make sense. If the aggregate gives up because the input is bad in some way other than being NULL, it can return NULL and will not be called again for this aggregation.

@avikivity ?

psarna · 2021-09-06T13:35:47Z

Alternatively, instead of always accepting null outputs, it would be tempting to decide that RETURNS NULL ON NULL INPUT also implies that non-null output is not allowed, but it remains allowed for CALLED ON NULL INPUT. That would allow us to achieve two things:

Returning values will be simplified for RETURNS NULL ON NULL INPUT functions, in particular, it will avoid allocations

If needed, UDF programmers can always change the function declaration to CALLED ON NULL INPUT in order to return nulls.

Another option: extend the syntax so you can write RETURNS int NOT NULL.

It makes even more sense if we think about user-defined aggregates, which are the expected target use case for UDFs: if the state transition function is allowed to return null, then it's likely that it should also accept null input, since the previous state might be null.

I think NULL still make sense. If the aggregate gives up because the input is bad in some way other than being NULL, it can return NULL and will not be called again for this aggregation.

It would still be possible though - by using a function that is CALLED ON NULL INPUT, which would assume a slow path.

Yeah, extending the syntax is another idea to consider. The end of the day is near, so soon I'll post a version that accepts null outputs only for CALLED ON NULL INPUT as an RFC (of an RFC), because I have it almost ready anyway, and I'll consider other alternatives in my next 20% time slot.

avikivity · 2021-09-06T13:45:27Z

Sure.

psarna · 2021-09-06T13:45:39Z

v6:

functions CALLED ON NULL INPUT are capable of returning NULL output
to consider: how (and whether) to allow functions which RETURN NULL ON NULL INPUT to return nulls from non-null inputs

For reference, here's the uglified version of the example fib program, which accepts nulls and is also capable of returning nulls - which involves allocating memory in order to serialize a single bigint and return a pointer to it:

struct __attribute__((packed)) nullable_bigint {
    int size;
    long long v;
};

static long long swap_int64(long long val) {
    val = ((val << 8) & 0xFF00FF00FF00FF00ULL ) | ((val >> 8) & 0x00FF00FF00FF00FFULL );
    val = ((val << 16) & 0xFFFF0000FFFF0000ULL ) | ((val >> 16) & 0x0000FFFF0000FFFFULL );
    return (val << 32) | ((val >> 32) & 0xFFFFFFFFULL);
}

long long fib_aux(long long n) {
    if (n < 2) {
        return n;
    }
    return fib_aux(n-1) + fib_aux(n-2);
}

struct nullable_bigint* fib(struct nullable_bigint* p) {
    // Initialize memory for the return struct
    struct nullable_bigint* ret = (struct nullable_bigint*)__builtin_wasm_memory_size(0);
    __builtin_wasm_memory_grow(0, sizeof(struct nullable_bigint));

    ret->size = sizeof(long long);
    if (p->size == -1) {
        ret->v = swap_int64(42);
    } else {
        ret->v = swap_int64(fib_aux(swap_int64(p->v)));
    }
    return ret;
}

psarna · 2021-09-06T13:46:17Z

Of course, the uglification above should eventually be hidden behind a wrapper library that we provide.

psarna · 2021-09-10T08:30:58Z

v7:

sprinkled a bunch of FIXME's and TODO's explaining the current state of affairs - some design decisions, like ABI for returning values, is experimental and expected to change later, after more environment is built around our wasm support (e.g. helper libraries to interact with other programming languages)
added a comment explaining that we need to investigate all configuration options for wasmtime in order to customize it as much as possible for our needs
added a comment about the need to contribute more generic yielding support to wasmtime (written in Rust) and propagating this support to the cpp binding, which would eventually allow us to inject yielding code into the executing function, thus fighting reactor stalls
adjusted the docs to repreat the warning about return types and their experimental ABI
added a C example for writing and compiling a WebAssemby function with clang

I also hereby remove the [RFC] mark, to indicate that the series should be made ready for early merge - and that the development should continue, likely in a non-backward-compatible manner, taking advantage of the fact that we're hidden behind an experimental flag.

avikivity · 2021-09-10T18:49:11Z

To be xtra safe, let's change the language code from "wasm" to "x-wasm" so it's impossible for an old-abi function to be interpreted with the new abi, if/when we settle on that.

In order to support more languages than just Lua in the future, Lua-specific configuration is now extracted to a separate structure.

Support for more languages is comming, so let's group them in a separate directory.

Courtesy of https://github.com/bytecodealliance/wasmtime-cpp . Taken as is, with a small licensing blurb added on top.

WASM engine stores the wasm runtime engine for user-defined functions.

WASM engine needs to be used from two separate contexts: - when a user-defined function is created via CQL - when a user-defined function is received during schema migration The common instance that these two have in common is the database object, so that's where the reference is stored.

psarna · 2021-09-13T09:32:27Z

v8:

in order to be ready for ABI changes without breaking compatibility, the experimental ABI is now available as language "xwasm" (not x-wasm, because CQL grammar doesn't like hyphens in its tokens). Once the interface stabilizes, it should be changed to "wasm".

…(UDF)' from Piotr Sarna This series adds very basic support for WebAssembly-based user-defined functions. This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation. Example usage: ```cql CREATE FUNCTION ks.fibonacci (str text) RETURNS NULL ON NULL INPUT RETURNS boolean LANGUAGE xwasm AS ' (module (func $fibonacci (param $n i32) (result i32) (if (i32.lt_s (local.get $n) (i32.const 2)) (return (local.get $n)) ) (i32.add (call $fibonacci (i32.sub (local.get $n) (i32.const 1))) (call $fibonacci (i32.sub (local.get $n) (i32.const 2))) ) ) (export "fibonacci" (func $fibonacci)) ) ' ``` Note that the language is currently called "xwasm" as in "experimental wasm", because its interface is still subject to change in the future. Closes #9108 * github.com:scylladb/scylla: docs: add a WebAssembly entry cql-pytest: add wasm-based tests for user-defined functions main: add wasm engine instantiation treewide: add initial WebAssembly support to UDF wasm: add initial WebAssembly runtime implementation db: add wasm_engine pointer to database lang: add wasm_engine service import wasmtime.hh lua: move to lang/ directory cql3: generalize user-defined functions for more languages

avikivity · 2021-09-13T15:47:06Z

Dequeued. Please retest, also make sure it works without the webassembly dependencies installed.

The engine is based on wasmtime and is able to: - compile wasm text format to bytecode - run a given compiled function with custom arguments This implementation is missing crucial features, like running on any other types than 32-bit integers. It serves as a skeleton for future full implementation.

This commit adds a very basic support for user-defined functions coded in wasm. The support is very limited (only a few types work) and was not tested against reactor stalls and performance in general.

Once the engine is up, it can be used to execute user-defined functions.

A first set of wasm-based test cases is added. The tests include verifying that supported types work and that validation of the input wasm is performed.

The doc briefly describes the state of WASM support for user-defined functions.

psarna · 2021-09-13T17:05:48Z

v9:

fixed a compilation issue with wasm disabled (which happens when libwasmtime is not found)
fixed an issue with using a value from disengaged optional in order to try and grow memory inside a wasm module; it resulted in flaky tests, since sometimes wasm tried to allocate a huge buffer

psarna · 2021-09-13T17:10:04Z

Also - retested with and without libwasmtime

avikivity · 2021-09-14T13:19:31Z

Congrats!

avikivity reviewed Jul 28, 2021

View reviewed changes

psarna force-pushed the wasm_udf branch from 60fcf6a to 0f06df9 Compare July 29, 2021 19:54

psarna force-pushed the wasm_udf branch from 0f06df9 to 27f134f Compare July 30, 2021 10:37

psarna force-pushed the wasm_udf branch 2 times, most recently from 301d108 to 50a3ae4 Compare August 3, 2021 10:03

avikivity reviewed Aug 3, 2021

View reviewed changes

wasm.cc Outdated

@@ -0,0 +1,186 @@

/*

* Copyright (C) 2019-present ScyllaDB

Copy link

Member

avikivity Aug 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's taken two years, but totally worth it.

avikivity reviewed Aug 3, 2021

View reviewed changes

psarna force-pushed the wasm_udf branch from 936c6f1 to 324a229 Compare September 6, 2021 13:43

psarna force-pushed the wasm_udf branch from 324a229 to dd79bc2 Compare September 10, 2021 08:25

psarna changed the title ~~[RFC] Add initial support for WebAssembly in user-defined functions (UDF)~~ Add initial support for WebAssembly in user-defined functions (UDF) Sep 10, 2021

psarna added 5 commits September 13, 2021 11:01

cql3: generalize user-defined functions for more languages

46c6603

In order to support more languages than just Lua in the future, Lua-specific configuration is now extracted to a separate structure.

lua: move to lang/ directory

4e952df

Support for more languages is comming, so let's group them in a separate directory.

import wasmtime.hh

4caf57f

Courtesy of https://github.com/bytecodealliance/wasmtime-cpp . Taken as is, with a small licensing blurb added on top.

lang: add wasm_engine service

5e6fa47

WASM engine stores the wasm runtime engine for user-defined functions.

psarna force-pushed the wasm_udf branch from dd79bc2 to 04088a4 Compare September 13, 2021 09:30

psarna added 5 commits September 13, 2021 19:03

treewide: add initial WebAssembly support to UDF

62e8c89

This commit adds a very basic support for user-defined functions coded in wasm. The support is very limited (only a few types work) and was not tested against reactor stalls and performance in general.

main: add wasm engine instantiation

4959136

Once the engine is up, it can be used to execute user-defined functions.

cql-pytest: add wasm-based tests for user-defined functions

41b94d3

A first set of wasm-based test cases is added. The tests include verifying that supported types work and that validation of the input wasm is performed.

docs: add a WebAssembly entry

6c4a71c

The doc briefly describes the state of WASM support for user-defined functions.

psarna force-pushed the wasm_udf branch from 04088a4 to 6c4a71c Compare September 13, 2021 17:04

scylladb-promoter closed this in 3f2c680 Sep 14, 2021

scylladb-promoter merged commit 3f2c680 into scylladb:master Sep 14, 2021

Add initial support for WebAssembly in user-defined functions (UDF) #9108

Add initial support for WebAssembly in user-defined functions (UDF) #9108

Conversation

psarna commented Jul 28, 2021 • edited Loading

dorlaor commented Jul 28, 2021 via email

avikivity commented Jul 28, 2021

avikivity commented Jul 28, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

avikivity commented Jul 28, 2021

dorlaor commented Jul 28, 2021 via email

psarna commented Jul 29, 2021

psarna commented Jul 29, 2021 • edited Loading

psarna commented Jul 29, 2021

avikivity commented Jul 29, 2021

avikivity commented Jul 29, 2021

psarna commented Jul 29, 2021

psarna commented Jul 29, 2021

avikivity commented Jul 29, 2021

psarna commented Jul 29, 2021

psarna commented Jul 29, 2021

psarna commented Jul 30, 2021

psarna commented Jul 30, 2021

psarna commented Aug 2, 2021

avikivity commented Aug 2, 2021

avikivity commented Aug 2, 2021

psarna commented Aug 3, 2021 • edited Loading

avikivity commented Aug 3, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

avikivity commented Sep 6, 2021 • edited Loading

psarna commented Sep 6, 2021

avikivity commented Sep 6, 2021

psarna commented Sep 6, 2021 • edited Loading

psarna commented Sep 6, 2021

psarna commented Sep 10, 2021

avikivity commented Sep 10, 2021

psarna commented Sep 13, 2021

avikivity commented Sep 13, 2021

psarna commented Sep 13, 2021

psarna commented Sep 13, 2021

avikivity commented Sep 14, 2021

psarna commented Jul 28, 2021 •

edited

Loading

psarna commented Jul 29, 2021 •

edited

Loading

psarna commented Aug 3, 2021 •

edited

Loading

avikivity commented Sep 6, 2021 •

edited

Loading

psarna commented Sep 6, 2021 •

edited

Loading