feat(hog): lambdas #24369
All code has been extracted from this branch, thus closing the PR. 🫡
Problem
We need to support lambdas in order to support many ClickHouse functions (arrayExists, arrayMap, etc). This is required to match element chain texts in filters in Hog destinations.
Changes
Language features
Functions are now first class variables and can be written as lambdas:
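The idea is the same as in other languages with first-class functions. As a rough Python analogy (not Hog syntax), the sketch below stores a lambda in a variable, passes it as an argument, and calls expressions that are not plain identifiers. The names double, apply_twice, and make_adder are made up for illustration and do not come from this PR.

```python
# Python analogy only; this is not Hog syntax.

# A lambda stored in a variable like any other value.
double = lambda x: x * 2


def apply_twice(func, value):
    """Take a function as an argument and apply it two times."""
    return func(func(value))


def make_adder(n):
    """Return a new function that the caller can invoke immediately."""
    return lambda x: x + n


print(apply_twice(double, 3))   # 12
print(make_adder(2)(5))         # calls the result of a call: 7
print([double, abs][0](4))      # calls an indexed expression: 8
```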
We can now call () on things that are not just identifiers (name()):
Closures
Implements closures and upvalues, making the following work as expected:
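A common way to get this behaviour in a VM is to keep each captured variable in a small shared cell (an upvalue) that both the enclosing scope and the closure point at. The sketch below is in Python and assumes that design; Upvalue, Closure, and make_counter are hypothetical names, and it does not reproduce this PR's implementation.

```python
class Upvalue:
    """A shared, mutable cell that holds one captured variable."""

    def __init__(self, value):
        self.value = value


class Closure:
    """Pairs a function body with the upvalues it captured."""

    def __init__(self, body, upvalues):
        self.body = body
        self.upvalues = upvalues

    def __call__(self, *args):
        # The body receives its captured cells explicitly, the way a VM
        # could hand them to the call frame.
        return self.body(self.upvalues, *args)


def make_counter():
    # `count` lives in a cell so it outlives this call and stays shared
    # between the enclosing scope and the closure.
    count = Upvalue(0)

    def increment(upvalues, step=1):
        upvalues[0].value += step
        return upvalues[0].value

    return Closure(increment, [count]), count


counter, cell = make_counter()
print(counter())    # 1
print(counter(2))   # 3
print(cell.value)   # 3, the same cell the closure mutated
```

Because the closure and the outer scope share one cell, a write through either side is visible to the other, which is the property "upvalues" are meant to provide.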
Inline STL
Adds support for arrayMap, arrayExists, arrayFilter:
This is where it gets a bit tricky. Those are added via an "inlined STL". Effectively the bytecode compiler does a round of static analysis and figures out if any of those STL functions will be called. If so, it prepends the source code for the function to your script.
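The inlining step described above can be sketched roughly as follows, in Python. This is a simplified illustration under stated assumptions: it scans the source text with a regex instead of analysing a parsed AST, the function bodies in INLINE_STL are placeholders rather than the real definitions, and the names INLINE_STL, find_used_stl, and inline_stl are hypothetical.

```python
import re

# Placeholder Hog-like source for the inlinable functions. These bodies are
# illustrative stand-ins, not the definitions shipped in the PR.
INLINE_STL = {
    "arrayMap": "fn arrayMap(func, arr) { /* placeholder body */ }",
    "arrayExists": "fn arrayExists(func, arr) { /* placeholder body */ }",
    "arrayFilter": "fn arrayFilter(func, arr) { /* placeholder body */ }",
}

# Simplification: match `name(` in the source text. A real compiler would
# walk the parsed AST instead of using a regex.
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def find_used_stl(source):
    """Return the inlinable STL names that the script appears to call."""
    called = {match.group(1) for match in CALL_PATTERN.finditer(source)}
    return [name for name in INLINE_STL if name in called]


def inline_stl(source):
    """Put the source of every used STL function before the script."""
    used = find_used_stl(source)
    if not used:
        return source
    prelude = "\n".join(INLINE_STL[name] for name in used)
    return prelude + "\n" + source


print(inline_stl("print(arrayMap(double, [1, 2, 3]))"))
```

Scripts that never call one of the three functions pass through unchanged, so they pay no size cost for the inlined definitions.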
All three functions are written as Hog functions behind the scenes, for example:
This was easier (and safer) to get working than implementing a back-and-forth layer between the VM and "native code". Today we can call native code from the VM (all those STL functions), but to do an UNO reverse and call a VM function from native code... requires a refactor too big for this PR.
Currently those STL functions are inlined into the Hog bytecode. I'm not sure if we want to keep it that way, or ship the function bytecodes with each HogVM itself, but this can be changed later.
There are some positive things about having an STL written within Hog: 1) it's easier to extend, and 2) it requires fewer changes in all the different implementations of Hog (Python vs TS vs future Rust?)
Bytecode versions
Until now bytecode was in the format:
['_h', /* rest of bytecode */]
That's considered version 0.
Now compiled bytecode looks like
['_H', 1, /* rest of bytecode */]
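A reader of compiled bytecode can tell the two header shapes apart from the first element. The Python sketch below shows one way to do that; read_bytecode_version is a hypothetical helper, the body items are placeholders rather than real opcodes, and the error handling is an assumption rather than what the VMs actually do.

```python
def read_bytecode_version(bytecode):
    """Return (version, body) for either header shape."""
    if not bytecode:
        raise ValueError("empty bytecode")
    marker = bytecode[0]
    if marker == "_h":
        # Legacy header: no version field, treated as version 0.
        return 0, bytecode[1:]
    if marker == "_H":
        # Versioned header: the element after the marker is the version.
        if len(bytecode) < 2 or not isinstance(bytecode[1], int):
            raise ValueError("missing version field after '_H'")
        return bytecode[1], bytecode[2:]
    raise ValueError(f"unknown bytecode marker: {marker!r}")


# Placeholder bodies; real bytecode would contain opcodes and operands.
print(read_bytecode_version(["_h", "op_a", "op_b"]))     # (0, ['op_a', 'op_b'])
print(read_bytecode_version(["_H", 1, "op_a", "op_b"]))  # (1, ['op_a', 'op_b'])
```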
The 1 after '_H' is the bytecode version field. I had to make some breaking changes from v0 due to optional function arguments (needed to flip something around), but did not want to break all existing compiled bytecode. Versions are a way to do this.
I still need to verify, but all existing bytecode should work with both VMs. The slow migration path would be:
Global access
Accessing undefined globals is now a compile-time error. Previously they would silently evaluate to null.
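The difference between the two behaviours can be sketched in Python. This assumes the compiler is handed the set of global names it is allowed to reference; CompileError, lookup_at_runtime, and resolve_at_compile_time are hypothetical names, not the PR's API.

```python
class CompileError(Exception):
    """Hypothetical error type raised while compiling."""


def lookup_at_runtime(name, globals_):
    # Old behaviour: a missing global quietly becomes null (None here).
    return globals_.get(name)


def resolve_at_compile_time(name, known_globals):
    # New behaviour: reject the program before it ever runs.
    if name not in known_globals:
        raise CompileError(f"Undefined global: {name}")
    return name


# A typo such as "evnt" used to slip through as null at runtime.
print(lookup_at_runtime("evnt", {"event": {}}))  # None

try:
    resolve_at_compile_time("evnt", {"event"})
except CompileError as error:
    print(error)  # Undefined global: evnt
```

Failing at compile time means a misspelled global shows up when the script is saved rather than as a filter that never matches.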
TODO
x()()
)()(...):
How did you test this code?
WIP