Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

chirino · 2024-07-02T15:13:43Z

This is an improved version of #47 it has been updated to include the suggestions given in #47.

chirino · 2024-07-03T14:44:44Z

Added a slightly modified version of the benchmark @clarkmcc provided in #47
This variation lets you easily switch between static or dynamic resolvers. It constructs the Context within the iteration loop since that better compares the overhead of populating the context in the static case.

Dynamic resolution seems to be faster, about 20%:

     Running benches/runtime.rs (target/release/deps/runtime-6a18d2a40aded4e1)
variable resolution/variable_resolution_1
                        time:   [49.821 ns 49.889 ns 49.968 ns]
                        change: [-32.108% -31.969% -31.813%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
variable resolution/variable_resolution_10
                        time:   [490.61 ns 495.56 ns 501.14 ns]
                        change: [-23.252% -22.414% -21.410%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
variable resolution/variable_resolution_100
                        time:   [5.6754 µs 5.7097 µs 5.7500 µs]
                        change: [-19.484% -18.817% -18.102%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  8 (8.00%) high mild
  1 (1.00%) high severe

clarkmcc · 2024-07-03T16:03:17Z

It's likely faster because we're comparing apples to oranges. In this benchmark you're really seeing the difference between a hashmap lookup to produce a Value::Null and skipping the hashmap altogether (as is the case with dynamic resolution). The problem with this is it doesn't accurately capture the overhead of only dynamic resolution. To me this just tells me that it's faster to return Ok(Value::Null) than it is to hashmap.get(variable) which makes perfect sense.

To properly benchmark the overhead of dynamic resolution, the hashmap lookup should serve as the control in the experiment. This means the dynamic resolution should also include the same hashmap lookup.

For the benchmark, what happens in the dynamic resolver doesn't really matter so much as that it does the same thing as the standard resolver. In practice, the dynamic resolver is almost always going to be slower than the standard resolver because, at least in your use-case, you'll be making HTTP requests in the dynamic resolver. So the benchmark should control for the hashmap in my opinion.

I believe these adjustments more accurately control for the resolver's overhead.

Index: interpreter/benches/runtime.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/interpreter/benches/runtime.rs b/interpreter/benches/runtime.rs
--- a/interpreter/benches/runtime.rs	(revision 38c263c5ea544cd8b2509853b9fe5bbe62741eaf)
+++ b/interpreter/benches/runtime.rs	(date 1720022327640)
@@ -1,5 +1,5 @@
 use cel_interpreter::context::Context;
-use cel_interpreter::{Program, Value};
+use cel_interpreter::{ExecutionError, Program, Value};
 use criterion::{black_box, criterion_group, criterion_main, Criterion};
 use std::collections::HashMap;
 
@@ -71,7 +71,7 @@
     let sizes = vec![1, 10, 100];
 
     // flip this bool to compare the performance of dynamic resolver vs static resolvers
-    let use_dynamic_resolver = true;
+    let use_dynamic_resolver = false;
 
     for size in sizes {
         let mut expr = String::new();
@@ -86,13 +86,22 @@
         group.bench_function(format!("variable_resolution_{}", size).as_str(), |b| {
             let mut ctx = Context::default();
             if use_dynamic_resolver {
-                ctx.set_dynamic_resolver(move |_, _| Ok(Value::Null));
+                let mut variables: HashMap<String, Value> = HashMap::default();
+                for i in 0..size {
+                    variables.insert(format!("var_{i}", i = i), Value::Null);
+                }
+                ctx.set_dynamic_resolver(move |_, v| {
+                    variables
+                        .get(v)
+                        .cloned()
+                        .ok_or(ExecutionError::UndeclaredReference(v.to_string().into()))
+                });
             } else {
                 for i in 0..size {
                     ctx.add_variable_from_value(&format!("var_{i}", i = i), Value::Null);
                 }
-            }
-            b.iter(|| program.execute(&ctx).unwrap())
+            };
+            b.iter(|| program.execute(&ctx))
         });
     }
     group.finish();
@@ -100,8 +109,8 @@
 
 criterion_group!(
     benches,
-    criterion_benchmark,
-    map_macro_benchmark,
+    // criterion_benchmark,
+    // map_macro_benchmark,
     variable_resolution_benchmark
 );

When measuring only the overhead of the dynamic resolver, we get ~50% performance regression

Benchmarking variable resolution/variable_resolution_1: Collecting 100 samples in estimated 5.0004 s (51Mvariable resolution/variable_resolution_1
                        time:   [95.403 ns 95.649 ns 95.907 ns]
                        change: [+53.123% +54.927% +56.336%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe
Benchmarking variable resolution/variable_resolution_10: Collecting 100 samples in estimated 5.0029 s (5.variable resolution/variable_resolution_10
                        time:   [980.32 ns 986.39 ns 992.87 ns]
                        change: [+47.167% +48.708% +50.222%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Benchmarking variable resolution/variable_resolution_100: Collecting 100 samples in estimated 5.0526 s (4variable resolution/variable_resolution_100
                        time:   [11.307 µs 11.360 µs 11.416 µs]
                        change: [+51.871% +53.270% +54.583%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

chirino · 2024-07-04T14:52:23Z

So the use case of using http to resolve values was just an illustration. My actual use case looks more like I get a gRPC request with a large payload document. I want to run a CEL expression against that document. Converting that whole document to cel Values seems like more overhead than dynamically picking out the fields of that document based on the expression.

clarkmcc · 2024-07-04T16:56:10Z

That sounds like a more legitimate use-case.

How are you envisioning we'd preserve the spec's Field Selection requirements with a dynamic resolver present? In order for the dynamic resolver to be useful, you want the full path of the expression before you do any resolving. But that doesn't align with the spec and would result in different error behaviors depending on whether the dynamic resolver is added or not.

If we did follow the spec on field selection, then the dynamic resolver would need to traverse each layer of the object converting each layer to a CEL value, which doesn't really get you the wins that you're looking for.

chirino · 2024-07-04T22:02:50Z

I'll try to update the test_deep_dynamic_resolver test in this PR to be a better example of the usecase. Maybe you can help figure out the edge cases where if fails the spec expectations.

chirino · 2024-07-05T15:18:33Z

Ok.. the latest commits change things a bit, instead of a Value::Any we add a Value::DynamicCollection which holds a resolver closure. This made it a bit simpler to write the more realistic deep resolver test case. This also further simplified the resovler function signature.

chirino · 2024-07-08T12:39:19Z

Added a feature that integrates with Parc types which are like Arcs but which let you access deep references of the object via reference projections.

clarkmcc · 2024-07-08T17:12:09Z

That's pretty awesome actually! But I am concerned that they're pretty convoluted to use. Do you think a derive macro could generate this for a struct?

chirino · 2024-07-08T17:40:24Z

derive macro might be possible, but that should likely be a separate PR.

chirino · 2024-07-08T17:43:05Z

One thing I might want to do part of this PR is see if it's possible to provide a blanket impl a From<Vec where T: DynamicCollection which should simplify the vec indexing cases.

clarkmcc · 2024-07-08T19:38:47Z

derive macro might be possible, but that should likely be a separate PR.

Could be. I'm just looking at what it takes to use, and I don't see this feature being very useful if it requires all this manual boilerplate. Imo they're pretty tightly coupled. But if we can get a separate PR up at the same time then that works for me. What I don't want is to introduce a feature that is unwieldy to use and then never get around to cleaning up the user experience.

One thing I might want to do part of this PR is see if it's possible to provide a blanket impl a From<Vec where T: DynamicCollection which should simplify the vec indexing cases.

👍

chirino · 2024-07-09T16:21:40Z

@clarkmcc ok I think the last bits are in there. Could we get this in without the macro? Our project could take advantage of this as is.

clarkmcc · 2024-07-09T16:26:09Z

Let me play around with the PR now that it's ready to go, it's big enough I can't once-over it here in Github 😁

chirino · 2024-07-09T16:34:10Z

ok thx.

…cCollection variant to the Value type * includes a benchmark * users can implement a DynamicCollection trait to aid in struct conversions to Values * added pared feature which integrates the pared dependency so that you can use Arc projection refs to documents avoiding heavy clones when using a dynamic resolver. Signed-off-by: Hiram Chirino <hiram@hiramchirino.com>

chirino · 2024-07-16T14:55:14Z

squashed and rebased to avoid merge conflicts.

clarkmcc · 2024-07-16T15:12:57Z

Thanks, I took a look at this on Saturday but didn't finish going through it. Still WIP

chirino · 2024-08-02T16:28:35Z

any updates?

clarkmcc · 2024-08-02T20:20:47Z

I'm working through some ideas based on #73. Seems like the right solution could solve #73, this, and what you're proposing here -- allowing CEL to get arc references to fields on Rust structs and skip serializing the objects altogether.

In other words, I'm not sure about this PR as it currently stands, but I want to take a lot of what you have here to improve this project on several different fronts.

chirino force-pushed the any-value branch from f1261e7 to 38c263c Compare July 3, 2024 14:39

chirino mentioned this pull request Jul 3, 2024

Add support for dynamic value resolver and the ability for values to resolve to an Any value #47

Closed

chirino force-pushed the any-value branch from 38c263c to 1771ec4 Compare July 3, 2024 15:14

chirino force-pushed the any-value branch 6 times, most recently from 24220a3 to 60cbf7d Compare July 5, 2024 15:06

chirino changed the title ~~Add support for using a dynamic variable resolver and an Any variant to the Value type~~ Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. Jul 5, 2024

chirino force-pushed the any-value branch from 3a811ce to d8fa696 Compare July 6, 2024 17:24

chirino force-pushed the any-value branch from 9a9d4ba to 80a82dd Compare July 16, 2024 14:53

clarkmcc mentioned this pull request Aug 3, 2024

Zero-allocation* redesign #76

Open

alexsnaps mentioned this pull request Oct 17, 2024

CEL expressions Kuadrant/wasm-shim#112

Closed

Caellian mentioned this pull request Oct 17, 2024

Add Value::Dynamic variant #92

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

chirino commented Jul 2, 2024

chirino commented Jul 3, 2024

clarkmcc commented Jul 3, 2024

chirino commented Jul 4, 2024

clarkmcc commented Jul 4, 2024

chirino commented Jul 4, 2024

chirino commented Jul 5, 2024

chirino commented Jul 8, 2024

clarkmcc commented Jul 8, 2024

chirino commented Jul 8, 2024

chirino commented Jul 8, 2024

clarkmcc commented Jul 8, 2024

chirino commented Jul 9, 2024

clarkmcc commented Jul 9, 2024

chirino commented Jul 9, 2024

chirino commented Jul 16, 2024

clarkmcc commented Jul 16, 2024

chirino commented Aug 2, 2024

clarkmcc commented Aug 2, 2024

Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

Are you sure you want to change the base?

Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

Conversation

chirino commented Jul 2, 2024

chirino commented Jul 3, 2024

clarkmcc commented Jul 3, 2024

chirino commented Jul 4, 2024

clarkmcc commented Jul 4, 2024

chirino commented Jul 4, 2024

chirino commented Jul 5, 2024

chirino commented Jul 8, 2024

clarkmcc commented Jul 8, 2024

chirino commented Jul 8, 2024

chirino commented Jul 8, 2024

clarkmcc commented Jul 8, 2024

chirino commented Jul 9, 2024

clarkmcc commented Jul 9, 2024

chirino commented Jul 9, 2024

chirino commented Jul 16, 2024

clarkmcc commented Jul 16, 2024

chirino commented Aug 2, 2024

clarkmcc commented Aug 2, 2024