You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As I've been writing Rego recently I've tried to embed terms inside each other, which seems to not be allowed. Besides it being natural to write rules with embedded terms, embedding terms should allow for improved efficiency during evaluation. I've been defining virtual docs that have dictionaries/arrays as outputs (e.g. {id: ..., name: ...}), and then want to use those virtual docs to ask if one of the outputs has a particular value for one of its fields (e.g. name: "foo").
Here's a contrived example of how we do that today. Notice that when we query p, we are iterating over all the values of p and checking if the first element is an "a".
It would be more efficient if the fact that the first value in the output is an "a" is pushed down into the computation of p so that we only compute the slice of p where every array is of the form ["a", y]. Here is how I would imagine this working in the future. I embedded 3 comments with // explaining the difference with the trace above.
q["a"] = 1 :- true
q["b"] = 2 :- true
p[[y,z]] :- q[y] = z // moved x=[y,z] into the head of the rule
p[["a", y]] // moved x=["a", y] into head of query
Enter data.repl.p[["a", y]]
| Eval data.repl.p[["a", y]]
| Enter p[["a", y]]
| | Eval eq(data.repl.q["a"], z)
| | Enter q["a"] = 1
| | | Eval true
| | | Exit q["a"] = 1
| | Eval eq(x, ["a", 1])
| | Exit p[["a", 1]]
| Eval eq(["a", 1], ["a", y])
| Exit data.repl.p[x], eq(x, ["a", y])
// used index to lookup just the q["a"] values, so not scanning thru all q data
Redo data.repl.p[x], eq(x, ["a", y])
| Redo true
| Redo p[["a", 1]]
| | Redo q["a"] = 1
| | Fail
| Fail data.repl.p[x], eq(x, ["a", y])
+---------+---+
| x | y |
+---------+---+
| ["a",1] | 1 |
+---------+---+
The text was updated successfully, but these errors were encountered:
timothyhinrichs
changed the title
Nesting terms should be permitted and exploited for efficiency
Nesting terms should be permitted and exploited
Nov 2, 2016
These changes modify topdown evaluation to use a binding list that
namespaces variables. This allows topdown to propagate partially ground
ref operands into child query evaluation.
These changes also prepare topdown evaluation to support a partial
evaluation mode.
With these changes, evaluation is no longer performed in two steps
(i.e., first pass of evaluating individual terms, second pass of
evaluating built-in expressions.) Instead, evaluation assumes queries
have been rewritten to eagerly evaluate refs and comprehension. This
way, ref and comprehension bindings do not have to be maintained
separately: they are handled by the normal variable binding list.
This commit contains some breaking changes to the topdown APIs,
namely...
1. Truth explanation has been removed. This feature was not used and the
tracing changes broke it. We can revisit in future if necessary.
2. Data indexing has been removed. Data indexing can be re-added in
future if necessary however it should be handled outside of topdown to
avoid potential memory leaks.
3. Built-in functions produce at-most-one output now. Functions that
used to produce multiple outputs (e.g., io.jwt.decode) can produce a
composite value if they need to.
Fixesopen-policy-agent#131
As I've been writing Rego recently I've tried to embed terms inside each other, which seems to not be allowed. Besides it being natural to write rules with embedded terms, embedding terms should allow for improved efficiency during evaluation. I've been defining virtual docs that have dictionaries/arrays as outputs (e.g. {id: ..., name: ...}), and then want to use those virtual docs to ask if one of the outputs has a particular value for one of its fields (e.g. name: "foo").
Here's a contrived example of how we do that today. Notice that when we query p, we are iterating over all the values of p and checking if the first element is an "a".
It would be more efficient if the fact that the first value in the output is an "a" is pushed down into the computation of p so that we only compute the slice of p where every array is of the form ["a", y]. Here is how I would imagine this working in the future. I embedded 3 comments with // explaining the difference with the trace above.
The text was updated successfully, but these errors were encountered: