Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nesting terms should be permitted and exploited #131

Closed
timothyhinrichs opened this issue Nov 2, 2016 · 0 comments
Closed

Nesting terms should be permitted and exploited #131

timothyhinrichs opened this issue Nov 2, 2016 · 0 comments

Comments

@timothyhinrichs
Copy link
Member

As I've been writing Rego recently I've tried to embed terms inside each other, which seems to not be allowed. Besides it being natural to write rules with embedded terms, embedding terms should allow for improved efficiency during evaluation. I've been defining virtual docs that have dictionaries/arrays as outputs (e.g. {id: ..., name: ...}), and then want to use those virtual docs to ask if one of the outputs has a particular value for one of its fields (e.g. name: "foo").

Here's a contrived example of how we do that today. Notice that when we query p, we are iterating over all the values of p and checking if the first element is an "a".

q["a"] = 1 :- true
q["b"] = 2 :- true
p[x] :- q[y] = z, x=[y,z]
p[x], x=["a", y]
Enter data.repl.p[x], eq(x, ["a", y])
| Eval data.repl.p[x]
| Enter p[x]
| | Eval eq(data.repl.q[y], z)
| | Enter q["a"] = 1
| | | Eval true
| | | Exit q["a"] = 1
| | Eval eq(x, ["a", 1])
| | Exit p[["a", 1]]
| Eval eq(["a", 1], ["a", y])
| Exit data.repl.p[x], eq(x, ["a", y])
Redo data.repl.p[x], eq(x, ["a", y])
| Redo true
| Redo p[["a", 1]]
| | Redo eq(1, z)
| | Redo q["b"] = 2
| | | Eval true
| | | Exit q["b"] = 2
| | Eval eq(x, ["b", 2])
| | Exit p[["b", 2]]
| Eval eq(["b", 2], ["a", y])
| Fail eq(["b", 2], ["a", y])
+---------+---+
| x | y |
+---------+---+
| ["a",1] | 1 |
+---------+---+

It would be more efficient if the fact that the first value in the output is an "a" is pushed down into the computation of p so that we only compute the slice of p where every array is of the form ["a", y]. Here is how I would imagine this working in the future. I embedded 3 comments with // explaining the difference with the trace above.

q["a"] = 1 :- true
q["b"] = 2 :- true
p[[y,z]] :- q[y] = z // moved x=[y,z] into the head of the rule
p[["a", y]] // moved x=["a", y] into head of query
Enter data.repl.p[["a", y]]
| Eval data.repl.p[["a", y]]
| Enter p[["a", y]]
| | Eval eq(data.repl.q["a"], z)
| | Enter q["a"] = 1
| | | Eval true
| | | Exit q["a"] = 1
| | Eval eq(x, ["a", 1])
| | Exit p[["a", 1]]
| Eval eq(["a", 1], ["a", y])
| Exit data.repl.p[x], eq(x, ["a", y])
// used index to lookup just the q["a"] values, so not scanning thru all q data
Redo data.repl.p[x], eq(x, ["a", y])
| Redo true
| Redo p[["a", 1]]
| | Redo q["a"] = 1
| | Fail
| Fail data.repl.p[x], eq(x, ["a", y])
+---------+---+
| x | y |
+---------+---+
| ["a",1] | 1 |
+---------+---+

@timothyhinrichs timothyhinrichs changed the title Nesting terms should be permitted and exploited for efficiency Nesting terms should be permitted and exploited Nov 2, 2016
tsandall added a commit to tsandall/opa that referenced this issue Nov 9, 2017
These changes modify topdown evaluation to use a binding list that
namespaces variables. This allows topdown to propagate partially ground
ref operands into child query evaluation.

These changes also prepare topdown evaluation to support a partial
evaluation mode.

With these changes, evaluation is no longer performed in two steps
(i.e., first pass of evaluating individual terms, second pass of
evaluating built-in expressions.) Instead, evaluation assumes queries
have been rewritten to eagerly evaluate refs and comprehension. This
way, ref and comprehension bindings do not have to be maintained
separately: they are handled by the normal variable binding list.

This commit contains some breaking changes to the topdown APIs,
namely...

1. Truth explanation has been removed. This feature was not used and the
tracing changes broke it. We can revisit in future if necessary.

2. Data indexing has been removed. Data indexing can be re-added in
future if necessary however it should be handled outside of topdown to
avoid potential memory leaks.

3. Built-in functions produce at-most-one output now. Functions that
used to produce multiple outputs (e.g., io.jwt.decode) can produce a
composite value if they need to.

Fixes open-policy-agent#131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant