ast: support dotted heads #4660

srenatus · 2022-05-04T12:27:53Z

With this change, we'll be able to define substructures using refs in rules. Now, we can do this in one file,

foo.rego:

package basket
small_numbers := {1, 2, 3}

fruit.apple.color = "green" { input in small_numbers }

and it will be the same as these two files:

foo.rego:

package basket

small_numbers := {1, 2, 3}

and apple.rego:

package basket.fruit.apple

color = "green" { input in data.basket.small_numbers }

Fixes #4653.

new TODOs 🏗️

resolve refs to dotted rules
formatter tests and adaptations
more docs

ast/annotations_test.go

carabasdaniel

this is a nice addition 👍

johanfylling

👍

johanfylling · 2022-05-17T09:45:50Z

ast/compile.go

+				} else if v, ok := ref[0].Value.(Var); ok {
+					ref = Ref{StringTerm(string(v))}.Concat(ref[1:])
+				}
+				if childNode := node.find(ref); childNode != nil {
 					for _, childMod := range childNode.Modules {
 						msg := fmt.Sprintf("%v conflicts with rule defined at %v", childMod.Package, rule.Loc())
 						c.err(NewError(TypeErr, mod.Package.Loc(), msg))


Should this not point to where the conflicting package was declared? I.e.: childMod.Package.Loc().

I think this is correct: the compiler is complaining about mod, not about childMod.

Is mod part of the complaint at all? Is it not rule that is in conflict with childMod, and rule just happens to be declared in mod? If so, then shouldn't the error location be either that of rule or childMod.Package, and if we complain about rule, then won't all errors be exactly the same on multiple conflicts?
I might be confused, though 😅 .

ast/compile.go

johanfylling · 2022-05-17T09:57:31Z

ast/compile_test.go

+				"data.p.a.b.c": {"r", "s", "y"},
+				"data.p.a.b":   {"x"},
+			},
+		},


How about a test case where the rule path's overlap

package a.b.c r = 1

package a b.c.r = 2

Or perhaps this is covered somewhere else?

I'm having a hard time linking stuff in this giant PR, but look for this below:

note: "simple: two modules, one using ref head, one package path",

johanfylling · 2022-05-17T10:08:44Z

ast/compile.go

 	}

 	return rules
 }

+func hashMapAdd(rules *util.HashMap, ref Ref, rvs []Var) {


Should we also drop duplicate rule names (rvs entries) here, or that is dealt with elsewhere (or doesn't matter at all)?

ast/compile.go

johanfylling · 2022-05-17T11:19:54Z

ast/parser_ext.go

+	} else if r, ok := lhs.Value.(Ref); ok { // groundness ?
+		if len(r) == 2 {
+			// NOTE(sr): interpret this as dot-less partial set rule
+			return nil, fmt.Errorf("TODO clean up error handling")


Work still to be done here?

Ah right. It'll never surface, but I'll clean up the name.

johanfylling · 2022-05-17T11:46:50Z

ast/parser_ext.go

+		return nil, fmt.Errorf("%vs cannot be used as rule name", TypeName(r[0].Value))
+	}
+
+	// Ref: set only if we've got 2+ dots in the ref


I'm still a bit fuzzy on why we require more than 2 dots.

if there's only one dot, we've got to interpret it as set rule,

foo.bar { true }

is a "shortcut" for

foo["bar"] { true }

for backwards-compatibility.

Right, that makes sense. For some reason, I read this as there needs to be more than 2 dots .. which doesn't track with the subsequent logic 🙄

I was wrong here:

foo.bar

is the only shortcut we've got to support.

foo.bar { true }

this never worked. However, it might be the case that we aren't able to disambiguate that at this code location, I'm still getting back into this...

johanfylling · 2022-05-17T12:08:42Z

ast/policy_test.go

 	assertHeadsEqual(t, &Head{Name: Var("p")}, &Head{Name: Var("p")})
+	assertHeadsEqual(t, &Head{Ref: Ref{VarTerm("p"), StringTerm("r")}}, &Head{Ref: Ref{VarTerm("p"), StringTerm("r")}}) // TODO: string for first section


Is StringTerm something we still want to support for the first term?

I've changed my mind 😅

johanfylling · 2022-05-17T12:21:17Z

test/cases/testdata/refheads/test-refs-as-rule-heads.yaml

+
+    p {
+      q == 0
+      data.test.a.b.c.p == 1


given that q can be accessed directly, without the data.test. prefix, within the same module; it's easy to assume that a.b.c.p could be too. But this is not the case, right?

But this is not the case, right?

Exactly. I think this is in line with how we deal with multiple modules: when there's a mod with package a.b.c and one with in package a, the latter's rego can't use b.c.d to reference rule d in package a.b.c either. But we could discuss this decision!

IMO, it makes sense to be able to directly reference something in the same module/file as where it was declared. I can't say for certain, but I think a lot of people would try to ref a.b.c.p before they tried data.test.a.b.c.p.

Someone could of course declare an object in that file a.b := {"c": {"p": 42}}, which would make q := a.b.c.p ambiguous .. But I don't think the data.<module package> prefix of calling q := data.test.a.b.c.p instead actually helps with that, right?

I'll double-check tomorrow, but some cases are captured by the conflict check.

You would be able to do that if you did

package test import data.test.a a.b.c.p = 1 p { a.b.c.p == 1 }

...but this never occurred to me before. I'll have to see if it works or just breaks. 🔍

Works as-is:

$ opa eval -fpretty -d x.rego data { "test": { "a": { "b": { "c": { "p": 1 } } }, "p": true } }```

My expectation was that a would be visible inside of package test just like any other rule/variable. I realize that the implementation means it doesn't work this way but I think we should change that.

package test # these should behave the same x := {"foo": {"bar": {"baz": 7}}} y["foo"] := {"bar": {"baz": 7}} z.foo.bar.baz := 7 p { x.foo.bar.baz == 7 y.foo.bar.baz == 7 z.foo.bar.baz == 7 # currently you can't say this, but IMO, that's confusing }

I'd rather not require users add an import for this to work. I'll take a look at the implementation.

👍 makes sense. Also the same is true for x.foo.bar and x.foo and x (y, z respectively), where the scope of resolving z.foo or z.foo.bar would be broader, as it might include other files' and modules' contributions to the tree.

tsandall · 2022-05-20T22:34:55Z

The format package seems to be mishandling rule names that are refs:

torin:~/scratch/dotted-heads$ cat ~/x.rego
package x

foo.bar.baz := 7
torin:~/scratch/dotted-heads$ opa fmt ~/x.rego
package x

baz := 7

tsandall · 2022-05-21T16:15:32Z

docs/content/policy-reference.md

+For reasons of backwards-compatibility, partial sets containing _strings_ need
+to use variables, i.e.
+
+```live:rules/ref_heads/ps_complete:module:read_only
+fruit.box["apples"] { true }
+```
+
+defines a _complete document rule_ `fruit.box.apples` with value `true`. To define
+a partial set rule with a reference head, use
+
+```live:rules/ref_heads/ps_set_of_strings:module:read_only
+fruit.box[x] { x := "apples" }
+```


I hadn't thought about the ambiguity between partial sets and the new syntax. I think we need to come up with a different solution to this because I would not want to preclude us from supporting variables in the refs in the future... e.g., foo.bar[x].baz[y] { ... } should define a nested object structure:

foo = { "bar": {x0: {"baz": {y0: true, y1: true, ..., yN: true}, x1: {...}, ..., xM: {...}}} }

srenatus · 2022-06-21T12:49:51Z

E_TOOMANYPRS 😉 I'll re-open this one when the preliminary stuff is in: #4776.

Nothing consequential here, just changing the structure of the tests. Now, they each only have the imports they need, and locals for each expectation start from zero. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

To know that it's the right path, we need to look up the rule refs before consulting the indexer. We'll do that by using evalVirtual, and using the rule tree for checking head ref groundness. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

This change allows rules to have string prefixes in their heads -- we've come to call them "ref heads". String prefixes means that where before, you had package a.b.c allow = true you can now have package a b.c.allow = true This allows for more concise policies, and different ways to structure larger rule corpuses. Backwards-compatibility: - There are code paths that accept ast.Module structs that don't necessarily come from the parser -- so we're backfilling the rule's Head.Reference field from the Name when it's not present. This is exposed through (Head).Ref() which always returns a Ref. This also affects the `opa parse` "pretty" output: With x.rego as package x import future.keywords a.b.c.d if true e[x] if true we get $ opa parse x rego module package ref data "x" import ref future "keywords" rule head ref a "b" "c" "d" true body expr index=0 true rule head ref e x true body expr index=0 true Note that Name: e Key: x becomes Reference: e[x] in the output above (since that's how we're parsing it, back-compat edge cases aside) - One special case for backcompat is `p[x] { ... }`: rule | ref | key | value | name ------------------------+-------+-----+-------+----- p[x] { ... } | p | x | nil | "p" p contains x if { ... } | p | x | nil | "p" p[x] if { ... } | p[x] | nil | true | "" For interpreting a rule, we now have the following procedure: 1. if it has a Key, it's a multi-value rule; and its Ref defines the set: Head{Key: x, Ref: p} ~> p is a set ^-- we'd get this from `p contains x if true` or `p[x] { true }` (back compat) 2. if it has a Value, it's a single-value rule; its Ref may contain vars: Head{Ref: p.q.r[s], Value: 12} ~> body determines s, `p.q.r.[s]` is 12 ^-- we'd get this from `p.q.r[s] = 12 { s := "whatever" }` Head{Key: x, Ref: p[x], Value: 3} ~> `p[x]` has value 3, `x` is determined by the rule body ^-- we'd get this from `p[x] = 3 if x := 2` or `p[x] = 3 { x := 2 }` (back compat) Here, the Key isn't used, it's present for backwards compatibility: for ref- less rule heads, `p[x] = 3` used to be a partial object: key x, value 3, name "p" - The destinction between complete rules and partial object rules disappears. They're both single-value rules now. - We're now outputting the refs of the rules completely in error messages, as it's hard to make sense of "rule r" when there's rule r in package a.b.c and rule b.c.r in package a. Restrictions/next steps: - Support for ref head rules in the REPL is pretty poor so far. Anything that works does so rather accidentally. You should be able to work with policies that contain ref heads, but you cannot interactively define them. This is because before, we'd looked at REPL input like p.foo.bar = true and noticed that it cannot be a rule, so it's got to be a query. This is no longer the case with ref heads. - Currently vars in Refs are only allowed in the last position. This is expected to change in the future. - Also, for multi-value rules, we can not have a var at all -- so the following isn't supported yet: p.q.r[s] contains t if { ... } ----- Most of the work happens when the RuleTree is derived from the ModuleTree -- in the RuleTree, it doesn't matter if a rule was `p` in `package a.b.c` or `b.c.p` in `package a`. As such, the planner and wasm compiler hasn't seen that many adaptations: - We're putting rules into the ruletree _including_ the var parts, so p.q.a = 1 p.q.[x] = 2 { x := "b" } end up in two different leaves: p `-> q `-> a = 1 `-> [x] = 2` - When planing a ref, we're checking if a rule tree node's children have var keys, and plan "one level higher" accordingly: Both sets of rules, p.q.a and p.q[x] will be planned into one function (same as before); and accordingly return an object {"a": 1, "b": 2} - When we don't have vars in the last ref part, we'll end up planning the rules separately. This will have an effect on the IR. p.q = 1 p.r = 2 Before, these would have been one function; now, it's two. As a result, in Wasm, some "object insertion" conflicts can become "var assignment conflicts", but that's in line with the now-new view of "multi-value" and "single-value" rules, not partial {set/obj} vs complete. * planner: only check ref.GroundPrefix() for optimizations In a previous commit, we've only mapped p.q.r[7] as p.q.r; and as such, also need to lookup the ref p.q.r[__local0__] via p.q.r (I think. Full disclosure: there might be edge cases here that are unaccounted for, but right now, I'm aiming for making the existing tests green...) New compiler stage: In the compiler, we're having a new early rewriting step to ensure that the RuleTree's keys are comparible. They're ast.Value, but some of them cause us grief: - ast.Object cannot be compared structurally; so _, ok := map[ast.Value]bool{ast.NewObject([2]*ast.Term{ast.StringTerm("foo"), ast.StringTerm("bar")}): true}[ast.NewObject([2]*ast.Term{ast.StringTerm("foo"), ast.StringTerm("bar")})] `ok` will never be true here. - ast.Ref is a slice type, not hashable, so adding that to the RuleTree would cause a runtime panic: p[y.z] { y := input } is now rewritten to p[__local0__] { y := input; __local0__ := y.z } This required moving the InitLocalVarGen stage up the chain, but as it's still below ResolveRefs, we should be OK. As a consequence, we've had to adapt `oracle` to cope with that rewriting: 1. The compiler rewrites rule head refs early because the rule tree expects only simple vars, no refs, in rule head refs. So `p[x.y]` becomes `p[local] { local = x.y }` 2. The oracle circles in on the node it's finding the definition for based on source location, and the logic for doing that depends on unaltered modules. So here, (2.) is relaxed: the logic for building the lookup node stack can now cope with generated statements that have been appended to the rule bodies. There is a peculiarity about ref rules and extents: See the added tests: having a ref rule implies that we get an empty object in the full extent: package p foo.bar if false makes the extent of data.p: {"foo": {}} This is somewhat odd, but also follows from the behaviour we have right now with empty modules: package p.foo bar if false this also gives data.p the extent {"foo": {}}. This could be worked around by recording, in the rule tree, when a node was added because it's an intermediary with no values, but only children. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com> Signed-off-by: Byron Lagrone <byron.lagrone@seqster.com>

With the introduction of ref heads in open-policy-agent#4660, the planned IR still mostly worked, but it was bypassing the CallDynamic optimization when it shouldn't have. This commit re-works some of the rule planning to more robustly handle ref heads. Also adds a few test cases to get a grip on what should and should not happen. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

With the introduction of ref heads in #4660, the planned IR still mostly worked, but it was bypassing the CallDynamic optimization when it shouldn't have. This commit re-works some of the rule planning to more robustly handle ref heads. Also adds a few test cases to get a grip on what should and should not happen. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

This commit fixes a panic from a utility function in the `opa test` codepath. After the ref-heads change in open-policy-agent#4660, this particular function could be fed a ref that it didn't know how to work with, such as the innocuous line `a[0] := 1`, and it would then panic. This was fixed by returning a dummy value instead of panic'ing. Fixes: open-policy-agent#5496 Signed-off-by: Philip Conrad <philipaconrad@gmail.com>

This commit fixes a panic from a utility function in the `opa test` codepath. After the ref-heads change in #4660, this particular function could be fed a ref that it didn't know how to work with, such as from the innocuous line `a[0] := 1`, and it would then panic. This was fixed by returning a dummy value instead of panic'ing. Fixes: #5496 Signed-off-by: Philip Conrad <philipaconrad@gmail.com>

…gent#5497) This commit fixes a panic from a utility function in the `opa test` codepath. After the ref-heads change in open-policy-agent#4660, this particular function could be fed a ref that it didn't know how to work with, such as from the innocuous line `a[0] := 1`, and it would then panic. This was fixed by returning a dummy value instead of panic'ing. Fixes: open-policy-agent#5496 Signed-off-by: Philip Conrad <philipaconrad@gmail.com> (cherry picked from commit a08934d)

This commit fixes a panic from a utility function in the `opa test` codepath. After the ref-heads change in #4660, this particular function could be fed a ref that it didn't know how to work with, such as from the innocuous line `a[0] := 1`, and it would then panic. This was fixed by returning a dummy value instead of panic'ing. Fixes: #5496 Signed-off-by: Philip Conrad <philipaconrad@gmail.com> (cherry picked from commit a08934d)

srenatus force-pushed the sr/ast/dots-in-the-head branch 5 times, most recently from 5ab50c2 to dfdd89e Compare May 10, 2022 09:42

srenatus commented May 10, 2022

View reviewed changes

ast/annotations_test.go Show resolved Hide resolved

srenatus force-pushed the sr/ast/dots-in-the-head branch 2 times, most recently from d82a13a to bc01cd0 Compare May 12, 2022 12:34

srenatus marked this pull request as ready for review May 12, 2022 12:51

carabasdaniel previously approved these changes May 13, 2022

View reviewed changes

srenatus mentioned this pull request May 16, 2022

docs: Grammar definition does not allow assignment or unification in query #4691

Closed

johanfylling previously approved these changes May 17, 2022

View reviewed changes

srenatus dismissed stale reviews from johanfylling and carabasdaniel via 7824d1e May 18, 2022 09:34

srenatus force-pushed the sr/ast/dots-in-the-head branch from bc01cd0 to 7824d1e Compare May 18, 2022 09:34

tsandall reviewed May 21, 2022

View reviewed changes

srenatus mentioned this pull request Jun 9, 2022

ast/compile: fix util.HashMap eq comparison #4759

Merged

srenatus marked this pull request as draft June 9, 2022 12:31

srenatus closed this Jun 21, 2022

srenatus reopened this Jun 22, 2022

srenatus force-pushed the sr/ast/dots-in-the-head branch 4 times, most recently from 422bbc9 to f3328bf Compare June 28, 2022 20:00

srenatus force-pushed the sr/ast/dots-in-the-head branch 3 times, most recently from d5e1e19 to 5102cff Compare July 1, 2022 09:10

srenatus force-pushed the sr/ast/dots-in-the-head branch from 5102cff to 9d9b856 Compare July 14, 2022 07:50

srenatus added 6 commits October 10, 2022 11:05

ast/compile_test: rewrite TestCompilerRewriteTermsInHead

d2546d9

Nothing consequential here, just changing the structure of the tests. Now, they each only have the imports they need, and locals for each expectation start from zero. Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

ast/parser: ensure invariant: genuine ref rules have no (Head).Name

6e42c8a

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

ast/policy: resolve TODO, (Head).String() for MV rules

dce3eca

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

ast/parser_test: adjust rebase

e84705b

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

ast/compile: rewrite so that objects don't become rule tree keys

b5e6177

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>

srenatus force-pushed the sr/ast/dots-in-the-head branch from f9fa5f6 to b5e6177 Compare October 10, 2022 09:05

srenatus added 2 commits October 13, 2022 15:28

Merge branch 'main' into sr/ast/dots-in-the-head

fd452ec

Merge branch 'main' into sr/ast/dots-in-the-head

5934efa

srenatus merged commit 965301f into open-policy-agent:main Oct 14, 2022

srenatus deleted the sr/ast/dots-in-the-head branch October 14, 2022 08:16

This was referenced Oct 14, 2022

ref heads: generic refs #5247

Closed

policy files: make package optional #5248

Open

planner: non-string ref keys break #5252

Closed

srenatus mentioned this pull request Nov 28, 2022

planner: fix ref heads processing #5418

Merged

philipaconrad mentioned this pull request Dec 21, 2022

tester/runner: Fix panic'ing case in utility function. #5497

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ast: support dotted heads #4660

ast: support dotted heads #4660

srenatus commented May 4, 2022 •

edited

Loading

carabasdaniel left a comment

johanfylling left a comment

johanfylling May 17, 2022

srenatus May 18, 2022

johanfylling May 18, 2022

johanfylling May 17, 2022

srenatus Sep 23, 2022

johanfylling May 17, 2022

johanfylling May 17, 2022

srenatus May 17, 2022

johanfylling May 17, 2022

srenatus May 17, 2022 •

edited

Loading

johanfylling May 17, 2022

srenatus Jun 9, 2022

johanfylling May 17, 2022

srenatus May 17, 2022

johanfylling May 17, 2022

srenatus May 17, 2022

johanfylling May 17, 2022

srenatus May 17, 2022

srenatus May 17, 2022

srenatus May 18, 2022

tsandall May 21, 2022

srenatus May 23, 2022

tsandall commented May 20, 2022

tsandall May 21, 2022

srenatus commented Jun 21, 2022

		assertHeadsEqual(t, &Head{Name: Var("p")}, &Head{Name: Var("p")})
		assertHeadsEqual(t, &Head{Ref: Ref{VarTerm("p"), StringTerm("r")}}, &Head{Ref: Ref{VarTerm("p"), StringTerm("r")}}) // TODO: string for first section

ast: support dotted heads #4660

ast: support dotted heads #4660

Conversation

srenatus commented May 4, 2022 • edited Loading

new TODOs 🏗️

carabasdaniel left a comment

Choose a reason for hiding this comment

johanfylling left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

srenatus May 17, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsandall commented May 20, 2022

Choose a reason for hiding this comment

srenatus commented Jun 21, 2022

srenatus commented May 4, 2022 •

edited

Loading

srenatus May 17, 2022 •

edited

Loading