Add print function #3319

anderseknert · 2021-03-26T10:15:51Z

One of the most asked questions on the OPA Slack is how to print values to console. While there are many, often better ways, of debugging Rego policies, the quick feedback loop and the zero ceremony offered by a print/println function is sometimes just the right tool for the job. Since such a function exists in most regular programming languages this is a development/debugging style adopted by many, and naturally they'll want to bring that to OPA and Rego. We should let them!

While there is some support for doing this via the trace built-in already, using it simply for printing values is kind of a hack, and having to learn all the peculiarities like required flags and output format around it as someone who is just learning Rego isn't a great introduction to the language or its tooling.

Some things to consider:

Name: print vs println, echo, others? 😄
Overloading (see Investigate whether function overloading can support different arities #1921), like print("test", x, 1, y).
Output: stdout vs. stderr and how to play nice with OPA logging / decision logging.
Allow client application to override behavior when using OPA as a library.
Deprecating trace.
(ecosystem) Rego playground routing print calls to console.log.
(possibly) how to deal with undefined values. Common source of confusion for beginners.

Related: #2957

The text was updated successfully, but these errors were encountered:

srenatus · 2021-03-26T10:25:25Z

💭 Does this issue supersede #2957?

srenatus · 2021-03-26T10:26:34Z

(ecosystem) Rego playground routing print calls to console.log.

Hrrm I'd rather have a "log output" pane in the playground, I think. Including the browser's dev tools as part of the UI doesn't seem right 😄 I suppose it could be collapsed by default, same as the data pane.

anderseknert · 2021-03-26T10:30:55Z

Hrrm I'd rather have a "log output" pane in the playground, I think. Including the browser's dev tools as part of the UI doesn't seem right 😄 I suppose it could be collapsed by default, same as the data pane.

Perhaps you are right :) As long as it's dealt with in some way!

yashtewari · 2021-03-30T15:45:28Z

Hi! Since the goal here is debugging, I thought it might be relevant to share a patch for this basic builtin our team has been using for debugging Rego running on our dev/staging OPA servers: yashtewari@040285a

log(object) simply adds an object to the decision log whenever it is evaluated, along with a timestamp and a code location. While this is different from the stated objective of printing to the console, it serves the use case of finding problems in Rego code that is "live" in a staging environment, for example.

The style/usage is analogous to the "print statement" style that you've described. Do you think an option to enrich the decision log can be added in a future print function?

We're happy to help and contribute!

tsandall · 2021-03-30T15:53:40Z

@yashtewari welcome back!

log(object) simply adds an object to the decision log whenever it is evaluated, along with a timestamp and a code location.

That's an interesting idea!

Do you think an option to enrich the decision log can be added in a future print function?

I think this relates to #1514 in that we need a way for values generated during evaluation to be included in the decision log.

We'll need to think about this a bit more.

anderseknert · 2021-03-30T16:29:26Z

Yeah, I like the idea of being able to enrich decisions with arbitrary data and I think a log function like that would be valuable. I don't think it necessarily needs to compete with the print built-in described here though. Ideally the print function should be as dumb as possible, requiring no configuration whatsoever and work in all kinds of environments where Rego runs (terminal, web browser, wasm, playground, etc).

yashtewari · 2021-03-31T07:04:20Z

Thanks a lot @tsandall!! And congratulations for OPA being cooler than ever 😁

I think this relates to #1514 in that we need a way for values generated during evaluation to be included in the decision log.

You mean in terms of implementation? Would be nice to see what you guys think about how log is done, maybe not in this thread. As I understand after looking at #1514, these are separate cases to include in the decision log -- 1) data generated during runtime that informs the decision and 2) arbitrary data that the policy author wants to specifically look at.

Ideally the print function should be as dumb as possible

Ah you're right, doing one thing well makes perfect sense. My thinking was that as a print debugger myself, I use log in just the same way that I would use print. Great to know that you like the idea, it's bare bones for now but we'll make it better over time. For example right now we haven't addressed caching behaviour, so there can be unnecessary evaluation and the decision log can get too big.

Thanks for your responses!

tsandall · 2021-09-14T16:28:40Z

Thinking about print support a bit more...

For the sake of progress:

Call the function/statement print.
Ignore formatting specifications for now (we could add another printf statement later if needed.)
Assume that embedders will append newlines to output (i.e., print isn't intended to be used to write arbitrary bytes to a stream... it's intended to be used to output/echo values so that humans can read them).
Assume that users will expect variadic argument support.
Ignore embedding concerns (stdout/stderr, etc.) for now (i.e., let's only deal with how print is handled by type system, compiler, planner, evaluator, etc. for now.)
Ignore trace() for now.

The main issue I see with print is how to handle failures. For example:

p {
  a := 1
  b := 2
  a == b  # if the body is evaluated in-order, this statement will FAIL and the next will never be reached
  print(a, b)
}

If you look at the rule body as an imperative sequence of statements (which is incorrect, but common) then you would expect to see some output because there are no conditional statements. Of course, in Rego, statements in the body of the rule are ANDed together and there is no ordering, so there is no need to execute subsequent statements once a failure is encountered.

If we assume that print arguments are treated the same way as other function calls, then rewriting could make this a common occurence:

p {
  print(input.foo)
}

Gets rewritten to:

p {
  _1 = input.foo
  print(_1)
}

If input.foo is undefined, print(_1) will never execute. Surely users will expect to see output from a rule like the second to last one above.

To deal with this situation, I propose we make print execute on FAIL and EXIT evaluation events. In other words, anytime the rule succeeds or fails, the print statement will be executed. Duplicate messages can be condensed for clarity. This leaves the treatment of undefined values. To deal with that, I propose we treat print like any other built-in function except after plugging the arguments, any unbound variables will be replaced with some value, for example:

p {
  print("the value of input.foo is", input.foo)
}

Given input as {} (i.e., "foo" is missing), the output would be:

the value of input.foo is <undefined>

timothyhinrichs · 2021-09-14T17:03:50Z

Seems like we're saying we want

no flattening of complex terms that are arguments to print()
print() to be a special function that evaluates even if its arguments are undefined. But the return value is undefined if any of its args are undefined.

Do we need some awareness of print() during body reordering so that the body can be reordered between print() statements but not across print() statements? Often people will want to use print() to know the location in the rule body that succeeded during execution. E.g. what do we do with this?

p {
   b := a.foo
   print("got here 1")
   a := input.bar
   print("got here 2")
}

anderseknert · 2021-09-14T18:03:37Z

@tsandall I totally agree on the problem of undefined values, and really like the idea of letting print still print on fail before exiting.
@timothyhinrichs under what conditions do we currently reorder the body? Is it safe to assume print is used for debugging only, and we can skip reordering if it's encountered in the body, to accommodate that use case you're describing?

timothyhinrichs · 2021-09-14T18:11:40Z

Example above is one where reordering will happen. We can't compute b until we compute a. OPA will reorder those rules so that it computes a and then b.

This is likely a corner-case, as I think more about it, because most people will write rules sensibly so they do not require reordering. Though if there are automated tools generating Rego code this could happen. Conceptually it would be fine if we ignored this case for now; however, if we later want to add restrictions that say reject rules like the one above then it would be a backwards incompatible change. Easier to relax restrictions than to add them.

anderseknert · 2021-09-14T18:27:40Z

Ah, gotcha! Completely missed the dependency there. At least to me it seems fine to just print in the order of evaluation in this case.

tsandall · 2021-09-14T21:44:06Z

Do we need some awareness of print() during body reordering so that the body can be reordered between print() statements but not across print() statements? Often people will want to use print() to know the location in the rule body that succeeded during execution. E.g. what do we do with this?

@timothyhinrichs this is interesting and it's not what I had in mind... I think there are two options:

As you described, print arguments can be undefined and print statements are undefined if any of the arguments are undefined. This way, evaluation stops after the print statement if any arguments are undefined. Ordering of the print statement is important. print statements can affect the outcome of evaluation.
print arguments can be undefined but print statements never affect evaluation. In other words, print statements are always executed when the rule exits or a failure occurs. Ordering of the print statement is not important.

If we go with (1) then print statements can be used to determine where evaluation stops. However, it can also cause evaluation to fail somewhat unexpectedly:

p {
  input.x == input.y
  printf("blah blah blah (z=%v)", input.z) # if input.z is undefined, p is undefined
}

Using print statements to discover where evaluation is stopping sounds useful but it's not exactly what I had in mind. What I had in mind was (2) because:

A) It gives users a quick-and-dirty alternative to inspect values during policy evaluation. Users don't have to care where the print statement is placed in the rule. Moreover, print statements can be completely ignored in production environments because they never affect the outcome of policy evaluation.

B) The implementation is quite simple because (i) arguments can be rewritten the same way as other calls (ii) there's no special reordering requirement for safety (iii) the evaluator just has to evaluate print statements on FAIL and EXIT events.

Any thoughts on this alternative option?

timothyhinrichs · 2021-09-18T00:23:53Z

Now I understand the original proposal. Love the idea of being able to ignore print()s semantically for prod. I suspected I was missing something. I'd guess that most users will be surprised that print ordering isn't respected. If we want to do that, it feels like we'd want a different syntax that's attached to the rule head somehow--perhaps something in the metadata for a rule? That would indicate more accurately what's going to happen.

Maybe the higher level bit is that over time we've added features that make reordering the rule body more surprising: := and == come to mind immediately. I would bet that most people would even expect to get compile errors if the rule body needs to be reordered for top-down eval. Reordering now feels like a perf optimization that's hidden from users, rather than something that users know about.

tsandall · 2021-10-14T18:37:08Z

Fixed by #3868

anderseknert added feature-request usability labels Mar 26, 2021

anderseknert mentioned this issue Jul 1, 2021

Log level debug #1971

Closed

tsandall changed the title ~~Built-ins: add print function~~ Add print function Sep 14, 2021

tsandall closed this as completed Oct 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add print function #3319

Add print function #3319

anderseknert commented Mar 26, 2021 •

edited

Loading

srenatus commented Mar 26, 2021

srenatus commented Mar 26, 2021

anderseknert commented Mar 26, 2021

yashtewari commented Mar 30, 2021 •

edited

Loading

tsandall commented Mar 30, 2021

anderseknert commented Mar 30, 2021 •

edited

Loading

yashtewari commented Mar 31, 2021 •

edited

Loading

tsandall commented Sep 14, 2021 •

edited

Loading

timothyhinrichs commented Sep 14, 2021

anderseknert commented Sep 14, 2021

timothyhinrichs commented Sep 14, 2021

anderseknert commented Sep 14, 2021

tsandall commented Sep 14, 2021

timothyhinrichs commented Sep 18, 2021

tsandall commented Oct 14, 2021

Add print function #3319

Add print function #3319

Comments

anderseknert commented Mar 26, 2021 • edited Loading

srenatus commented Mar 26, 2021

srenatus commented Mar 26, 2021

anderseknert commented Mar 26, 2021

yashtewari commented Mar 30, 2021 • edited Loading

tsandall commented Mar 30, 2021

anderseknert commented Mar 30, 2021 • edited Loading

yashtewari commented Mar 31, 2021 • edited Loading

tsandall commented Sep 14, 2021 • edited Loading

timothyhinrichs commented Sep 14, 2021

anderseknert commented Sep 14, 2021

timothyhinrichs commented Sep 14, 2021

anderseknert commented Sep 14, 2021

tsandall commented Sep 14, 2021

timothyhinrichs commented Sep 18, 2021

tsandall commented Oct 14, 2021

anderseknert commented Mar 26, 2021 •

edited

Loading

yashtewari commented Mar 30, 2021 •

edited

Loading

anderseknert commented Mar 30, 2021 •

edited

Loading

yashtewari commented Mar 31, 2021 •

edited

Loading

tsandall commented Sep 14, 2021 •

edited

Loading