Allow parameterized tests #2176

Open
anderseknert opened this issue Mar 9, 2020 · 19 comments

Comments

@anderseknert
Member

It would be very useful if the unit test system allowed for some type of simple "data driven" parameterization of tests, where one of the mocked inputs could be parameterized to avoid needless verbosity and possible copy-paste errors.

Expected Behavior

test_method_not_allowed[input.method] {
    not allow with input.method as ["POST", "DELETE", "PUT", "PATCH"]
}

(obviously made up example and syntax)

Actual Behavior

test_post_not_allowed {
    not allow with input.method as "POST"
}

test_delete_not_allowed {
    not allow with input.method as "DELETE"
}

test_put_not_allowed {
    not allow with input.method as "PUT"
}

test_patch_not_allowed {
    not allow with input.method as "PATCH"
}
@tsandall
Member

You could do this with a helper rule or comprehension today. For example:

methods := {"POST", "DELETE", "PUT", "PATCH"}

test_method_not_allowed {
  disallowed_methods == methods
}

disallowed_methods[m] {
  some m
  methods[m]
  not allow with input.method as m
}

It would be nice to know which methods failed. If the test framework were a bit smarter it could report diffs on the variables inside the test rules. For example:

data.example.test_method_not_allowed: FAIL (1ms)
   {"DELETE", "PUT"} == {"POST", "DELETE", "PUT", "PATCH"}
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's important to use a helper rule or comprehension though--this would not be correct:

# THIS IS NOT CORRECT. The test will pass if SOME `m` satisfies the query (not EVERY `m`)
test_method_not_allowed {
  some m
  methods[m]
  not allow with input.method as m
}
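
(For example, if allow were true for every method except POST, the body above would still succeed with m bound to "POST", and the test would incorrectly pass.)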

Once we have the every keyword (#2090) we could avoid the helper/comprehension.

@anderseknert
Member Author

Better visualization of diffs in assertions would indeed be great, and perhaps a feature request of its own :)

There's a notable difference between that and parameterization though: in your example you feed data into a single test and iterate over it within the test, whereas in "data driven" parameterized testing the data builds the tests. Given some way of annotating or marking a test as parameterized (in my made up example it was [input.method] after the test_ rule, to not stray too far from established syntax), a parameterized test of 4 methods would really expand into 4 distinct tests, and be shown as such on success too. This has the added benefit of knowing what is tested without 1) having a test failure and 2) reading the actual Rego tests to see what is asserted within the test cases. So to carry on the example from before, given:

test_method_not_allowed[input.method] {
    not allow with input.method as ["POST", "DELETE", "PUT", "PATCH"]
}

Running opa test -v . would give something like:

data.http.test_method_not_allowed["POST"]: PASS (686.3µs)
data.http.test_method_not_allowed["DELETE"]: PASS (467.1µs)
data.http.test_method_not_allowed["PUT"]: PASS (322.8µs)
data.http.test_method_not_allowed["PATCH"]: PASS (219.3µs)
--------------------------------------------------------------------------------
PASS: 4/4

@anderseknert
Member Author

Another case, albeit quite a different one, I've been looking at recently is piping Kubernetes resource lists into conftest and the like (kubectl get deployments -o yaml | conftest test -) with a corresponding policy like the one below:

deny[reason] {
    input.kind == "List"
    item := input.items[_]
    not item.metadata.labels.app
    reason := sprintf("%v does not have an app label", [item.metadata.name])
}
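
For reference, a minimal sketch of the input shape that rule expects (names made up, heavily trimmed compared to what kubectl actually emits):

{
  "kind": "List",
  "items": [
    {"metadata": {"name": "my-deployment", "labels": {}}}
  ]
}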

This works in that it will report all failures, but since it's from a single rule it will have no notion of tests that passed:

305 tests, 0 passed, 0 warnings, 305 failures, 0 exceptions

One alternative is of course to call conftest/OPA individually per object, but then that misses out on the total, plus of course the overhead of starting those applications hundreds of times.

If we could parameterize part of the input to be part of the reported rule name, each test could be counted. I should note though that this use case is still more in the "thinking out loud" department than the previous one.

@stale

stale bot commented Nov 22, 2021

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

@stale stale bot added the inactive label Nov 22, 2021
@anderseknert
Member Author

Circling back to this as I recently had a need for something like this again. BTW, the conftest issue mentioned above has since been solved, and what I wanted there is now possible.

I think that a simple extension to the test framework allowing partial rules to be used would be the most elegant, Rego-native way of dealing with this:

test_method_not_allowed[method] {
    some method in ["POST", "DELETE", "PUT", "PATCH"]
    not allow with input.method as method
}

All the test runner would need to do is keep track of the method reference and include it in the report on failures:

❯ opa test .
data.test.test_method_not_allowed["DELETE"]: FAIL (58.125µs)
--------------------------------------------------------------------------------
PASS: 3/4
FAIL: 1/4

As this is already how OPA deals with partial rules, it seems like it would be quite a simple addition 🤔
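
For reference, outside of the test runner a partial rule like that already evaluates to the set of values for which the body succeeded, so with the DELETE case failing as in the output above it would produce something like:

{"POST", "PUT", "PATCH"}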

@stale stale bot removed the inactive label Jan 25, 2022
@stale

stale bot commented Feb 24, 2022

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

@stale stale bot added the inactive label Feb 24, 2022
@c-thiel

c-thiel commented Mar 18, 2022

In case anybody else runs across this issue: this is now possible using the every keyword:
https://www.openpolicyagent.org/docs/latest/policy-language/#every-keyword

@stale stale bot removed the inactive label Mar 18, 2022
@anderseknert
Member Author

While every is great, it does not help with this issue. Iterating over a list of values was possible already before every, and it doesn't address the rest of the details described here.

@tsandall
Member

I think every could help a bit... for example:

test_method_not_supported {                     # test_method_not_supported is true if...
  every method in {"PUT", "PATCH", "DELETE"} {  #   for every method in ...
    not allow with input.method as method       #     not allow with ... is true
  }
}

If the condition attached to the every statement was false for ANY method, the test would fail--which would be identical to how it works in any other rule.

As far as providing better assertion failure reporting goes... I wonder how far we could get by plugging in all vars on the failed expression and then eliding elements in large collections... this approach would be nice regardless of whether every is used... we should do a survey of real-world unit tests and see how far we'd get with substituting vars...

data.test.test_method_not_allowed: FAIL
  x.rego:5: not allow with input.method as "PATCH"
  x.rego:5: not allow with input.method as "DELETE"

@anderseknert
Member Author

If the condition attached to the every statement was false for ANY method, the test would fail--which would be identical to how it works in any other rule.

Yeah - we shouldn't exit early in that case though, as we'd want to know if more than one assertion failed. But maybe we don't do that in tests.

The assertion failure reporting would be an improvement for sure!

@c-thiel

c-thiel commented Mar 22, 2022

@anderseknert you are right, that part is still open.

I just think that #2176 (comment) is already a huge improvement over #2176 (comment).

@anderseknert
Member Author

anderseknert commented Mar 22, 2022

Right - it's a cosmetic improvement for sure, but I'd argue that the part that is "still open" is the actual parameterization.

In some ways, every even makes this somewhat "worse", as the early exit optimization will ensure that evaluation will halt at the first failed condition, and there's no way to retrieve all the conditions that failed. If you'd want something like that you'd currently need to use a comprehension, and compare the result to the expected outcome.

EDIT - I just had another look at the original suggestion, and that too would be subject to early exit, so no difference in that vs. using every.

Using partial rules (as suggested before) for this would avoid early exit, and feels like an "OPA native" way of doing it. The problem with that approach is that we currently lack a way of differentiating successes from failures in a single "partial test", as we'd need a way to keep track of what's being iterated over, and count both failure conditions and successes.

With better assertion reporting, and disabling of early exit in tests, we'd get somewhere close to that using every as well. Might be confusing to have one behavior for tests vs. other policy though wrt early exit, but oh well, it's at least an alternative to consider :)

@stale

stale bot commented Apr 21, 2022

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

@stale stale bot added the inactive label Apr 21, 2022
@marcaurele

I second what @anderseknert says about the parameterized tests: the early exit and the output of the parameter used for each run are indeed important for this feature to be done right. IMHO the every keyword is only a workaround for now.

@marcaurele

In https://github.com/orgs/open-policy-agent/discussions/514 I described our current issue: trying to find a way to test all the rule possibilities to ensure business coverage of them. This parameterized feature is exactly what I was looking for.

@marcaurele

@tsandall / @c-thiel there is a major problem with using the every keyword in a single test. I have a test case where the list iterated by the every keyword has ~300 entries, which easily pushes the test over the 5s execution timeout. This means I have to customize the timeout value, and increase it the more entries we add to that long list, which makes it hard to keep the test suite healthy without extra maintenance.

SUMMARY
--------------------------------------------------------------------------------
/test/mab_test.rego:
data.authorization_test.test.test_mab_permitted: ERROR (5.001104217s)
  eval_cancel_error: caller cancelled query execution
data.authorization_test.test.test_mab_not_permitted: PASS (78.127613ms)
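
The customization referred to here is raising the test timeout per run, something like:

opa test --timeout 30s .

and then bumping it again as the list grows.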

@stale stale bot removed the inactive label Nov 10, 2023
@anderseknert
Member Author

anderseknert commented Nov 10, 2023

That seems like a bug @marcaurele 😅 could you provide a snippet showing what that code looks like?
(no need to provide all 300 cases, just one or two, and the logic where you run every)

@marcaurele

@anderseknert I opened a new bug in order to not mix it up with this feature request: #6400

@stale

stale bot commented Dec 10, 2023

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label Dec 10, 2023