Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --sample flag to eval running in profiler mode #3651

Closed
gabbifish opened this issue Jul 15, 2021 · 3 comments · Fixed by #3655
Closed

Add --sample flag to eval running in profiler mode #3651

gabbifish opened this issue Jul 15, 2021 · 3 comments · Fixed by #3655

Comments

@gabbifish
Copy link

Expected Behavior

opa eval --profile has been incredibly useful. Between multiple runs of the same opa eval --profile invocation, we've noticed variance of up to a factor of 2 between the same lines, and we would be interested in having sampling available to smooth out this variance.

Actual Behavior

Include a --sample flag for opa eval --profile, similar to the --sample flag for opa bench

Additional Info

@srenatus
Copy link
Contributor

similar to the --sample flag for opa bench

You got me confuse there...you probably mean --count, I think?

      --count int                      number of times to repeat each benchmark (default 1)

(That's from opa bench --help.)

@gabbifish
Copy link
Author

@srenatus ah, you're right----count is what I was referring to!

@srenatus
Copy link
Contributor

@gabbifish I've made a sketch for discussion purposes, #3655 -- I'd appreciate your input!

srenatus added a commit to srenatus/opa that referenced this issue Jul 21, 2021
`opa eval --count=# --profile` will now evaluate the query # times,
and display aggregated results for metrics and the expression-time
profile.

Before, `opa eval --profile` would yield output like this (JSON):

    "metrics": {
      "timer_rego_external_resolve_ns": 233,
      "timer_rego_load_files_ns": 138866,
      "timer_rego_module_compile_ns": 432654,
      "timer_rego_module_parse_ns": 81454,
      "timer_rego_query_compile_ns": 54892,
      "timer_rego_query_eval_ns": 135624,
      "timer_rego_query_parse_ns": 3886
    },
    "profile": [
      {
        "total_time_ns": 75705,
        "num_eval": 4,
        "num_redo": 2,
        "location": {
          "file": "t.rego",
          "row": 8,
          "col": 2
        }
      },

Now, both of these are replaced by "aggragated_" variants that
include some statistics:

    "aggregated_metrics": {
      "timer_rego_external_resolve_ns": {
        "75%": 1618.5,
        "90%": 1933.8000000000002,
        "95%": 1954,
        "99%": 1954,
        "99.9%": 1954,
        "99.99%": 1954,
        "count": 10,
        "max": 1954,
        "mean": 1137.1,
        "median": 1140,
        "min": 311,
        "stddev": 514.9278493148337
      },

and

    "aggregated_profile": [
      {
        "total_time_ns_stats": {
          "75%": 63369.75,
          "90%": 69523.5,
          "95%": 69691,
          "99%": 69691,
          "99.9%": 69691,
          "99.99%": 69691,
          "count": 10,
          "max": 69691,
          "mean": 39030.9,
          "median": 53758.5,
          "min": 3390,
          "stddev": 27635.790954666016
        },
        "num_eval": 1,
        "num_redo": 1,
        "location": {
          "file": "t.rego",
          "row": 9,
          "col": 2
        }
      }

The table format has been adjusted as well, and only displays a subset of
the statistics gathered:
Especially the 99.9 and 99.99 percentiles hardly matter for the sample
sizes I'd expect to be used here: they will almost certainly match "max".

The output is less unwieldy now:

    +------------------------------+---------+----------+---------------+----------------+---------------+
    |            METRIC            |   MIN   |   MAX    |     MEAN      |      90%       |      99%      |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    | timer_rego_load_files_ns     | 349969  | 2549399  | 1.4760619e+06 | 2.5312689e+06  | 2.549399e+06  |
    | timer_rego_module_compile_ns | 1087507 | 24537496 | 1.120074e+07  | 2.41699473e+07 | 2.4537496e+07 |
    | timer_rego_module_parse_ns   | 275531  | 1915263  | 1.126406e+06  | 1.9016968e+06  | 1.915263e+06  |
    | timer_rego_query_compile_ns  | 61663   | 64395    | 63062.5       | 64374.1        | 64395         |
    | timer_rego_query_eval_ns     | 161812  | 1198092  | 637754        | 1.1846622e+06  | 1.198092e+06  |
    | timer_rego_query_parse_ns    | 6078    | 6078     | 6078          | 6078           | 6078          |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    |   MIN    |     MAX     |    MEAN     |     90%     |     99%     | NUM EVAL | NUM REDO |    LOCATION     |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    | 43.875µs | 26.135469ms | 11.494512ms | 25.746215ms | 26.135469ms | 1        | 1        | data.rbac.allow |
    | 21.478µs | 211.461µs   | 98.102µs    | 205.72µs    | 211.461µs   | 1        | 1        | rbac.rego:13    |
    | 19.652µs | 123.537µs   | 73.161µs    | 122.75µs    | 123.537µs   | 1        | 1        | rbac.rego:40    |
    | 12.303µs | 117.277µs   | 61.59µs     | 116.733µs   | 117.277µs   | 2        | 1        | rbac.rego:50    |
    | 12.224µs | 93.214µs    | 51.289µs    | 92.217µs    | 93.214µs    | 1        | 1        | rbac.rego:44    |
    | 5.561µs  | 84.121µs    | 43.002µs    | 83.469µs    | 84.121µs    | 1        | 1        | rbac.rego:51    |
    | 5.56µs   | 71.712µs    | 36.545µs    | 71.158µs    | 71.712µs    | 1        | 0        | rbac.rego:45    |
    | 4.958µs  | 66.04µs     | 33.161µs    | 65.636µs    | 66.04µs     | 1        | 2        | rbac.rego:49    |
    | 4.326µs  | 65.836µs    | 30.461µs    | 65.083µs    | 65.836µs    | 1        | 1        | rbac.rego:6     |
    | 3.948µs  | 43.399µs    | 24.167µs    | 43.055µs    | 43.399µs    | 1        | 2        | rbac.rego:55    |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+

Fixes open-policy-agent#3651.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus added a commit that referenced this issue Jul 21, 2021
`opa eval --count=# --profile` will now evaluate the query # times,
and display aggregated results for metrics and the expression-time
profile.

Before, `opa eval --profile` would yield output like this (JSON):

    "metrics": {
      "timer_rego_external_resolve_ns": 233,
      "timer_rego_load_files_ns": 138866,
      "timer_rego_module_compile_ns": 432654,
      "timer_rego_module_parse_ns": 81454,
      "timer_rego_query_compile_ns": 54892,
      "timer_rego_query_eval_ns": 135624,
      "timer_rego_query_parse_ns": 3886
    },
    "profile": [
      {
        "total_time_ns": 75705,
        "num_eval": 4,
        "num_redo": 2,
        "location": {
          "file": "t.rego",
          "row": 8,
          "col": 2
        }
      },

Now, both of these are replaced by "aggragated_" variants that
include some statistics:

    "aggregated_metrics": {
      "timer_rego_external_resolve_ns": {
        "75%": 1618.5,
        "90%": 1933.8000000000002,
        "95%": 1954,
        "99%": 1954,
        "99.9%": 1954,
        "99.99%": 1954,
        "count": 10,
        "max": 1954,
        "mean": 1137.1,
        "median": 1140,
        "min": 311,
        "stddev": 514.9278493148337
      },

and

    "aggregated_profile": [
      {
        "total_time_ns_stats": {
          "75%": 63369.75,
          "90%": 69523.5,
          "95%": 69691,
          "99%": 69691,
          "99.9%": 69691,
          "99.99%": 69691,
          "count": 10,
          "max": 69691,
          "mean": 39030.9,
          "median": 53758.5,
          "min": 3390,
          "stddev": 27635.790954666016
        },
        "num_eval": 1,
        "num_redo": 1,
        "location": {
          "file": "t.rego",
          "row": 9,
          "col": 2
        }
      }

The table format has been adjusted as well, and only displays a subset of
the statistics gathered:
Especially the 99.9 and 99.99 percentiles hardly matter for the sample
sizes I'd expect to be used here: they will almost certainly match "max".

The output is less unwieldy now:

    +------------------------------+---------+----------+---------------+----------------+---------------+
    |            METRIC            |   MIN   |   MAX    |     MEAN      |      90%       |      99%      |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    | timer_rego_load_files_ns     | 349969  | 2549399  | 1.4760619e+06 | 2.5312689e+06  | 2.549399e+06  |
    | timer_rego_module_compile_ns | 1087507 | 24537496 | 1.120074e+07  | 2.41699473e+07 | 2.4537496e+07 |
    | timer_rego_module_parse_ns   | 275531  | 1915263  | 1.126406e+06  | 1.9016968e+06  | 1.915263e+06  |
    | timer_rego_query_compile_ns  | 61663   | 64395    | 63062.5       | 64374.1        | 64395         |
    | timer_rego_query_eval_ns     | 161812  | 1198092  | 637754        | 1.1846622e+06  | 1.198092e+06  |
    | timer_rego_query_parse_ns    | 6078    | 6078     | 6078          | 6078           | 6078          |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    |   MIN    |     MAX     |    MEAN     |     90%     |     99%     | NUM EVAL | NUM REDO |    LOCATION     |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    | 43.875µs | 26.135469ms | 11.494512ms | 25.746215ms | 26.135469ms | 1        | 1        | data.rbac.allow |
    | 21.478µs | 211.461µs   | 98.102µs    | 205.72µs    | 211.461µs   | 1        | 1        | rbac.rego:13    |
    | 19.652µs | 123.537µs   | 73.161µs    | 122.75µs    | 123.537µs   | 1        | 1        | rbac.rego:40    |
    | 12.303µs | 117.277µs   | 61.59µs     | 116.733µs   | 117.277µs   | 2        | 1        | rbac.rego:50    |
    | 12.224µs | 93.214µs    | 51.289µs    | 92.217µs    | 93.214µs    | 1        | 1        | rbac.rego:44    |
    | 5.561µs  | 84.121µs    | 43.002µs    | 83.469µs    | 84.121µs    | 1        | 1        | rbac.rego:51    |
    | 5.56µs   | 71.712µs    | 36.545µs    | 71.158µs    | 71.712µs    | 1        | 0        | rbac.rego:45    |
    | 4.958µs  | 66.04µs     | 33.161µs    | 65.636µs    | 66.04µs     | 1        | 2        | rbac.rego:49    |
    | 4.326µs  | 65.836µs    | 30.461µs    | 65.083µs    | 65.836µs    | 1        | 1        | rbac.rego:6     |
    | 3.948µs  | 43.399µs    | 24.167µs    | 43.055µs    | 43.399µs    | 1        | 2        | rbac.rego:55    |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+

Fixes #3651.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
dolevf pushed a commit to dolevf/opa that referenced this issue Nov 4, 2021
`opa eval --count=# --profile` will now evaluate the query # times,
and display aggregated results for metrics and the expression-time
profile.

Before, `opa eval --profile` would yield output like this (JSON):

    "metrics": {
      "timer_rego_external_resolve_ns": 233,
      "timer_rego_load_files_ns": 138866,
      "timer_rego_module_compile_ns": 432654,
      "timer_rego_module_parse_ns": 81454,
      "timer_rego_query_compile_ns": 54892,
      "timer_rego_query_eval_ns": 135624,
      "timer_rego_query_parse_ns": 3886
    },
    "profile": [
      {
        "total_time_ns": 75705,
        "num_eval": 4,
        "num_redo": 2,
        "location": {
          "file": "t.rego",
          "row": 8,
          "col": 2
        }
      },

Now, both of these are replaced by "aggragated_" variants that
include some statistics:

    "aggregated_metrics": {
      "timer_rego_external_resolve_ns": {
        "75%": 1618.5,
        "90%": 1933.8000000000002,
        "95%": 1954,
        "99%": 1954,
        "99.9%": 1954,
        "99.99%": 1954,
        "count": 10,
        "max": 1954,
        "mean": 1137.1,
        "median": 1140,
        "min": 311,
        "stddev": 514.9278493148337
      },

and

    "aggregated_profile": [
      {
        "total_time_ns_stats": {
          "75%": 63369.75,
          "90%": 69523.5,
          "95%": 69691,
          "99%": 69691,
          "99.9%": 69691,
          "99.99%": 69691,
          "count": 10,
          "max": 69691,
          "mean": 39030.9,
          "median": 53758.5,
          "min": 3390,
          "stddev": 27635.790954666016
        },
        "num_eval": 1,
        "num_redo": 1,
        "location": {
          "file": "t.rego",
          "row": 9,
          "col": 2
        }
      }

The table format has been adjusted as well, and only displays a subset of
the statistics gathered:
Especially the 99.9 and 99.99 percentiles hardly matter for the sample
sizes I'd expect to be used here: they will almost certainly match "max".

The output is less unwieldy now:

    +------------------------------+---------+----------+---------------+----------------+---------------+
    |            METRIC            |   MIN   |   MAX    |     MEAN      |      90%       |      99%      |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    | timer_rego_load_files_ns     | 349969  | 2549399  | 1.4760619e+06 | 2.5312689e+06  | 2.549399e+06  |
    | timer_rego_module_compile_ns | 1087507 | 24537496 | 1.120074e+07  | 2.41699473e+07 | 2.4537496e+07 |
    | timer_rego_module_parse_ns   | 275531  | 1915263  | 1.126406e+06  | 1.9016968e+06  | 1.915263e+06  |
    | timer_rego_query_compile_ns  | 61663   | 64395    | 63062.5       | 64374.1        | 64395         |
    | timer_rego_query_eval_ns     | 161812  | 1198092  | 637754        | 1.1846622e+06  | 1.198092e+06  |
    | timer_rego_query_parse_ns    | 6078    | 6078     | 6078          | 6078           | 6078          |
    +------------------------------+---------+----------+---------------+----------------+---------------+
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    |   MIN    |     MAX     |    MEAN     |     90%     |     99%     | NUM EVAL | NUM REDO |    LOCATION     |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+
    | 43.875µs | 26.135469ms | 11.494512ms | 25.746215ms | 26.135469ms | 1        | 1        | data.rbac.allow |
    | 21.478µs | 211.461µs   | 98.102µs    | 205.72µs    | 211.461µs   | 1        | 1        | rbac.rego:13    |
    | 19.652µs | 123.537µs   | 73.161µs    | 122.75µs    | 123.537µs   | 1        | 1        | rbac.rego:40    |
    | 12.303µs | 117.277µs   | 61.59µs     | 116.733µs   | 117.277µs   | 2        | 1        | rbac.rego:50    |
    | 12.224µs | 93.214µs    | 51.289µs    | 92.217µs    | 93.214µs    | 1        | 1        | rbac.rego:44    |
    | 5.561µs  | 84.121µs    | 43.002µs    | 83.469µs    | 84.121µs    | 1        | 1        | rbac.rego:51    |
    | 5.56µs   | 71.712µs    | 36.545µs    | 71.158µs    | 71.712µs    | 1        | 0        | rbac.rego:45    |
    | 4.958µs  | 66.04µs     | 33.161µs    | 65.636µs    | 66.04µs     | 1        | 2        | rbac.rego:49    |
    | 4.326µs  | 65.836µs    | 30.461µs    | 65.083µs    | 65.836µs    | 1        | 1        | rbac.rego:6     |
    | 3.948µs  | 43.399µs    | 24.167µs    | 43.055µs    | 43.399µs    | 1        | 2        | rbac.rego:55    |
    +----------+-------------+-------------+-------------+-------------+----------+----------+-----------------+

Fixes open-policy-agent#3651.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Signed-off-by: Dolev Farhi <farhi.dolev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants