[feat] Introduce permanent result storage and support performance results comparisons between runs #3227
Hello @vkarak, Thank you for updating! Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide. Comment last updated at 2024-08-28 13:30:22 UTC
Still testing the PR and going through it. Just some preliminary comments regarding the open issues for discussion:
I think both are good ideas. The second one in particular is useful, since it avoids going through the report manually to find it. I would like to add some more items to discuss:
Some general issues:
- Not sure how we would properly address this, but let's say you are comparing two runs and using `max` as aggregator. What would be the way to find the UUID of the session? Currently you have to manually look in the report, right?
- The same issue as with `--distribute` comes up with `--repeat`.
- In the performance report you now see only one entry for each test (in case of retries). This is a different behaviour from the past one and I think some people may prefer to see the individual results.
- We are no longer printing the title `======================= PERFORMANCE REPORT ----------------------------`
- I tried running two reframe runs in parallel and the database can get silently corrupted. Maybe not an immediate issue, but good to keep in mind.
From the first run I got:
┍━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┑
│ name                  │ pvar       │ pval     │ punit   │ pdiff   │ job_nodelist    │ result   │
┝━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━━━━━━━┿━━━━━━━━━━┥
│ Foorun %$repeat_no=1  │ extract_bw │ 0.175468 │ MB/s    │ -63.21% │ vpn-110.cscs.ch │ fail     │
├───────────────────────┼────────────┼──────────┼─────────┼─────────┼─────────────────┼──────────┤
│ Foorun %$repeat_no=0  │ extract_bw │ 2.33529  │ MB/s    │ -60.05% │ vpn-110.cscs.ch │ fail     │
├───────────────────────┼────────────┼──────────┼─────────┼─────────┼─────────────────┼──────────┤
│ Foorun2 %$repeat_no=1 │ copy_bw    │ 2.33805  │ MB/s    │ -74.89% │ vpn-110.cscs.ch │ fail     │
├───────────────────────┼────────────┼──────────┼─────────┼─────────┼─────────────────┼──────────┤
│ Foorun2 %$repeat_no=0 │ copy_bw    │ 3.36739  │ MB/s    │ -23.22% │ vpn-110.cscs.ch │ fail     │
┕━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┙
Current session stored with UUID: 20240703T161944
(printed the run_report as well: run-report-601.json)
The second:
┍━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┑
│ name    │ pvar       │ pval     │ punit   │ pdiff    │ job_nodelist    │ result   │
┝━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━━━━━━┿━━━━━━━━━━┥
│ Foorun2 │ copy_bw    │ 0.387238 │ MB/s    │ -95.21%  │ vpn-110.cscs.ch │ fail     │
├─────────┼────────────┼──────────┼─────────┼──────────┼─────────────────┼──────────┤
│ Foorun  │ extract_bw │ 5.07053  │ MB/s    │ +521.14% │ vpn-110.cscs.ch │ pass     │
┕━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┙
Current session stored with UUID: 20240703T161944
run report file: run-report-600.json
And then when I run with
--performance-compare="^20240703T161944/^20240703T161944/mean:+system+partition+environ/+result"
I get:
┍━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━┑
│ name                 │ pvar       │ pval     │ punit   │ pdiff   │ system   │ partition   │ environ   │ result   │
┝━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━━┿━━━━━━━━━━━┿━━━━━━━━━━┥
│ Foorun2              │ copy_bw    │ 0.387238 │ MB/s    │ +0.00%  │ generic  │ default     │ builtin   │ fail     │
├──────────────────────┼────────────┼──────────┼─────────┼─────────┼──────────┼─────────────┼───────────┼──────────┤
│ Foorun               │ extract_bw │ 5.07053  │ MB/s    │ +0.00%  │ generic  │ default     │ builtin   │ pass     │
├──────────────────────┼────────────┼──────────┼─────────┼─────────┼──────────┼─────────────┼───────────┼──────────┤
│ Foorun %$repeat_no=1 │ extract_bw │ 0.175468 │ MB/s    │ +0.00%  │ generic  │ default     │ builtin   │ fail     │
┕━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━┙
I think the
I think the solution to this naming problem should probably be independent of how we treat it in the results DB, but I agree that it needs fixing in this context.
This is now possible with the
If you add
Grouping by
Fixed.
Fixed, we are now using file locks to guard against race conditions.
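For readers unfamiliar with the approach, here is a minimal sketch of how a file lock can serialize concurrent writers to an SQLite results file. It is not the code from this PR: the sidecar `.lock` file, the `db_write_lock` and `store_report` helpers and the table layout are all assumptions made for illustration, and `fcntl` is available on POSIX systems only.

```python
import fcntl
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_write_lock(db_path):
    '''Hold an exclusive advisory lock while the database is updated.

    The lock is taken on a sidecar file (hypothetical naming) so that it
    does not interfere with SQLite's own locking of the database file.
    '''
    with open(db_path + '.lock', 'w') as lockfile:
        fcntl.flock(lockfile, fcntl.LOCK_EX)   # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lockfile, fcntl.LOCK_UN)


def store_report(db_path, session_uuid, json_blob):
    '''Store one report blob; concurrent writers are serialized by the lock.'''
    with db_write_lock(db_path):
        conn = sqlite3.connect(db_path)
        try:
            with conn:   # commits on success, rolls back on error
                conn.execute('CREATE TABLE IF NOT EXISTS sessions('
                             'uuid TEXT PRIMARY KEY, json_blob TEXT)')
                conn.execute('INSERT INTO sessions VALUES (?, ?)',
                             (session_uuid, json_blob))
        finally:
            conn.close()
```

With this pattern, a second process calling `store_report()` at the same time waits at `flock()` until the first one has committed.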
Hm I am getting a key error for both |
7be3dd2 to e9f34ef
Indeed, since these refer only to test case attributes at the moment.
A follow-up comment regarding this. Currently, extra columns are aggregated only for the base period and not for the target one. I see it as analyzing the current period and comparing it to a target one, so the extra information should refer to the base one. Users could simply swap the period arguments to
63285ba to fa078f4
Now this is supported with |
I would like to test it a bit more before we merge, but a couple of small comments.
97e1719 to d2e0844
c2a35ce to dd37f86
Co-authored-by: Theofilos Manitaras <manitaras@cscs.ch>
This PR enables performance result comparison natively in ReFrame.
New functionality
The `--performance-report` option is extended to accept an optional specifier that controls how the performance results of the current run will be compared to past results.
A new `--performance-compare` option is provided that allows ReFrame to compare the performance results of two past periods of time.
The syntax of both options is very similar:
Each time period is specified in the following simplified syntax:
Allowed timestamps must follow any of the formats below:
A time period can also be specified implicitly through a session UUID.
The session UUID is essentially the start timestamp of the run session in the format `%Y%m%dT%H%M%S%z` and is always printed at the end of the run session.
The session UUID is a unique identifier assigned automatically by ReFrame.
A session UUID can also be specified instead of a timestamp as `^<session_uuid>`, in which case the test cases of that particular session will be used.
An aggregation specifier follows the syntax:
Finally, the extra columns can be specified with the following syntax:
Note that the above EBNF forms are simplified in terms of the accepted column names and/or timestamps.
Note also that the accepted column names are the same as the log format specifiers but without the `check_` prefix.
Results are always aggregated by test name, performance variable name and performance variable unit and this cannot be changed. Users can only provide additional columns to group by.
Similarly, the output will contain all the group-by columns plus any additional columns specified by the user. For non-aggregated columns, the unique values across all samples will be combined as a single string using the `|` separator.
Examples
--performance-report=now:now-7d/mean:+system+partition+environment/+result
--performance-compare=^2df155f4-a6bc-443a-ab95-2b37b4b1b2e7/^d13d0498-ceee-4df2-971c-8a844af71b8a/median:+job_nodelist/+result
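To make the shape of these specifiers concrete, here is a small sketch that splits a `--performance-compare` value into its parts. It is inferred only from the examples in this thread, not taken from the PR code: `CompareSpec`, `parse_compare_spec` and `session_uuid` are hypothetical names, and the sketch assumes that time periods never contain a `/`.

```python
from dataclasses import dataclass, field


@dataclass
class CompareSpec:
    base: str                  # base period or ^<session_uuid>
    target: str                # target period or ^<session_uuid>
    aggr_fn: str               # e.g. 'mean', 'median', 'max'
    group_by: list = field(default_factory=list)
    extra_cols: list = field(default_factory=list)


def _split_cols(s):
    '''Split "+a+b" into ["a", "b"]; an empty string gives [].'''
    return [c for c in s.split('+') if c]


def session_uuid(period):
    '''Return the session UUID if the period is given as ^<uuid>, else None.'''
    return period[1:] if period.startswith('^') else None


def parse_compare_spec(spec):
    '''Parse "<base>/<target>/<aggr>:<cols>/<extra_cols>" (assumed layout).'''
    try:
        base, target, aggr, extra = spec.split('/')
    except ValueError:
        raise ValueError(f'invalid comparison spec: {spec!r}') from None

    aggr_fn, _, cols = aggr.partition(':')
    # Test name, variable name and unit are always part of the grouping
    group_by = ['name', 'pvar', 'punit'] + _split_cols(cols)
    return CompareSpec(base, target, aggr_fn, group_by, _split_cols(extra))
```

For instance, `parse_compare_spec('^A/^B/median:+job_nodelist/+result')` would group by `name`, `pvar`, `punit` and `job_nodelist` and carry `result` as an extra column.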
Key design concepts
Upon finishing running the tests and after generating the JSON run report, ReFrame stores the run results in a database. The full JSON report blob is stored and the test cases of the run are indexed by their `job_completion_time_unix`.
Querying results is purposefully kept simple and is fulfilled by two API calls to fetch all the test cases that have run in a certain period of time or in a specific session.
The obtained test cases are then grouped by the specified attributes and their performance values are aggregated using the aggregation function.
Finally, a 2D array with the table data is returned to the caller.
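The group-then-aggregate step can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: test cases are plain dicts, `aggregate` and `compare` are hypothetical helpers, the column order of the rows is arbitrary, and `pdiff` is assumed to be the relative difference of the base value with respect to the target value. As discussed in the conversation, extra columns are taken from the base period only.

```python
import statistics
from collections import defaultdict

AGGREGATORS = {'mean': statistics.mean, 'median': statistics.median,
               'min': min, 'max': max}


def aggregate(testcases, aggr_fn, group_by, extra_cols):
    '''Group test cases by `group_by` and aggregate their `pval`.'''
    fn = AGGREGATORS[aggr_fn]
    groups = defaultdict(list)
    for tc in testcases:
        groups[tuple(tc[c] for c in group_by)].append(tc)

    aggregated = {}
    for key, members in groups.items():
        pval = fn([tc['pval'] for tc in members])
        # Unique values of each extra column, joined with '|'
        extras = ['|'.join(dict.fromkeys(str(tc[c]) for tc in members))
                  for c in extra_cols]
        aggregated[key] = (pval, extras)

    return aggregated


def compare(base, target, aggr_fn, group_by, extra_cols):
    '''Return a 2D list: group columns, pval, pdiff, then extra columns.'''
    base_aggr = aggregate(base, aggr_fn, group_by, extra_cols)
    target_aggr = aggregate(target, aggr_fn, group_by, [])
    table = []
    for key, (pval, extras) in base_aggr.items():
        ref = target_aggr.get(key, (None, []))[0]
        if ref:
            pdiff = f'{100 * (pval - ref) / ref:+.2f}%'
        else:
            pdiff = None    # no matching (or zero) target value
        table.append([*key, pval, pdiff, *extras])

    return table
```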
New options for querying session and test case data:
- `--delete-stored-session <uuid>`: delete a stored session by its UUID
- `--describe-stored-session <uuid>`: return a detailed JSON record about a specific session
- `--describe-stored-testcases <session_uuid>|<testcase_uuid>|<time_period>`: return a detailed JSON record about a specific test case or about the test cases of the specific session or time period
- `--list-stored-sessions`: list brief information about all stored sessions
- `--list-stored-testcases <session_uuid>|<testcase_uuid>|<time_period>`: list brief information about a specific test case or about the test cases of the specific session or time period
Implementation details
The database file is placed inside the report file folder and is named `results.db`; this name cannot be changed.
Also, the report format has changed, so we bump its data version to 4.0. The `check_vars` and `check_params` are flattened and are stored inside the `testcase`.
Code organization
The existing reporting code is considerably refactored and obviously new code is added. Here is a summary of the structure:
- `reframe.frontend.statistics` is removed and the `TestStats` is deprived of any reporting functionality. Its definition is moved to `reframe.frontend.executors` and its only purpose is to hold statistics of the run as ReFrame runs the tests.
- `reframe.frontend.printer` gets all the responsibility for formatting and presenting the various reports (failure report, retry report, performance report etc.)
- `reframe.frontend.reporting` holds everything related to generating a run report and restoring a session from a run report. It also contains all the analytics functionality described above.
- `reframe.frontend.reporting.storage` provides an abstraction to the backend storage for storing run reports. It essentially provides the basic API for querying the test cases that have run in a certain period of time.
Finally, this PR fixes some coding style issues to make flake8 happy again.
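As a rough picture of what such a storage abstraction could look like, the sketch below keeps the full JSON blob per session in SQLite and indexes test cases by `job_completion_time_unix`. The class, method names, schema and the flat `testcases` key are assumptions for illustration only and are not claimed to match the PR's code or ReFrame's actual report layout.

```python
import json
import sqlite3


class ResultsStore:
    '''Toy results backend (hypothetical schema and method names).'''

    def __init__(self, path=':memory:'):
        self._conn = sqlite3.connect(path)
        with self._conn:
            self._conn.execute(
                'CREATE TABLE IF NOT EXISTS sessions('
                'uuid TEXT PRIMARY KEY, json_blob TEXT)')
            self._conn.execute(
                'CREATE TABLE IF NOT EXISTS testcases('
                'session_uuid TEXT, idx INTEGER, '
                'job_completion_time_unix REAL)')
            self._conn.execute(
                'CREATE INDEX IF NOT EXISTS tc_time '
                'ON testcases(job_completion_time_unix)')

    def store(self, uuid, report):
        '''Store the whole report blob and index its test cases by time.'''
        with self._conn:
            self._conn.execute('INSERT INTO sessions VALUES (?, ?)',
                               (uuid, json.dumps(report)))
            self._conn.executemany(
                'INSERT INTO testcases VALUES (?, ?, ?)',
                [(uuid, i, tc['job_completion_time_unix'])
                 for i, tc in enumerate(report['testcases'])])

    def fetch_testcases_time_period(self, ts_start, ts_end):
        '''Return the test cases that completed within [ts_start, ts_end].'''
        rows = self._conn.execute(
            'SELECT s.json_blob, t.idx FROM testcases t '
            'JOIN sessions s ON s.uuid = t.session_uuid '
            'WHERE t.job_completion_time_unix BETWEEN ? AND ? '
            'ORDER BY t.job_completion_time_unix',
            (ts_start, ts_end)).fetchall()
        return [json.loads(blob)['testcases'][idx] for blob, idx in rows]

    def fetch_testcases_from_session(self, uuid):
        '''Return all test cases of one stored session.'''
        row = self._conn.execute(
            'SELECT json_blob FROM sessions WHERE uuid = ?',
            (uuid,)).fetchone()
        return json.loads(row[0])['testcases'] if row else []
```

Keeping the blob intact and indexing only the completion time keeps the schema small, at the cost of decoding JSON on every query.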
Other changes
- `--timestamp-dirs` uses `%Y` instead of `%y`, which was a typo.
- Fixes `--timestamp-dirs`, which did not respect the respective environment variable. This was a limitation of the current implementation of `ArgumentParser`, so we had to enhance our own version to also support optional arguments with default values (i.e., `nargs='?', const='value'`).
Todos
- `--describe-stored-session`).
- `^` from session UUID specs in the command-line options
- `--list-stored-sessions` to accept an optional argument
Open issues to discuss
- Add the option to not store the JSON reports at all, as these are now stored in the database? This would require us to add an option to list and show old reports from the command line.
- We likely need an option to quickly list the stored sessions by UUID so as to be able to easily select two sessions.
Closes #2733.