-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simpler stats timing #1
Simpler stats timing #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the changes I would recommend to make this test file more concise, more comprehensible, and more functional.
I documented my thought process in comments – I hope you will find them useful in your learning
@@ -68,13 +65,11 @@ def store_dashboard_time(code_fragment_name: str, timer: ec_timer.Timer): | |||
metadata_key="stats/dashboard_time", | |||
name=code_fragment_name, | |||
ts=timestamp, | |||
reading=elapsed_ms | |||
reading=timer.elapsed_ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the source of emission.core.timer
, you'll see that it actually has elapsed_ms
as a property. No need to do the conversion yourself
@@ -1,32 +1,14 @@ | |||
# emission/tests/funcTests/TestFunctionTiming.py | |||
# rename/move this to emission/tests/storageTests/TestStatsQueries.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole point of this file is to test store_dashboard_time
and store_dashboard_error
, right?
And those functions come from emission.storage.decorations.stats_queries.py
.
So, this test file should be called TestStatsQueries
and it should go under emission.tests.storageTests
. Look at the other test files under emission.tests
and you will see they follow the same convention.
import emission.core.timer as ect | ||
import emission.storage.decorations.stats_queries as esdsq | ||
import emission.storage.timeseries.abstract_timeseries as esta | ||
import emission.tests.common as etc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted these to the convention we use for imports within emission
. We abbreviate from the full module path (including e
for emission
).
ect
instead of ec_timer
esdsq
instead of sdq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend listing the emission.*
imports alphabetically for tidiness
|
||
def test_single_function_timing(self): | ||
""" | ||
Test execution and timing of a single function. | ||
""" | ||
def sample_function(): | ||
logging.info("Executing sample_function") | ||
logging.debug("Executing sample_function") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use logging.info
sparingly; most of your less-important log statements should be logging.debug
logging.debug(f"Ensuring {len(expected_entries)} entries exist in DB.") | ||
key_list = [key for (key, _, _) in expected_entries] | ||
stored_entrys = list(self.timeseries_db.find_entries(key_list)) | ||
self.assertEqual(len(stored_entrys), len(expected_entries)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retrieving all entries with the relevant keys can be simplified to this
with ect.Timer() as timer_both: | ||
with ect.Timer() as timer_one: | ||
function_one() | ||
esdsq.store_dashboard_time("function_one", timer_one) | ||
|
||
logging.info("Step 2: Data processing") | ||
time.sleep(1) # Simulate processing time | ||
processed_data = {k: str(v).upper() for k, v in data.items()} | ||
logging.info(f"Processed data: {processed_data}") | ||
with ect.Timer() as timer_two: | ||
function_two() | ||
esdsq.store_dashboard_time("function_two", timer_two) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might have nested timers. This will be necessary if you want to time the "stages" of some larger operation, as well as the total time of the operation.
# function_one should've taken ~1000ms; function_two should've taken ~1500ms; | ||
# both functions together should've taken ~2500ms | ||
self.assertAlmostEqual(timer_one.elapsed_ms, 1000, delta=100) | ||
self.assertAlmostEqual(timer_two.elapsed_ms, 1500, delta=100) | ||
self.assertAlmostEqual(timer_both.elapsed_ms, 2500, delta=100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below, we will call verify_stats_entries
, which will make sure that the recorded entries line up with what the timers recorded.
But I also wanted to verify that what the timers recorded is actually reasonable. function_one
sleeps for 1s so it should finish in a little over 1000ms. I allowed 100ms tolerance by using assertAlmostEqual
with delta=100
. This was convenient because it saves us from writing out our own if
/ else
blocks to handle tolerance
Note that I didn't specifically know that delta
existed. I also couldn't remember the name for assertAlmostEqual
. But I googled "unittest close to" which led me to documentation for assertAlmostEqual
and I read about the delta
parameter there.
My point here is that you don't need to know everything, but if you learn where to find things you will save time and write cleaner code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we didnt need tolerance?
with ect.Timer() as timer: | ||
faulty_function() | ||
except ValueError as e: | ||
logging.info(f"Caught expected error: {e}") | ||
esdsq.store_dashboard_error('faulty_function', timer) | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass
prevents the test from failing
etc.configLogging() | ||
unittest.main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You had this before and I had mistakenly told you to remove it. I went back and looked at other files in emission.tests
and saw that they do have it, and they also call etc.configLogging()
which sets a logging format and the level to DEBUG
. So you won't need to configure the logger in setUpClass
def verify_stats_entries(self, expected_entries: list[tuple[str, str, float]]): | ||
""" | ||
Verifies that an entry for the given name exists under the specified key and matches the expected elapsed time. | ||
Verifies that each of these expected entries, in the form of (key, name, elapsed_ms), | ||
are stored in the database. | ||
|
||
Parameters: | ||
- key (str): The database key to query ("stats/dashboard_time" or "stats/dashboard_error"). | ||
- name (str): The name of the function or code block. | ||
- expected_elapsed_ms (float): The expected elapsed time in milliseconds. | ||
:param expected_entries: A list of tuples which have (key, name, expected_elapsed_ms). | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verify_entry
was good because it reduced the repeated lines of code that you had in a previous version.
However, this approach prevents you from checking for the existence of multiple entries, which is something Shankari asked for.
You need to be able to check for multiple expected entries, so I changed this to accept a list and renamed it to verify_stats_entries
After it queries the timeseries for relevant keys, it iterates and compares stored vs. expected values at each index
No description provided.