@ndodda-amazon
Contributor

Description of changes:

At each step, we need to determine whether the profiler config JSON has changed and, if so, reload the profiler config. Currently, we load the JSON into memory and compare the file contents to decide whether the profiler config should be reloaded. This may hurt performance at scale because a JSON object is loaded into memory at every step.

This change replaces that check with an inspection of the file metadata for the last modified time. If the last modified time has changed, the file has changed and we should reload the profiler config. This is done without loading the JSON into memory: the tests verify that the config file is not read into memory if the file has not been modified.
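
The check described above can be sketched roughly as follows. This is a minimal illustration, not the smdebug implementation: the class name `ConfigReloader`, its attributes, and the `load_count` counter are hypothetical and exist only to make the behavior observable.

```python
import json
import os


class ConfigReloader:
    """Hypothetical stand-in for the profiler config parser (not the smdebug class)."""

    def __init__(self, config_path):
        self.config_path = config_path
        self.config = None
        self.last_modified_time = None
        self.load_count = 0  # number of times the JSON was actually parsed
        self.load_config()

    def load_config(self):
        """Parse the JSON only if the file's modification time has changed.

        Returns True if the file was parsed, False if the cached config was kept.
        """
        # os.stat reads file metadata only; the file contents are not opened.
        modified_time = os.stat(self.config_path).st_mtime_ns
        if modified_time == self.last_modified_time:
            return False
        with open(self.config_path) as f:
            self.config = json.load(f)
        self.last_modified_time = modified_time
        self.load_count += 1
        return True
```

Calling `load_config()` at every step then costs one `stat` call unless the file was rewritten in the meantime.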

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov-io commented Mar 16, 2021

Codecov Report

Merging #463 (70b65d0) into master (433348d) will decrease coverage by 15.49%.
The diff coverage is 100.00%.

❗ Current head 70b65d0 differs from pull request most recent head 76a6cab. Consider uploading reports for the commit 76a6cab to get more accurate results

@@             Coverage Diff             @@
##           master     #463       +/-   ##
===========================================
- Coverage   65.62%   50.13%   -15.50%     
===========================================
  Files         172      162       -10     
  Lines       13260    12919      -341     
===========================================
- Hits         8702     6477     -2225     
- Misses       4558     6442     +1884     
Impacted Files                                            Coverage Δ
smdebug/profiler/profiler_config_parser.py                68.66% <100.00%> (-15.80%) ⬇️
smdebug/profiler/utils.py                                 26.66% <100.00%> (-45.56%) ⬇️
smdebug/pytorch/__init__.py                               0.00% <0.00%> (-100.00%) ⬇️
smdebug/pytorch/singleton_utils.py                        0.00% <0.00%> (-100.00%) ⬇️
smdebug/profiler/analysis/utils/merge_timelines.py        0.00% <0.00%> (-97.52%) ⬇️
...ug/profiler/analysis/utils/pandas_data_analysis.py     0.00% <0.00%> (-97.37%) ⬇️
smdebug/profiler/analysis/python_stats_reader.py          0.00% <0.00%> (-94.29%) ⬇️
...debug/profiler/analysis/python_profile_analysis.py     0.00% <0.00%> (-90.91%) ⬇️
smdebug/pytorch/collection.py                             0.00% <0.00%> (-90.00%) ⬇️
...er/analysis/utils/python_profile_analysis_utils.py     0.00% <0.00%> (-88.41%) ⬇️
... and 49 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 433348d...76a6cab.

assert case_insensitive_profiler_config_parser.config.detailed_profiling_config.num_steps == 3


def test_update_step_profiler_config_parser(
Contributor

Can you run this test inside a loop to ensure that we do not have flaky behavior?

Contributor Author

It would be better to simply use pytest.mark.parametrize so that the test body doesn't change; we just run it multiple times and ensure it passes each time.

How many times do you suggest the test be run? 3?
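
For illustration, repeating a test with `pytest.mark.parametrize` could look roughly like this. The test name, the unused `run` parameter, and the trivial body are placeholders, not the PR's actual test; the repeat count of 3 is taken from the question above.

```python
import os
import tempfile

import pytest


# The "run" argument is unused; it only makes pytest execute the test three times.
@pytest.mark.parametrize("run", range(3))
def test_mtime_is_stable_without_writes(run):
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "profiler_config.json")
        with open(path, "w") as f:
            f.write("{}")
        # With no write in between, two metadata reads should agree.
        assert os.path.getmtime(path) == os.path.getmtime(path)
```

Each repetition is reported as a separate test case, so a single flaky run shows up on its own.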

Contributor Author

@ndodda-amazon ndodda-amazon Mar 17, 2021

Also, first and foremost, can you explain why this behavior might be flaky? The workflow seems to be clear-cut:

  1. Set up the profiler config JSON
  2. Set up the profiler config parser object (which automatically loads the config)
  3. Load the config again
  4. Verify the last modified time has not changed
  5. Replace the config JSON
  6. Load the config again
  7. Verify the last modified time has changed

TL;DR: the calls to load_config are controlled, so we know exactly which JSON is at the path before loading the config each time.
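
The seven steps above can be sketched as a self-contained check. Everything here is an assumption made for illustration: `CachedConfig` is a hypothetical loader rather than the smdebug parser, the `num_steps` values are made up, and setting the modification time with `os.utime` is a choice of this sketch so that step 7 does not depend on the filesystem's timestamp resolution.

```python
import builtins
import json
import os
import tempfile
from unittest import mock


class CachedConfig:
    """Hypothetical loader that re-reads the file only when its mtime changes."""

    def __init__(self, path):
        self.path = path
        self.last_modified_time = None
        self.config = None
        self.load_config()

    def load_config(self):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self.last_modified_time:
            with open(self.path) as f:
                self.config = json.load(f)
            self.last_modified_time = mtime


def run_workflow():
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "profiler_config.json")
        # 1. Set up the profiler config JSON.
        with open(path, "w") as f:
            json.dump({"num_steps": 2}, f)
        # 2. Set up the parser object, which loads the config.
        parser = CachedConfig(path)
        first_mtime = parser.last_modified_time
        # 3-4. Load again: the file must not be opened and the recorded
        # modification time must be unchanged.
        with mock.patch("builtins.open", wraps=builtins.open) as spy:
            parser.load_config()
        assert spy.call_count == 0
        assert parser.last_modified_time == first_mtime
        # 5. Replace the config JSON, then set a later mtime explicitly.
        with open(path, "w") as f:
            json.dump({"num_steps": 3}, f)
        later = first_mtime + 1_000_000_000  # one second later, in nanoseconds
        os.utime(path, ns=(later, later))
        # 6-7. Load again: the modification time and the config must both change.
        parser.load_config()
        assert parser.last_modified_time != first_mtime
        return parser.config
```

The spy on `builtins.open` mirrors the PR description's claim that an unmodified file is never read into memory.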

"""
Get the last time that the file at the given filepath was modified, in the form of a datetime object.
"""
last_modified_time = Path(filepath).stat().st_mtime
Contributor

Can you explain in comments the reasons for choosing this implementation?
A simple link to documentation might also suffice.
https://docs.python.org/3/library/stat.html#stat.ST_MTIME

Contributor Author

@ndodda-amazon ndodda-amazon Mar 17, 2021

The advantage of relying on file metadata to determine whether a profiler config should be reloaded is stated in the PR description: we cut down on the number of times the JSON needs to be loaded into memory.

The choice of specifically using pathlib to get the file metadata was arbitrary - we can just as easily use os with os.path.getmtime(path). In fact, I probably will switch to os.path.getmtime(path) since it appears cleaner.
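
A small sketch of the two options mentioned above. The helper names are hypothetical. Both calls return the modification time as a float number of seconds since the epoch, so a conversion such as `datetime.fromtimestamp` is needed if a datetime object is wanted.

```python
import os
from datetime import datetime
from pathlib import Path


def mtime_via_pathlib(filepath):
    # Path.stat() returns an os.stat_result; st_mtime is a float timestamp.
    return Path(filepath).stat().st_mtime


def mtime_via_os(filepath):
    # os.path.getmtime returns the same float timestamp for the same file.
    return os.path.getmtime(filepath)


def as_datetime(timestamp):
    # Convert the float timestamp to a local-time datetime object.
    return datetime.fromtimestamp(timestamp)
```

Since both helpers read the same metadata field, switching between them should not change which reloads are triggered.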


# check that reloading the config when it has changed will update the config fields.
monkeypatch.setenv("SMPROFILER_CONFIG_PATH", new_step_profiler_config_parser_path)
shutil.copy(new_step_profiler_config_parser_path, step_profiler_config_parser_path)
Contributor Author

Tests are currently failing since this call has different behavior for Linux and Unix (for Linux, this copies the file metadata too).

Will need to fix this by finding a better API for copying files or writing the JSON manually.
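
One way to take the copy API out of the picture is to write the JSON directly, as suggested above. The following is only a sketch under stated assumptions: `write_config` is a hypothetical helper, and the explicit `os.utime` bump is an added safeguard of this sketch (for filesystems with coarse timestamp resolution), not something the PR describes.

```python
import json
import os


def write_config(path, config):
    """Write config as JSON to path and ensure the file's mtime moves forward."""
    previous = os.stat(path).st_mtime_ns if os.path.exists(path) else None
    with open(path, "w") as f:
        json.dump(config, f)
    # If the write landed within the same timestamp tick as the previous
    # version, push the modification time forward by one second.
    if previous is not None and os.stat(path).st_mtime_ns <= previous:
        bumped = previous + 1_000_000_000
        os.utime(path, ns=(bumped, bumped))
```

With this helper a test can replace the config and rely on the modification time being strictly later than before.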

@ndodda-amazon ndodda-amazon changed the title from "Use file metadata to determine whether profiler config should be reloaded." to "[WIP] Use file metadata to determine whether profiler config should be reloaded." Mar 17, 2021
@ndodda-amazon
Contributor Author

Closing for now: the root cause of the CI failures is flaky behavior on Linux, where the metadata of the profiler config JSON is not being updated. Will reopen once I've fixed this issue.

@ndodda-amazon ndodda-amazon reopened this Mar 17, 2021