feat[next]: Add memory and disk-based caching to more workflow steps #1690
src/gt4py/next/config.py
@@ -73,6 +73,9 @@ def env_flag_to_bool(name: str, default: bool) -> bool:
    )

GTFN_SOURCE_CACHE_DIR: str = os.environ.get(f"{_PREFIX}_GTFN_SOURCE_CACHE_DIR", "gtfn_cache")
I would prefer to avoid exposing a config option that is special to a certain backend. Is it necessary?
We introduced this because we wanted to avoid hardcoding it. In almost all cases it should not change, but in the tests, for example, where the cache should be deleted afterwards, specifying a different directory would make sense.
Do you have a preferred way to make this configurable?
DaCe also has a cache directory; maybe we could use a common config option for compiled backends?
Yes, that sounds good to me. Do you agree @havogt?
And do you have a name in mind?
I originally proposed this, but after thinking about it again, and given my other suggestion about how to deal with this in tests, I think it's not needed, at least for this PR.
Ok, I removed it and hardcoded it in gtfn.py.
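For readers following this thread, the trade-off can be sketched roughly as below. This is only an illustration, not the code in `config.py` or `gtfn.py`: the prefix value `"GT4PY"` and the helpers `resolve_cache_dir` and `run_with_temporary_cache` are assumptions made up for the example. It shows an environment-variable override with a hardcoded default, and how a test could point the cache at a throwaway directory that is deleted afterwards.

```python
import os
import tempfile
from pathlib import Path

# Assumed values for illustration only; not taken from gt4py.
_PREFIX = "GT4PY"
_DEFAULT_CACHE_DIR = "gtfn_cache"


def resolve_cache_dir(env=None) -> Path:
    """Return the cache directory, preferring the environment override."""
    env = os.environ if env is None else env
    return Path(env.get(f"{_PREFIX}_GTFN_SOURCE_CACHE_DIR", _DEFAULT_CACHE_DIR))


def run_with_temporary_cache(action):
    """Run `action(cache_dir)` against a cache directory that is removed afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = resolve_cache_dir({f"{_PREFIX}_GTFN_SOURCE_CACHE_DIR": tmp})
        result = action(cache_dir)
    # The temporary directory no longer exists at this point.
    return result, cache_dir
```

With a fixture along these lines, the tests would not need a user-facing config option, which matches the outcome of the thread (the directory ended up hardcoded).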
Final comments.
@@ -148,4 +153,3 @@ def add_foast_located_node_to_fingerprint(
 ) -> None:
     add_content_to_fingerprint(obj.location, hasher)
     add_content_to_fingerprint(str(obj), hasher)
-    add_content_to_fingerprint(str(obj), hasher)
I am not sure why this was done twice. Does removing it make sense, @egparedes / @DropD?
It looks like a bug to me and I agree with your change.
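To illustrate why the second call adds nothing, here is a minimal sketch using `hashlib`. The function bodies are simplified stand-ins written for this example (the real `add_content_to_fingerprint` dispatches on type and is not reproduced here), and `fingerprint` with its `repeat_node` flag is hypothetical.

```python
import hashlib


def add_content_to_fingerprint(obj, hasher) -> None:
    """Simplified stand-in: feed a textual form of `obj` into the running hash."""
    hasher.update(str(obj).encode("utf-8"))


def fingerprint(location, node, *, repeat_node: bool = False) -> str:
    """Hash a node's location and content, optionally hashing the content twice."""
    hasher = hashlib.sha256()
    add_content_to_fingerprint(location, hasher)
    add_content_to_fingerprint(node, hasher)
    if repeat_node:
        # Mirrors the duplicated line: same input fed again, no new information.
        add_content_to_fingerprint(node, hasher)
    return hasher.hexdigest()
```

In this sketch both variants distinguish exactly the same inputs, so the repeated call only costs time. The digest itself does change when the duplicate is dropped, so entries keyed by the old digest would simply be recomputed once.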
cscs-ci run
I pushed the updated requirements and it looks ready to me. Let's just wait for @DropD's lightweight review, and then we are good to merge from my point of view.
Thanks for highlighting the unused imports; I removed all of them. Applying ruff to the tests as well would help to avoid this in the future.
#1690 included a change to make the hash of an `itir.FencilDefinition` stable across multiple runs. This PR adopts the same change to an `itir.Program`,
Adds memory- and disk-based caching to other workflow steps, thereby removing unnecessary overhead from Program calls and significantly improving the time to the first computed value.
Changes:
- `cached = True` for `func_to_past_factory`
- `CachedStep` (using `Diskcache`), which is activated when setting `otf_workflow__cached_translation=True`, similar to PR #1474 (without CachedStep)
- `ProgramDefinition`
This leads to a runtime decrease of about 25% for PMAP-G in the advect-uniform testcase (5 hours) after caches are populated.
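The general idea of combining a memory cache with a disk cache can be sketched as follows. This is not gt4py's `CachedStep` and does not use `Diskcache`: `TwoLevelCachedStep` is a hypothetical class written for this example, pickle files from the standard library stand in for the disk cache, and the `repr`-based key is an assumption that only works for inputs with a stable `repr`.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable


class TwoLevelCachedStep:
    """Hypothetical cached workflow step: memory first, then disk, then compute."""

    def __init__(self, step: Callable[[Any], Any], cache_dir: Path) -> None:
        self.step = step
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.memory = {}
        self.calls = 0  # how often the wrapped step actually ran

    def _key(self, inp: Any) -> str:
        # Content-based key, so it is stable across interpreter runs
        # (unlike the built-in hash() of strings, which is salted per process).
        return hashlib.sha256(repr(inp).encode("utf-8")).hexdigest()

    def __call__(self, inp: Any) -> Any:
        key = self._key(inp)
        if key in self.memory:  # fastest path: already computed in this process
            return self.memory[key]
        path = self.cache_dir / f"{key}.pkl"
        if path.exists():  # an earlier run stored the result on disk
            result = pickle.loads(path.read_bytes())
        else:
            self.calls += 1
            result = self.step(inp)
            path.write_bytes(pickle.dumps(result))
        self.memory[key] = result
        return result
```

A second instance pointed at the same directory returns the stored result without running the step again, which is the "after caches are populated" situation the measurement above refers to. Pickle should only be used for cache directories that are trusted.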
TODOs:
fingerprint_stage