[SYCL] Implement eviction in persistent cache #16289

Merged
merged 17 commits into from
Dec 20, 2024

@uditagarwal97 uditagarwal97 commented Dec 5, 2024

This PR implements eviction for the persistent cache.
Eviction is disabled by default and can be controlled by the user via the SYCL_CACHE_MAX_SIZE environment variable.

Here's how eviction works:

  1. Cache size file: A file named cache_size.txt is stored at the root of the persistent cache and tracks the cache's total size. Every time a process adds a new entry to the cache, it also updates cache_size.txt. For backward compatibility, if the SYCL RT does not find cache_size.txt in the cache root, it creates one. All accesses to cache_size.txt go through LockCacheItem to prevent data races.
  2. When adding a new entry to the cache, the SYCL RT checks cache_size.txt; if the cache size exceeds the threshold, eviction is triggered.
  3. When a cache entry is created or accessed, the SYCL RT creates a file in that cache entry to store the access time. This file is read later during eviction.
  4. During eviction, the SYCL RT determines the last access time of each cache entry and evicts entries according to an LRU policy.
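
Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `readTrackedSize` and `recordEntryAndCheck` are hypothetical, the on-disk format (a single decimal number) is an assumption, a threshold of 0 is assumed to mean "eviction disabled", and the locking that the runtime does via LockCacheItem is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Hypothetical helper: returns the size recorded in cache_size.txt,
// or 0 if the file does not exist yet (e.g. a cache from an older runtime).
std::uint64_t readTrackedSize(const fs::path &CacheRoot) {
  std::ifstream In(CacheRoot / "cache_size.txt");
  std::uint64_t Size = 0;
  In >> Size; // leaves Size at 0 when the file could not be opened
  return Size;
}

// Hypothetical helper: adds EntrySize to the tracked total, rewrites
// cache_size.txt, and reports whether eviction should be triggered.
// Assumption: MaxSize == 0 means eviction is disabled.
// NOTE: a real implementation must hold a lock around the read-modify-write.
bool recordEntryAndCheck(const fs::path &CacheRoot, std::uint64_t EntrySize,
                         std::uint64_t MaxSize) {
  assert(fs::is_directory(CacheRoot) && "cache root must exist");
  const std::uint64_t Total = readTrackedSize(CacheRoot) + EntrySize;
  std::ofstream Out(CacheRoot / "cache_size.txt", std::ios::trunc);
  Out << Total;
  return MaxSize != 0 && Total > MaxSize;
}
```

Keeping the running total in one small file means the common path (adding an entry) never has to walk the whole cache directory; only the first use of an older cache needs a full scan.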

@uditagarwal97 uditagarwal97 marked this pull request as ready for review December 10, 2024 15:31
@uditagarwal97 uditagarwal97 requested a review from a team as a code owner December 10, 2024 15:31
@uditagarwal97

Marking this PR draft to address feedback received offline.

@uditagarwal97 uditagarwal97 marked this pull request as draft December 17, 2024 16:23
@uditagarwal97 uditagarwal97 marked this pull request as ready for review December 19, 2024 20:36

uditagarwal97 commented Dec 19, 2024

@cperkinsintel Since your last review I made the following changes:

  1. Stored last access time in a file
  2. Did some cleanup
  3. Added logic to evict half of the cache when eviction is triggered.
  4. Updated KernelProgramCache doc to document cache eviction
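
For readers following along, changes 1 and 3 can be pictured with the sketch below. It is an illustration under stated assumptions, not the PR's code: the file name `last_access.txt`, the integer timestamp format, and the names `touchEntry`, `scanEntries`, and `evictHalf` are all hypothetical, and "evict half of the cache" is interpreted here as removing least recently used entries until the total size is at most half of what it was.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

struct CacheEntry {
  fs::path Dir;
  std::uint64_t LastAccess = 0; // stamp read from the per-entry file
  std::uint64_t Size = 0;       // bytes of all regular files in the entry
};

// Records an access by (re)writing the stamp file (hypothetical file name).
void touchEntry(const fs::path &EntryDir, std::uint64_t Stamp) {
  std::ofstream Out(EntryDir / "last_access.txt", std::ios::trunc);
  Out << Stamp;
}

// Collects every entry directory with its last-access stamp and size.
std::vector<CacheEntry> scanEntries(const fs::path &CacheRoot) {
  std::vector<CacheEntry> Entries;
  for (const auto &Item : fs::directory_iterator(CacheRoot)) {
    if (!Item.is_directory())
      continue;
    CacheEntry E;
    E.Dir = Item.path();
    std::ifstream In(E.Dir / "last_access.txt");
    In >> E.LastAccess; // stays 0 (oldest) if the stamp file is missing
    for (const auto &F : fs::recursive_directory_iterator(E.Dir))
      if (F.is_regular_file())
        E.Size += F.file_size();
    Entries.push_back(E);
  }
  return Entries;
}

// Removes least recently used entries until at most half of the original
// total remains. Returns the remaining size in bytes.
std::uint64_t evictHalf(const fs::path &CacheRoot) {
  std::vector<CacheEntry> Entries = scanEntries(CacheRoot);
  std::uint64_t Total = 0;
  for (const auto &E : Entries)
    Total += E.Size;
  const std::uint64_t Target = Total / 2;
  std::sort(Entries.begin(), Entries.end(),
            [](const CacheEntry &A, const CacheEntry &B) {
              return A.LastAccess < B.LastAccess; // oldest first
            });
  for (const auto &E : Entries) {
    if (Total <= Target)
      break;
    fs::remove_all(E.Dir);
    Total -= E.Size;
  }
  assert(Total <= Target);
  return Total;
}
```

Evicting down to a fraction of the cache, rather than just below the threshold, avoids re-triggering eviction on nearly every subsequent insertion.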

@uditagarwal97

@intel/llvm-gatekeepers the PR is ready to be merged.

@sarnex sarnex merged commit 52015d8 into intel:sycl Dec 20, 2024
15 checks passed
againull pushed a commit that referenced this pull request Jan 1, 2025
#16289 implemented eviction for
persistent cache. This PR extends it to `kernel_compiler` cache as well.
sarnex pushed a commit that referenced this pull request Jan 6, 2025
Regression after: #16289
fixes #16515

**Problem**
An exception is thrown when one process calculates the cache size
while another process simultaneously inserts or removes an item from the
cache. This can cause `std::filesystem::recursive_directory_iterator` to
throw `std::filesystem::filesystem_error`.

**Solution**
We catch the exception and skip to the next item. This means
that the calculated cache size might not account for items added to or
removed from the cache in the meantime. The cache size is calculated only
once (for all processes/threads using the same cache), the first time the
persistent cache is used by any process. So this race is rare, and
ignoring the exception is sufficient.