function to evict one item #459
Hi. Sorry for the very late reply. I was sick and could not participate in OSS activities for almost half a year. I am still not fully recovered, but will be able to work on this issue.
Thanks for the suggestion. Actually, implementing such a method is harder than it may appear; the reason it is difficult comes from some of the internals of the cache.
Perhaps it would help to understand the use-case: why is using both a size and a weight threshold at once helpful? Can you elaborate on the scenario? I can only recall two times it was requested for another language ecosystem's caching library.

Since it seems rare and easy to work around, there wasn't enough demand or use-cases to incentivize me to allow both. I'd recommend checking if there are features this cache might expose that are more generally useful (e.g. a snapshot of the eviction order) that would allow for a workaround. Then you have more options to consider before deciding where to invest your development time.
Hi. Thanks for the suggestion.

I think it is a good idea to provide a read-only eviction order snapshot, as it will be more generally useful. Taking a snapshot has some performance overhead, so it should be explicitly enabled. Also, I would make sure that the snapshot is only taken while pending maintenance tasks are being performed, to avoid lock contention.

```rust
// Create an unbounded cache.
let cache = moka::sync::Cache::builder().build();

// Enable the eviction order snapshot with a maximum of 64 entries.
cache.policy().enable_eviction_order_snapshot(64);

// Make sure the first snapshot is taken.
cache.run_pending_tasks();
```

Users can get the snapshot by calling a method on the policy, and each snapshot entry exposes the key and some metadata. Users can then remove an entry from the cache by its key. Implementing this feature will take much more time than limiting both the total weight and the number of items.
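For illustration, a rough sketch of how such a snapshot might be read and acted on. The `eviction_order_snapshot`, `cold_entries`, `key`, and entry-info names are assumptions about the proposed API, not something moka provides today; only `invalidate` is an existing method:

```rust
// Read the (hypothetical) snapshot and evict the coldest entry by its key.
let snapshot = cache.policy().eviction_order_snapshot();
for entry_info in snapshot.cold_entries() {
    // `invalidate` is an existing moka method; the snapshot types above are not.
    cache.invalidate(entry_info.key());
    break;
}
```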
I figured, perhaps naively, that snapshots would be relatively rare and short, and that the read/write buffering would mask the lock contention challenges. Therefore, I made it a blocking operation that locks on demand. Since there are many policy-specific operations that users might want, but which are hard to generalize or would make the core API confusing, I place all of these under Cache.policy(). The Policy interface contains these policy-specific operations.
Thank you for the feedback. It was very helpful.

I am not sure if it is relatively rare. Wanting a method to evict an entry implies calling it almost every time after inserting an entry:

```rust
cache.insert(key, value);
if cache.entry_count() > MAX_ENTRIES {
    cache.evict();
}

// By the way, this code will not work as you may expect, because the `entry_count`
// method may return a stale value. You should call the `run_pending_tasks` method
// to get an updated value, but calling it is expensive.
```

But we can warn in the documentation that getting a snapshot is a blocking operation and it should be used with caution.
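For completeness, a sketch of the workaround described above; `run_pending_tasks` and `entry_count` are existing moka methods, while `evict` is only the method proposed in this issue:

```rust
cache.insert(key, value);

// Process pending maintenance work (apply the read/write buffers) so that
// `entry_count` reflects the insert above. This call is the expensive part.
cache.run_pending_tasks();

if cache.entry_count() > MAX_ENTRIES {
    // Hypothetical method from this issue; not part of the current API.
    cache.evict();
}
```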
I have a similar idea.
What do you mean by "at the time of capturing"? Do you mean that all items in the read/write buffer are applied to the LRU queue before the snapshot is taken?
It will be helpful. I was also thinking of adding some counters to the cache statistics (metrics):

```rust
pub struct DetailedStatsCounter {
    // ...
    read_drop_count: AtomicCell<u64>,
    write_wait_count: AtomicCell<u64>,
    total_write_wait_time_nanos: AtomicCell<u64>,
    // ...
}
```
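For illustration, the hypothetical counters above might be updated on the write path roughly like this (`AtomicCell` is assumed to be `crossbeam_utils::atomic::AtomicCell`, which provides `fetch_add` for integer types):

```rust
use std::time::Instant;

// Hypothetical helper: record one write that had to wait on the write buffer.
fn record_write_wait(stats: &DetailedStatsCounter, wait_started: Instant) {
    stats.write_wait_count.fetch_add(1);
    stats
        .total_write_wait_time_nanos
        .fetch_add(wait_started.elapsed().as_nanos() as u64);
}
```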
As for logging, I am not eager to add it at this time because there are two major logging facilities in Rust: log and tracing. And we, as library authors, would need to support both once we add some logs.
A similar request: #471
I would argue that the write rate is relatively small in comparison to other operations. The cost preceding the write, a data load, is likely measured in milliseconds (aka the miss penalty). In my concurrent benchmarks, a simple lock-based LRU achieves 10M ops/s, which gives us roughly the maximum throughput for a mutex. If the cache is performing hundreds or fewer writes per second, the penalty of blocking on the lock to obtain the next victim wouldn't be problematic.

I actually meant that the user-visible entry given to the function is a copy, not the live entry, and that immediately after the snapshot the cache may be in a different state due to concurrency. I do perform maintenance prior to calling the user's function when snapshotting.

That's fair! I log very rarely and use the standard library's minimalistic API, which merely routes to whatever the application has set up.
Hi. Thanks for clarifying. Ah, you meant the write rate is relatively small compared to other operations, including reads. I see; I misunderstood the context. The explanation about the snapshot was also helpful!
That was the reason, though I'm not sure how useful it would be to limit most caches (including mine) by both count and size.
Action Items
Relevant Pull Requests

Example

```rust
use std::sync::{LazyLock, Mutex};

use moka::sync::Cache;

/// Try to truncate the cache to the target size (number of entries).
///
/// This function will try to remove some cold entries from the cache to reduce
/// the number of entries to the target size. It will return early if the number
/// of entries to remove is less than a threshold.
///
/// Only one thread will remove entries at a time. Other threads will return
/// early if they find that the lock is already held by another thread.
pub fn try_truncate<K, V>(cache: &Cache<K, V>, target: u64) {
    // Adjust this value to match your needs.
    const THRESHOLD: u64 = 5;
    const MAX_COLD_ENTRIES: usize = THRESHOLD as usize * 2;

    // Calculate the number of entries to remove. NOTE: `entry_count` may return
    // a stale value. The value will be updated when the maintenance tasks are
    // processed.
    let mut n = cache.entry_count().saturating_sub(target);
    if n < THRESHOLD {
        return;
    }

    static TRUNCATE_LOCK: LazyLock<Mutex<()>> = LazyLock::new(|| Mutex::new(()));

    // Race to remove entries.
    let Ok(lock) = TRUNCATE_LOCK.try_lock() else {
        // Another thread is already removing entries.
        return;
    };

    // Get some cold entries. NOTE: Calling `capture` will process maintenance
    // tasks, and then create and return a snapshot.
    let snapshot = cache
        .policy()
        .snapshot()
        .with_cold_entries(MAX_COLD_ENTRIES)
        .capture();

    // Calculate the number of entries to remove again using the latest value.
    // (It should have been updated when calling `capture` above.)
    n = cache.entry_count().saturating_sub(target);

    for entry_info in snapshot.cold_entries() {
        if n == 0 {
            return;
        }

        let r = cache
            .entry_by_ref(entry_info.key())
            .with_metadata()
            .and_remove_if(|_k, _v, metadata| {
                // Returns `true` if the entry is still cold.
                metadata.last_accessed() == entry_info.metadata().last_accessed()
            });

        match r {
            // The entry was removed by us.
            Removed(_)
            // The entry may have been removed by another thread, or it may have
            // expired and already been removed from the cache.
            | NotExist
            // The entry had already expired but was still in the cache. Then it
            // was removed by us.
            | ExpiredAndRemoved(_) =>
                // In all of the above cases, consider that we evicted an entry.
                n -= 1,
            // The predicate returned `false`. The entry may have been accessed
            // or updated by other thread(s), so we should not remove it.
            NotRemoved => (),
        }
    }

    // This will make sure that the lock is held until here.
    drop(lock);
}
```
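A possible call site for the sketch above, assuming a process-wide cache; the `CACHE` static, the `put` helper, and the target of 10,000 entries are placeholders, not part of the proposal:

```rust
static CACHE: LazyLock<Cache<String, String>> =
    LazyLock::new(|| Cache::builder().build());

fn put(key: String, value: String) {
    CACHE.insert(key, value);
    // Opportunistically trim the cache back toward 10,000 entries. This is
    // cheap when there is little to do, because `try_truncate` returns early.
    try_truncate(&CACHE, 10_000);
}
```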
Currently moka allows limiting the entries it stores using item weights. If a weigher is provided, the weights it returns are used; otherwise all weights are assumed to be one, which means the limit is in number of items and not based on their size.
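For reference, this is roughly what a weight-based bound looks like with the current builder API (a sketch; the weigher here simply uses the value's length in bytes):

```rust
use moka::sync::Cache;

// Each entry's weight is the value's length; the sum of all weights may not
// exceed 32 MiB. Without a weigher, every entry weighs 1, so `max_capacity`
// acts as an entry-count limit instead.
let cache: Cache<String, String> = Cache::builder()
    .weigher(|_key, value: &String| value.len().try_into().unwrap_or(u32::MAX))
    .max_capacity(32 * 1024 * 1024)
    .build();
```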
I'd like to be able to limit both total weight and number of items. AFAICT, there isn't currently a way to do that. One thing I thought of that would allow being more generic about unusual constraints like that would be to provide an

```rust
fn evict(&self) -> Option<(K, V)>
```

which evicts one item according to some application rules, while still leveraging moka's eviction policy.