-
Notifications
You must be signed in to change notification settings - Fork 154
Cache Analysis History
During the development of better caching options, collection and graphing were performed around cache algorithm suggestions in terms of hit rates and performances (on a 2021 M1 Pro macbook pro, on CPython 3.11) in order to get an idea of relative performances and advantages.
This page collects those performed "by hand", although some cache simulations were also done via libCacheSim, mostly by primary author of libCacheSim and S3-Fifo Juncheng Yang upon request for assistance1.
min2 | fifo-lp1 | fifo-lp2 | clock1 | clock2 | lru | qdlru | lfu3 | qdlfu | qdlpfifo | s3fifo | sieve | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
10 | 10% | 1% | 1% | 1% | 1% | 1% | 3% | 3% | 3% | 3% | 3% | 3% |
20 | 15% | 2% | 2% | 2% | 2% | 2% | 5% | 6% | 6% | 6% | 6% | 5% |
50 | 23% | 4% | 4% | 4% | 4% | 4% | 11% | 11% | 12% | 12% | 12% | 12% |
100 | 30% | 8% | 8% | 8% | 8% | 7% | 18% | 18% | 18% | 19% | 19% | 19% |
200 | 39% | 14% | 15% | 14% | 15% | 13% | 26% | 26% | 27% | 28% | 28% | 27% |
500 | 51% | 28% | 29% | 28% | 29% | 26% | 38% | 39% | 39% | 40% | 40% | 39% |
1000 | 60% | 39% | 41% | 39% | 41% | 37% | 48% | 48% | 49% | 49% | 50% | 48% |
2000 | 67% | 51% | 52% | 51% | 52% | 49% | 56% | 57% | 57% | 57% | 58% | 57% |
5000 | 73% | 64% | 64% | 64% | 64% | 63% | 64% | 66% | 64% | 64% | 66% | 66% |
basic | basic-clearing | basic-random | basic-fifo | basic-fifolp1 | basic-fifolp2 | basic-clock0 | basic-clock1 | basic-clock2 | basic-lru | basic-qdlru | basic-lfu | basic-qdlfu | basic-qdlp | basic-s3fifo | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10 | 420 | 419 | 418 | 420 | 417 | 415 | 414 | 414 | 415 | 424 | 414 | 413 | 410 | 410 | 412 |
20 | 420 | 423 | 419 | 417 | 415 | 412 | 411 | 409 | 411 | 412 | 399 | 398 | 392 | 390 | 390 |
50 | 420 | 407 | 407 | 417 | 407 | 404 | 402 | 402 | 401 | 404 | 385 | 377 | 365 | 365 | 361 |
100 | 420 | 398 | 392 | 388 | 385 | 384 | 388 | 384 | 383 | 386 | 341 | 339 | 341 | 336 | 338 |
200 | 420 | 391 | 371 | 371 | 363 | 358 | 370 | 362 | 362 | 364 | 311 | 309 | 304 | 305 | 304 |
500 | 420 | 354 | 330 | 326 | 300 | 293 | 326 | 302 | 296 | 310 | 257 | 253 | 253 | 252 | 250 |
1000 | 420 | 309 | 284 | 283 | 251 | 246 | 283 | 253 | 247 | 261 | 219 | 216 | 216 | 214 | 211 |
2000 | 420 | 262 | 236 | 238 | 205 | 200 | 237 | 205 | 200 | 213 | 185 | 184 | 185 | 185 | 181 |
5000 | 420 | 203 | 178 | 180 | 153 | 150 | 180 | 153 | 150 | 158 | 156 | 146 | 162 | 157 | 147 |
---
config:
xyChart:
width: 1200
height: 500
---
xychart-beta
x-axis "cache size" [10, 20, 50, 100, 200, 500, 1000, 2000, 5000]
y-axis "average (ms)" 0 --> 500
line "test" [420, 420, 420, 420, 420, 420, 420, 420, 420]
line [419, 423, 407, 398, 391, 354, 309, 262, 203]
line [418, 419, 407, 392, 371, 330, 284, 236, 178]
line [420, 417, 417, 388, 371, 326, 283, 238, 180]
line [417, 415, 407, 385, 363, 300, 251, 205, 153]
line [415, 412, 404, 384, 358, 293, 246, 200, 150]
line [414, 411, 402, 388, 370, 326, 283, 237, 180]
line [414, 409, 402, 384, 362, 302, 253, 205, 153]
line [415, 411, 401, 383, 362, 296, 247, 200, 150]
line [424, 412, 404, 386, 364, 310, 261, 213, 158]
line [414, 399, 385, 341, 311, 257, 219, 185, 156]
line [413, 398, 377, 339, 309, 253, 216, 184, 146]
line [410, 392, 365, 341, 304, 253, 216, 185, 162]
line [410, 390, 365, 336, 305, 252, 214, 185, 157]
line [412, 390, 361, 338, 304, 250, 211, 181, 147]
This only checked the effect of a small selection of cache replacement algorithms on an re2 base resolver, in order to see how (possibly unnecessary) caches would impact that.
re2 | re2-lru | re2-lfu | re2-s3fifo | |
---|---|---|---|---|
10 | 32 | 32 | 33 | 34 |
20 | 32 | 33 | 34 | 32 |
50 | 32 | 32 | 31 | 30 |
100 | 32 | 31 | 29 | 28 |
200 | 32 | 29 | 26 | 25 |
500 | 32 | 26 | 22 | 22 |
1000 | 32 | 22 | 19 | 19 |
2000 | 32 | 19 | 16 | 16 |
5000 | 32 | 14 | 14 | 14 |
Using re2's FilteredRE2
feature allows avoiding the evaluation of an input against every regex in the set, thanks to a prefilter process. However FilteredRE2
only hands out which regexes in the set match the input, data extraction is a secondary step.
This data extraction can be done by compiling the regex via the standard library and applying that, or by accessing the regexes FilteredRE2
has already compiled internally and applying that. The table below checks the two approaches:
re2+re | re2-lru+re | re2-s3fifo+re | re2-sieve+re | re2+re2 | re2-lru+re2 | re2-s3fifo+re2 | re2-sieve+re2 | |
---|---|---|---|---|---|---|---|---|
10 | 33 | 33 | 32 | 33 | 59 | 62 | 62 | 60 |
20 | 33 | 32 | 31 | 31 | 59 | 62 | 61 | 60 |
50 | 33 | 32 | 30 | 29 | 59 | 60 | 57 | 57 |
100 | 33 | 31 | 27 | 27 | 59 | 60 | 52 | 53 |
200 | 33 | 29 | 25 | 25 | 59 | 56 | 49 | 48 |
500 | 33 | 25 | 22 | 21 | 59 | 49 | 41 | 41 |
1000 | 33 | 22 | 19 | 19 | 59 | 42 | 37 | 36 |
2000 | 33 | 18 | 16 | 16 | 59 | 35 | 29 | 30 |
5000 | 33 | 15 | 14 | 13 | 59 | 27 | 25 | 25 |
While the timings are low compared to the basic regexes, and despite the overhead of re-compiling regexes via the standard library, the performance overhead of using re2 for extraction is a bit much (although compiled regexes are then cached, re-compiling on every extraction has not been benched).
1: Due to me having screwed up the implementation of qd-lp-fifo (the informal algorithm preceding S3) I was hitting very poor hit rates at higher cache sizes, the error was that all checks in the ghost cache essentially returned a miss, so the ghost cache's recovery feature never triggered.
2: Bélády's MIN (or OPTimal) is not a true cache eviction algorithm, it is an oracle which requires knowing the entire key sequence up-front and always makes the best possible decision (including ignoring keys entirely), I understand it has been proved optimal. It is provided as an informational and very theoretical upper bound.
3: A direct implementation of Matani and al's O(1) LFU, with no aging mechanism and thus mostly unsuitable to an online cache even if its potential issues did not show up on the available sample file.