irmin-pack: unify LRUs and add max memory config #2254

metanivek · 2023-06-01T20:38:39Z

At a high level, this PR:

- Unifies the previous 3 LRUs in irmin-pack into one
- Adds a configuration option to bound the LRU by memory used instead of entry count
- Maintains the previous cap on "large" contents objects for both the entry-based cap and the memory-based cap
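To make the two capacity modes concrete, here is a minimal sketch of a cache that can be bounded either by entry count or by estimated bytes. It is not the code from this PR: every name (`cap`, `Entries`, `Memory`, `add`, `find`) is hypothetical, keys are plain `int`s, and eviction does a linear scan to keep the example short.

```ocaml
(* Hypothetical sketch: none of these names come from irmin-pack. *)
type cap =
  | Entries of int (* bound by number of cached entries *)
  | Memory of int (* bound by estimated bytes *)

type 'a t = {
  cap : cap;
  tbl : (int, 'a * int * int) Hashtbl.t; (* key -> value, bytes, last use *)
  mutable bytes : int; (* sum of the estimated sizes of cached values *)
  mutable clock : int; (* logical time, bumped on every add and find *)
}

let create cap = { cap; tbl = Hashtbl.create 16; bytes = 0; clock = 0 }

let over_cap t =
  match t.cap with
  | Entries n -> Hashtbl.length t.tbl > n
  | Memory m -> t.bytes > m

let tick t =
  t.clock <- t.clock + 1;
  t.clock

(* Evict the least recently used entry; an O(n) scan, for brevity only. *)
let evict_one t =
  let oldest =
    Hashtbl.fold
      (fun k (_, _, used) acc ->
        match acc with
        | Some (_, u) when u <= used -> acc
        | _ -> Some (k, used))
      t.tbl None
  in
  match oldest with
  | None -> ()
  | Some (k, _) ->
      let _, b, _ = Hashtbl.find t.tbl k in
      Hashtbl.remove t.tbl k;
      t.bytes <- t.bytes - b

let add t k v ~bytes =
  (match Hashtbl.find_opt t.tbl k with
  | Some (_, b, _) -> t.bytes <- t.bytes - b
  | None -> ());
  Hashtbl.replace t.tbl k (v, bytes, tick t);
  t.bytes <- t.bytes + bytes;
  while over_cap t && Hashtbl.length t.tbl > 0 do
    evict_one t
  done

let find t k =
  match Hashtbl.find_opt t.tbl k with
  | None -> None
  | Some (v, b, _) ->
      Hashtbl.replace t.tbl k (v, b, tick t);
      Some v
```

With `Entries n` the `~bytes` argument is simply ignored by the capacity check; with `Memory m` the entry count is unbounded and only the running byte total matters.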

When deciding how to count the memory usage of objects in the LRU, two paths were evaluated: using `Obj.reachable_words` or using repr's size function. The former was too slow in replay benchmarks, so the latter is used. The size is used as-is for commits and contents, and a correction factor, based on benchmark observations, is applied to inode sizes.
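As a rough illustration of the two approaches, the sketch below contrasts a heap-walking estimate with a size taken from an encoded length. The names `heap_bytes`, `kind` and `estimated_bytes` are hypothetical, the encoded size is passed in by the caller rather than computed through repr, and the correction factor is assumed to be multiplicative; the PR description does not state its value.

```ocaml
(* Heap-walking estimate: counts every block reachable from [v],
   headers included, so it has to traverse the whole value. *)
let heap_bytes v = Obj.reachable_words (Obj.repr v) * (Sys.word_size / 8)

(* Hypothetical object kinds, mirroring the three object stores. *)
type kind = Commit | Contents | Inode

(* Estimate from an already-known encoded size. Commits and contents use
   the size as-is; inodes get a correction factor (assumed multiplicative,
   value supplied by the caller). *)
let estimated_bytes ~inode_factor kind ~encoded =
  match kind with
  | Commit | Contents -> encoded
  | Inode -> int_of_float (inode_factor *. float_of_int encoded)
```

The second function does constant work per object, which is consistent with the observation that the heap walk was the slower option in replay benchmarks.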

The definition of "large" for contents objects is 20kB. This seems like a reasonable value based on a previous analysis of object sizes, which indicated that less than 0.1% of contents objects are larger than 512B. Note: the previous weight-based limit would allow any object < ~500kB into the LRU (based on Octez's default configuration of a 5000-entry cap), but lowering the cap allows more objects into the cache.
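The admission rule described above could look like the following sketch. The names are hypothetical, "20kB" is assumed to mean 20,000 bytes, and the boundary is assumed to be inclusive; none of that is confirmed by the PR description.

```ocaml
(* Assumption: "20kB" is read as 20_000 bytes, not 20 * 1024. *)
let max_contents_bytes = 20_000

(* Contents larger than the threshold bypass the cache entirely, so a
   single large object cannot push out many small ones. *)
let admit_contents ~bytes = bytes <= max_contents_bytes
```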

Conclusions based on benchmarks (to be pushed to the benchmarks repo once I finish tidying the analysis and running a final bench):

- The entry-based cap still slightly outperforms the memory-based cap. My assumption is that this is due to the overhead of the memory usage calculation for inodes.
- Both the entry-based and memory-based caps can outperform 3.7.1 with only slightly more memory used.
- Past a certain size, increasing the LRU starts hurting the performance of the benchmark. More could be investigated here in the future. I did not use cachecache's LRU for this, but with some modifications to its code, it may provide better scaling. I ran its benchmarks against the modified irmin LRU in this PR, and it does seem to scale better.

~~The final benchmark I want to run is an 80mb, but here is some high-level analysis.~~

Snippet of benchmark stats:

 |                          |   3.7.1   |   entry-15k    |   entry-30k    |   40mb-optim   |   80mb-optim
 | --                       | --        | --             | --             | --             | --
 | -- main metrics --       |           |                |                |                | 
 | CPU time elapsed         |   104m07s |    99m13s  95% |    97m38s  94% |   100m07s  96% |    98m20s  94%
 | Wall time elapsed        |   104m28s |    99m33s  95% |    97m56s  94% |   100m25s  96% |    98m38s  94%
 | TZ-transactions per sec  |   736.349 |   772.810 105% |   785.217 107% |   765.767 104% |   779.649 106%
 | TZ-operations per sec    |  4809.664 |  5047.817 105% |  5128.860 107% |  5001.817 104% |  5092.489 106%
 | Context.add per sec      | 14121.692 | 14820.934 105% | 15058.886 107% | 14685.872 104% | 14952.096 106%
 | tail latency (1)         |   0.224 s |   0.222 s  99% |   0.244 s 109% |   0.245 s 109% |   0.240 s 107%
 |                          |           |                |                |                | 
 | -- resource usage --     |           |                |                |                | 
 | disk IO (total)          |           |                |                |                | 
 |   IOPS (op/sec)          |   180_572 |   142_980  79% |   119_928  66% |   145_343  80% |   122_530  68%
 |   throughput (bytes/sec) |  18.060 M |  16.541 M  92% |  15.280 M  85% |  16.609 M  92% |  15.378 M  85%
 |   total (bytes)          | 112.826 G |  98.463 G  87% |  89.517 G  79% |  99.772 G  88% |  90.737 G  80%
 | disk IO (read)           |           |                |                |                | 
 |   IOPS (op/sec)          |   180_475 |   142_878  79% |   119_825  66% |   145_242  80% |   122_427  68%
 |   throughput (bytes/sec) |  10.639 M |   8.753 M  82% |   7.367 M  69% |   8.891 M  84% |   7.521 M  71%
 |   total (bytes)          |  66.466 G |  52.103 G  78% |  43.157 G  65% |  53.412 G  80% |  44.376 G  67%
 | disk IO (write)          |           |                |                |                | 
 |   IOPS (op/sec)          |        97 |       102 105% |       103 107% |       101 104% |       103 106%
 |   throughput (bytes/sec) |   7.421 M |   7.788 M 105% |   7.913 M 107% |   7.717 M 104% |   7.857 M 106%
 |   total (bytes)          |  46.360 G |  46.360 G 100% |  46.360 G 100% |  46.360 G 100% |  46.360 G 100%
 |                          |           |                |                |                | 
 | max memory usage (bytes) |   0.394 G |   0.401 G 102% |   0.428 G 108% |   0.416 G 106% |   0.469 G 119%
 | mean CPU usage           |      100% |      100%      |      100%      |      100%      |      100%

Here are memory usage comparisons. You can see:

- entry-100k and 500mb regress in performance (not shown is entry-60k, but it also regresses in performance)
- entry-30k has the best performance (but 80mb is almost the same in performance and memory)
- entry-15k has almost the same memory usage as 3.7.1 but better performance

art-w · 2023-06-09T14:42:49Z

src/irmin-pack/inode.ml


+    let to_kinded t = Node t
+    let of_kinded = function Node n -> n | _ -> assert false


I think I would prefer it if this extensible type were hidden in the LRU implementation. It should be possible to create the fresh `+= Node` type constructor behind the scenes by keeping the LRU a functor that is instantiated for each type of value we want to store in it.

Interesting idea! I'll see what I can do. I do agree it would be nicer to hide this bookkeeping.
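A minimal sketch of that suggestion, assuming a single shared table and leaving eviction out: the extensible type lives inside the LRU module, and each application of the functor creates its own fresh constructor, so callers never write `to_kinded` or `of_kinded` themselves. The module and function names are hypothetical and keys are plain `int`s here.

```ocaml
(* Hypothetical sketch; module and function names are not irmin-pack's. *)
module Lru : sig
  module Make (V : sig
    type t
  end) : sig
    val add : int -> V.t -> unit
    val find : int -> V.t option
  end
end = struct
  (* The extensible type is not exported by the signature above. *)
  type kinded = ..

  (* One table shared by every instantiation of [Make]. *)
  let table : (int, kinded) Hashtbl.t = Hashtbl.create 16

  module Make (V : sig
    type t
  end) =
  struct
    (* Each application of [Make] creates a fresh constructor. *)
    type kinded += K of V.t

    let add key v = Hashtbl.replace table key (K v)

    let find key =
      match Hashtbl.find_opt table key with
      | Some (K v) -> Some v
      | Some _ | None -> None
  end
end

module Nodes = Lru.Make (struct
  type t = string list
end)

module Commits = Lru.Make (struct
  type t = string
end)
```

A lookup through the wrong instantiation returns `None` instead of hitting an `assert false`, because the constructor of another instantiation never matches `K`.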

art-w · 2023-06-09T14:47:22Z

src/irmin-pack/unix/lru.ml

+  let hash = Hashtbl.hash
+end)
+
+type key = int63


I don't think it's a problem atm, but just to be sure: this works because we only use the LRU once per kind of object, and the keys of the different kinds of objects are naturally distinct since they are file offsets?

Hmm, this is a good point to bring up!

A problem could arise if a particular offset is re-used for a different object type than what was originally cached, but that can't happen since it is correlated to the ever-growing file offsets of the pack file (as you say). The unified LRU, as it currently is written, is definitely tied to these implementation details. I was trying to avoid a functorized LRU for simplicity, but (thinking out loud) if we move to that, perhaps I can encode the type in the key as well.

Yeah hm, but having an extensible key type that supports custom hashing/equality is a whole new world of pain... which is how I ended up thinking about "not sharing the hashtables" to sidestep the issue :P
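For comparison, here is a sketch of what "encoding the type in the key" could look like with a closed variant instead of an extensible one, which avoids the custom hashing problem at the cost of fixing the set of kinds up front. The names are hypothetical, and offsets are plain `int`s here rather than the `int63` used in the PR.

```ocaml
(* Hypothetical closed set of object kinds. *)
type kind = Commit | Node | Contents

module Key = struct
  (* The PR keys the table by int63 offsets; int is used here for brevity. *)
  type t = kind * int

  let equal ((k1, o1) : t) ((k2, o2) : t) = k1 = k2 && Int.equal o1 o2

  (* Structural hashing is enough for a constant constructor and an int. *)
  let hash = Hashtbl.hash
end

module Tbl = Hashtbl.Make (Key)
```

With such a key, the same offset under two different kinds maps to two distinct entries, so the cache no longer relies on offsets being unique across object kinds.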

art-w · 2023-06-09T15:13:23Z

To expand a bit on my comments, did you consider keeping the hashtables separate (since they cause type issues), but sharing the Q.t list across the different LRUs? (so that they have the ability to make room from other hashtables when needed)

irmin/src/irmin/lru.ml

Lines 78 to 84 in 3222d26

    
    type 'a t = {
      ht : (key * 'a) Q.node HT.t;
      q : (key * 'a) Q.t;
      mutable cap : int;
      mutable w : int;
      weight : 'a -> int;
    }

The type q: (key * 'a) Q.t is an issue, but I believe it could be replaced by q: (unit -> unit) Q.t instead (a closure to free an element from whichever hashtable it was in and subtracting the removed weight from the total used space...)

(Note that this is just a thought that popped into my head! I don't think there's a strong incentive to do things differently as the current solution works nicely :) )
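The shared-queue idea can be sketched as follows, under several simplifying assumptions: the names are hypothetical, keys are plain `int`s, the standard `Queue` stands in for `Q.t`, and eviction is FIFO because entries are not promoted on access as a real LRU would do.

```ocaml
(* Hypothetical sketch of the shared-queue idea; FIFO order for brevity. *)
type shared = {
  q : (unit -> unit) Queue.t; (* each closure frees one entry *)
  cap : int; (* total weight allowed across all tables *)
  mutable w : int; (* total weight currently in use *)
}

let shared cap = { q = Queue.create (); cap; w = 0 }

module Make (V : sig
  type t
end) =
struct
  (* Each instantiation keeps its own, monomorphic hashtable. *)
  type t = { s : shared; ht : (int, V.t * int) Hashtbl.t }

  let create s = { s; ht = Hashtbl.create 16 }

  let remove t key =
    match Hashtbl.find_opt t.ht key with
    | None -> () (* already gone: the closure was stale *)
    | Some (_, weight) ->
        Hashtbl.remove t.ht key;
        t.s.w <- t.s.w - weight

  let add t key v ~weight =
    remove t key;
    Hashtbl.replace t.ht key (v, weight);
    t.s.w <- t.s.w + weight;
    (* The closure knows which table the entry lives in. A closure left
       over from an earlier add of the same key may evict the newer entry
       early; that costs a cache miss but is still safe. *)
    Queue.push (fun () -> remove t key) t.s.q;
    while t.s.w > t.s.cap && not (Queue.is_empty t.s.q) do
      (Queue.pop t.s.q) ()
    done

  let find t key = Option.map fst (Hashtbl.find_opt t.ht key)
end
```

Because the queue only holds `unit -> unit` closures, adding to one table can make room by evicting from another without the queue ever mentioning a value type.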

metanivek · 2023-06-09T15:37:45Z

Ah, thanks for sharing your extra thoughts! I didn't want to change the core LRU implementation too much, since I wanted to keep the option of switching it out in the future (maybe for a modified cachecache or some lock-free version, for instance). If it looks too complicated to bring the extensible type into the LRU, I think we should push it out to a future time when we evaluate the actual LRU implementation.

Previously, an `irmin-pack` store had one LRU per `Pack_store`, resulting in three LRUs corresponding to the three object stores: Commit, Node, and Content. This commit changes an `irmin-pack` store to only have one LRU that is shared by each object store. The motivation of this change is to make configuring the LRU size more intuitive for users of `irmin-pack`.

…min-pack, irmin-pack-tools, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.8.0)

CHANGES:

### Added

- **irmin**
  - Change behavior of `Irmin.Conf.key` to disallow duplicate key names by default. Add `allow_duplicate` optional argument to override. (mirage/irmin#2252, @metanivek)

- **irmin-pack**
  - Add maximum memory as an alternative configuration option, `lru_max_memory`, for setting LRU capacity. (@metanivek, mirage/irmin#2254)

### Changed

- **irmin**
  - Lower bound for `mtime` is now `2.0.0` (mirage/irmin#2166, @patricoferris)

- **irmin-mirage-git**
  - Lower bound for `mirage-kv` is now `6.0.0` (mirage/irmin#2256, @metanivek)

### Fixed

- **irmin-cli**
  - Changed `--store irf` to `--store fs` to align the CLI with what is published on the Irmin website (mirage/irmin#2243, @wyn)

Irmin 3.8 unifies 3 LRUs into 1, so we allow 3 more items into the unified LRU. See mirage/irmin#2254

metanivek requested review from art-w and adatario June 1, 2023 20:38

art-w reviewed Jun 9, 2023

metanivek force-pushed the new_lru branch from b1ce5a9 to f3c588c Compare June 13, 2023 16:07

metanivek added 6 commits July 4, 2023 14:22

irmin: remove custom weight function from LRU

d842972

irmin-pack: add lru_max_memory to config options

6780cea

irmin-pack: add memory-bounded lru

fa7b162

irmin-pack: cap size of allowed contents in LRU

bf13daf

Add CHANGES entry

eae015c

metanivek force-pushed the new_lru branch from f3c588c to eae015c Compare July 4, 2023 18:22

metanivek merged commit 78f6cc1 into mirage:main Jul 4, 2023

metanivek deleted the new_lru branch July 4, 2023 19:19

metanivek mentioned this pull request Jul 4, 2023

[new release] irmin project (3.8.0) ocaml/opam-repository#24056

Merged

metanivek mentioned this pull request Aug 1, 2023

irmin-pack: improve UX of the LRU parameters #2060

Closed

tezoslibrarian pushed a commit to tezos/tezos-mirror that referenced this pull request Dec 15, 2023

Deps: adapt to irmin 3.8

bb853f6
