Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

irmin-pack: unify LRUs and add max memory config #2254

Merged
merged 6 commits into from
Jul 4, 2023

Conversation

metanivek
Copy link
Member

@metanivek metanivek commented Jun 1, 2023

At a high-level, this PR:

  1. Unifies the previous 3 LRUs in irmin-pack into one
  2. Adds a configuration option to bound the LRU by memory used instead of entry count
  3. Maintains previous cap of "large" contents objects for both entry-based cap and memory-based cap

When deciding how to count the memory usage of objects in the LRU, two paths were evaluated: using Obj.reachable_words or using repr's size function. The former was too slow in replay benchmarks, so that latter is used. The size is used as-is for commits and contents, and a correction factor, based on benchmark observations, is applied to inode sizes.

The definition of "large" for contents objects is 20kB. This seems like a reasonable value based on previous analysis of object size which indicated that less than 0.1% of contents objects are larger than 512B. Note: the previous weight based limit would allow any object < ~500kB into the LRU (based on Octez's default configuration of 5000 entry cap), but lowering the cap allows more objects into the cache.

Conclusions based on benchmarks (to be pushed to the benchmarks repo once I finish tidying the analysis and running a final bench):

  1. Entry-based cap still slightly out-performs the memory-cap. My assumption is that it is the overhead of memory usage calculation for inodes.
  2. Both entry-based and memory-based can out perform 3.7.1 with only slightly more memory used.
  3. Past a certain size increase, the LRU starts hurting performance of the benchmark. More could be investigated here in the future. I did not use cachecache's LRU for this, but with some modifications to its code, it may provide better scaling. I ran its benchmarks against the modified irmin LRU in this PR, and it does seem to scale better.

The final benchmark I want to run is an 80mb, but here is some high-level analysis.

Snippet of benchmark stats:

 |                          |   3.7.1   |   entry-15k    |   entry-30k    |   40mb-optim   |   80mb-optim
 | --                       | --        | --             | --             | --             | --
 | -- main metrics --       |           |                |                |                | 
 | CPU time elapsed         |   104m07s |    99m13s  95% |    97m38s  94% |   100m07s  96% |    98m20s  94%
 | Wall time elapsed        |   104m28s |    99m33s  95% |    97m56s  94% |   100m25s  96% |    98m38s  94%
 | TZ-transactions per sec  |   736.349 |   772.810 105% |   785.217 107% |   765.767 104% |   779.649 106%
 | TZ-operations per sec    |  4809.664 |  5047.817 105% |  5128.860 107% |  5001.817 104% |  5092.489 106%
 | Context.add per sec      | 14121.692 | 14820.934 105% | 15058.886 107% | 14685.872 104% | 14952.096 106%
 | tail latency (1)         |   0.224 s |   0.222 s  99% |   0.244 s 109% |   0.245 s 109% |   0.240 s 107%
 |                          |           |                |                |                | 
 | -- resource usage --     |           |                |                |                | 
 | disk IO (total)          |           |                |                |                | 
 |   IOPS (op/sec)          |   180_572 |   142_980  79% |   119_928  66% |   145_343  80% |   122_530  68%
 |   throughput (bytes/sec) |  18.060 M |  16.541 M  92% |  15.280 M  85% |  16.609 M  92% |  15.378 M  85%
 |   total (bytes)          | 112.826 G |  98.463 G  87% |  89.517 G  79% |  99.772 G  88% |  90.737 G  80%
 | disk IO (read)           |           |                |                |                | 
 |   IOPS (op/sec)          |   180_475 |   142_878  79% |   119_825  66% |   145_242  80% |   122_427  68%
 |   throughput (bytes/sec) |  10.639 M |   8.753 M  82% |   7.367 M  69% |   8.891 M  84% |   7.521 M  71%
 |   total (bytes)          |  66.466 G |  52.103 G  78% |  43.157 G  65% |  53.412 G  80% |  44.376 G  67%
 | disk IO (write)          |           |                |                |                | 
 |   IOPS (op/sec)          |        97 |       102 105% |       103 107% |       101 104% |       103 106%
 |   throughput (bytes/sec) |   7.421 M |   7.788 M 105% |   7.913 M 107% |   7.717 M 104% |   7.857 M 106%
 |   total (bytes)          |  46.360 G |  46.360 G 100% |  46.360 G 100% |  46.360 G 100% |  46.360 G 100%
 |                          |           |                |                |                | 
 | max memory usage (bytes) |   0.394 G |   0.401 G 102% |   0.428 G 108% |   0.416 G 106% |   0.469 G 119%
 | mean CPU usage           |      100% |      100%      |      100%      |      100%      |      100%     

Here is memory usage comparisons. You can see:

  1. entry-100k and 500mb regress in performance (not shown is entry-60k but it also regresses in perf)
  2. entry-30k has the best performance (but 80mb is almost the same in perf and memory)
  3. entry-15k has almost equivalent memory usage as 3.7.1 but better performance

mem-used

@metanivek metanivek requested review from art-w and adatario June 1, 2023 20:38

let to_kinded t = Node t
let of_kinded = function Node n -> n | _ -> assert false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer if this extensible type was hidden in the LRU implementation.. I think it should be possible to create behind the scene the fresh += Node new type constructor by keeping the LRU a functor to instantiate for each type of value that we want to store in it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea! I'll see what I can do. I do agree it would be nicer to hide this bookkeeping.

let hash = Hashtbl.hash
end)

type key = int63
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a problem atm, but just to be sure: this works because we only use the LRU once per kind of objects, and the keys of the different kind of objects are naturally distinct since they are file offsets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is a good point to bring up!

A problem could arise if a particular offset is re-used for a different object type than what was originally cached, but that can't happen since it is correlated to the ever-growing file offsets of the pack file (as you say). The unified LRU, as it currently is written, is definitely tied to these implementation details. I was trying to avoid a functorized LRU for simplicity, but (thinking out loud) if we move to that, perhaps I can encode the type in the key as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah hm, but having an extensible key type that support custom hashing/equality is a whole new world of pain... which is how I ended up thinking about "not sharing the hashtables" to sidestep the issue :P

@art-w
Copy link
Contributor

art-w commented Jun 9, 2023

To expand a bit on my comments, did you consider keeping the hashtables separate (since they cause type issues), but sharing the Q.t list across the different LRUs? (so that they have the ability to make room from other hashtables when needed)

irmin/src/irmin/lru.ml

Lines 78 to 84 in 3222d26

type 'a t = {
ht : (key * 'a) Q.node HT.t;
q : (key * 'a) Q.t;
mutable cap : int;
mutable w : int;
weight : 'a -> int;
}

The type q: (key * 'a) Q.t is an issue, but I believe it could be replaced by q: (unit -> unit) Q.t instead (a closure to free an element from whichever hashtable it was in and subtracting the removed weight from the total used space...)

(Note that this is just a thought that popped into my head! I don't think there's a strong incentive to do things differently as the current solution works nicely :) )

@metanivek
Copy link
Member Author

Ah, thanks for sharing your extra thoughts! I didn't think about changing the core LRU implementation too much since I wanted to keep options of switching it in the future (maybe a modified cachecache or some lock-free version, for instance). If it looks too complicated to bring the extensible type into the LRU, I think we should push it out to a future time when we evaluate the actual LRU implementation.

metanivek added 6 commits July 4, 2023 14:22
Previously, an `irmin-pack` store had one LRU per `Pack_store`,
resulting in three LRUs corresponding to the three object stores:
Commit, Node, and Content.  This commit changes an `irmin-pack` store to
only have one LRU that is shared by each object store. The motivation of
this change is to make configuring the LRU size more intuitive for users
of `irmin-pack`.
@metanivek metanivek merged commit 78f6cc1 into mirage:main Jul 4, 2023
@metanivek metanivek deleted the new_lru branch July 4, 2023 19:19
metanivek added a commit to metanivek/opam-repository that referenced this pull request Jul 4, 2023
…min-pack, irmin-pack-tools, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.8.0)

CHANGES:

### Added

- **irmin**
  - Change behavior of `Irmin.Conf.key` to disallow duplicate key names by
    default. Add `allow_duplicate` optional argument to override. (mirage/irmin#2252,
    @metanivek)

- **irmin-pack**
  - Add maximum memory as an alternative configuration option, `lru_max_memory`,
    for setting LRU capacity. (@metanivek, mirage/irmin#2254)

### Changed

- **irmin**
  - Lower bound for `mtime` is now `2.0.0` (mirage/irmin#2166, @patricoferris)

- **irmin-mirage-git**
  - Lower bound for `mirage-kv` is now `6.0.0` (mirage/irmin#2256, @metanivek)

### Fixed

- **irmin-cli**
  - Changed `--store irf` to `--store fs` to align the CLI with what is
    published on the Irmin website (mirage/irmin#2243, @wyn)
tezoslibrarian pushed a commit to tezos/tezos-mirror that referenced this pull request Dec 15, 2023
Irmin 3.8 unifies 3 LRUs into 1 so we allow 3 more items into the unified LRU

See mirage/irmin#2254
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants