This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Get rid of parity-util-mem dependency #12658

Closed
ordian opened this issue Nov 9, 2022 · 16 comments · Fixed by #12795
Assignees
Labels
I7-refactor Code needs refactoring. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. Z6-mentor An easy task where a mentor is available. Please indicate in the issue who the mentor could be.

Comments

@ordian
Member

ordian commented Nov 9, 2022

Originally reported in #12657 (comment)

Steps:

  1. Identify all the usages: https://cs.github.com/paritytech/substrate?q=malloc_size
     At the moment there appear to be two.
  2. Replace the usages with an estimation. It could be something like
    /// Internal trait similar to `heapsize` but using
    /// a simple estimation.
    ///
    /// This should not be made public; it is an implementation
    /// detail trait. If it needs to become public, please
    /// consider using `malloc_size_of` instead.
    trait EstimateSize {
        /// Return a size estimation of the additional size needed
        /// to cache this struct (in bytes).
        fn estimate_size(&self) -> usize;
    }
    or, more brittle:
  • For each collection, you can estimate the size as .capacity() * mem::size_of::<T>().
  • The only problematic case is nested data structures, like pinned_insertions: HashMap<BlockHash, (Vec, u32)>. Here the value size is variable.
    A workaround could be adding a struct alongside NonCanonicalOverlay that tracks the size of its pinned_insertions field on every insertion/deletion, or simply assuming a maximum size.
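For illustration, a rough sketch of what that could look like. This is only an example, not actual substrate code: the impls below and the use of u64 as a stand-in for BlockHash are assumptions.

```rust
use std::collections::HashMap;
use std::mem;

/// Internal trait similar to `heapsize` but using a simple estimation.
trait EstimateSize {
    /// Return a size estimation of the additional heap size needed
    /// to cache this struct (in bytes).
    fn estimate_size(&self) -> usize;
}

impl EstimateSize for Vec<u8> {
    fn estimate_size(&self) -> usize {
        // Flat buffer: the estimate is O(1) from the capacity.
        self.capacity()
    }
}

// The nested case: value sizes vary, so we walk the entries.
// `u64` stands in for `BlockHash` here purely for the example.
impl EstimateSize for HashMap<u64, (Vec<u8>, u32)> {
    fn estimate_size(&self) -> usize {
        self.capacity() * mem::size_of::<(u64, (Vec<u8>, u32))>()
            + self.values().map(|(v, _)| v.capacity()).sum::<usize>()
    }
}
```

The nested impl is O(n), which is exactly the brittleness the second bullet above is pointing at; the alternative is tracking the size incrementally on insertion/deletion.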

Once the usage is replaced, to get rid of the dependency completely, we'd need to remove it from a number of crates in https://github.com/paritytech/trie and https://github.com/paritytech/parity-common.

@ordian ordian added I7-refactor Code needs refactoring. Z6-mentor An easy task where a mentor is available. Please indicate in the issue who the mentor could be. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. E3-dependencies labels Nov 9, 2022
@koute
Contributor

koute commented Nov 10, 2022

Just a note: when doing this we should not forget to add jemalloc back to the relevant binary crates (since it is currently included through parity-util-mem).

@bkchr
Member

bkchr commented Nov 10, 2022

We never had jemalloc enabled anywhere in Substrate. It has always required that downstream users enable jemalloc in their bins.

@mustermeiszer
Contributor

Looking forward to having this!

@mrcnski

mrcnski commented Nov 23, 2022

I can pick this up. Some questions:

  • What was this crate for and why is it bad?
  • Are we deleting this crate entirely from parity-common or just the references to it?

@ordian
Member Author

ordian commented Nov 23, 2022

What was this crate for

Tracking size of allocated memory given a reference in O(1) time.
I vaguely remember some code in the parity-ethereum client implementing logic similar to https://github.com/paritytech/memory-lru (some cache limited in memory size). memory-lru is currently used in the runtime-api subsystem cache in polkadot.
Other than that, it's used as a (built-in) poor man's memory profiler, printing granular memory information periodically.
Because of the way it works, it's responsible for setting the global allocator via a feature flag.

why is it bad?

It has several issues.

The biggest one is probably the soundness issue (follow the links from the PR) caused by having multiple versions of the crate, which prompted us to ban duplicates: paritytech/parity-common#363. Because we ban duplicates, this causes a lot of trouble for parachain teams (and previously for subxt users as well).

Another big issue is how breaking changes cascade (as it's used in many public APIs itself) once we modify one of its dependencies. Related: paritytech/parity-common#377 (but not only that).

Are we deleting this crate entirely from parity-common or just the references to it?

First the references, then the crate entirely. I don't think anyone really depends on this crate. The original code comes from Servo. Some of it was extracted for general-purpose use, but doesn't seem maintained: https://github.com/bholley/malloc_size_of_derive.

The tests that were introduced in #4822 can be removed while we remove references.

@mrcnski

mrcnski commented Nov 24, 2022

So we considered removing the mem_used info completely. This is where it's actually used:

if constraints.max_mem.map_or(false, |m| pruning.mem_used() > m) {
    break
}

But looks like that is always 0...

pub fn mem_used(&self) -> usize {
    0
}

I did a double-take, but that looks like the method being called. So I think that break above is never triggered.
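A minimal reproduction of that check (with stubbed-out names, not the actual state-db code) shows the condition can never be true while mem_used() returns 0:

```rust
// Stub mirroring the state-db method that unconditionally returns 0.
fn mem_used() -> usize {
    0
}

// Same shape as the constraint check in the pruning loop.
fn should_break(max_mem: Option<usize>) -> bool {
    // `map_or(false, ...)` means: no limit configured => never break;
    // limit configured => break only if usage exceeds it, which 0 never does.
    max_mem.map_or(false, |m| mem_used() > m)
}
```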

I'm wondering whether we can just remove PruningMode::Constrained, because I ran into a wall trying to estimate the size of this thing:

enum DeathRowQueue<BlockHash: Hash, Key: Hash, D: MetaDb> {
    Mem {
        /// A queue of keys that should be deleted for each block in the pruning window.
        death_rows: VecDeque<DeathRow<BlockHash, Key>>,
        /// An index that maps each key from `death_rows` to a block number.
        death_index: HashMap<Key, u64>,
    },
    DbBacked {
        // The backend database
        #[ignore_malloc_size_of = "Shared data"]
        db: D,
        /// A queue of keys that should be deleted for each block in the pruning window.
        /// Only caching the first few blocks of the pruning window; blocks inside are
        /// successive and ordered by block number.
        cache: VecDeque<DeathRow<BlockHash, Key>>,
        /// A soft limit of the cache's size
        cache_capacity: usize,
    },
}

There's a couple of nested data structures there, as well as a MetaDb thing. 😱 (edit: just noticed the annotation to ignore the memory of the db)

But anyway, it seems like the memory-constrained pruning is currently a no-op? If so, we can remove the mem-used info and don't have to bother with the estimations. Not sure if the constrained mode is ever set anyway; looks like the mode ultimately comes from a config.

cc @arkpar

@arkpar
Member

arkpar commented Nov 25, 2022

Memory constraint for state-db is currently not used, as in not exposed to the CLI options. There was not much demand for it, as having a predictable pruning history size is preferable. I guess we can remove it.

@mrcnski

mrcnski commented Nov 25, 2022

Current Summary

substrate

Removed the dep from substrate as well as the mem info, but I ran into this error:

note: required by a bound in `KeyValueDB`
   --> /Users/parity/.cargo/registry/src/github.com-1ecc6299db9ec823/kvdb-0.12.0/src/lib.rs:107:37
    |
107 | pub trait KeyValueDB: Sync + Send + parity_util_mem::MallocSizeOf {
    |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `KeyValueDB`

so we will need to remove the dep from kvdb in parity-common, first.

polkadot (please help)

Removing the dep from polkadot was going well (apart from the same kvdb error).

Then I ran into this ResidentSize thing:

https://github.com/paritytech/polkadot/blob/master/node/core/runtime-api/src/cache.rs#L54-L58

Removing all references to that thing results in a million errors, because MemoryLruCache from memory-lru requires it. And if we were to keep ResidentSize here, there's a million types here we would need to estimate the size for.

Also, I have a question about whether we need parity-util-mem just to maintain backward compatibility? e.g. this type:

https://github.com/paritytech/polkadot/blob/master/primitives/src/v2/mod.rs#L1765-L1769

Also 2: Note that I had to remove some collect_memory_stats thing from the overseer, as well.

parity-common

Removed the dep here easily enough.

trie (please help)

Started removing the dep from trie, but found that the memory-db crate is pretty much built around it. memory-db seems pretty heavily used in the trie repo (and is on crates.io) so I will need some guidance on what to do here.

@bkchr
Member

bkchr commented Nov 25, 2022

Removing all references to that thing results in a million errors, because MemoryLruCache from memory-lru requires it. And if we were to keep ResidentSize here, there's a million types here we would need to estimate the size for.

You will need to implement some new trait, maybe directly this ResidentSize trait. However, for now it could probably be done local to the trait declaration?

@ordian
Member Author

ordian commented Nov 26, 2022

TBH, I feel like we should get rid of the memory-lru usage for the runtime-api cache. Instead, use a regular LRU cache with a reasonable number of entries we can keep in memory for each type.
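To illustrate the entry-count-bounded idea, here's a from-scratch sketch (not memory-lru, not the actual polkadot code, and deliberately simpler than a real LRU: it evicts the oldest insertion and doesn't reorder on access):

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// A minimal entry-count-bounded cache: once `capacity` entries are
/// stored, inserting a new key evicts the oldest inserted key.
struct BoundedCache<K: Eq + Hash + Clone, V> {
    capacity: usize,
    map: HashMap<K, V>,
    order: VecDeque<K>,
}

impl<K: Eq + Hash + Clone, V> BoundedCache<K, V> {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn insert(&mut self, key: K, value: V) {
        if !self.map.contains_key(&key) {
            if self.order.len() == self.capacity {
                // At capacity: drop the oldest insertion.
                if let Some(oldest) = self.order.pop_front() {
                    self.map.remove(&oldest);
                }
            }
            self.order.push_back(key.clone());
        }
        self.map.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }
}
```

The point is that the bound is on the number of entries, not bytes, so no per-type size estimation is needed.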

@ordian
Member Author

ordian commented Nov 26, 2022

Also, I have a question about whether we need parity-util-mem just to maintain backward compatibility?

No, we don't. In fact this type can be removed completely along with the old runtime API, but that's out of scope for this PR.

@ordian
Member Author

ordian commented Nov 26, 2022

but found that the memory-db crate is pretty much built around it

What do you mean? Why can't we just remove it? IIRC the malloc_tracker is there only as an optimization to allow for O(1) calculation. But the size_of is used for informational purposes only AFAIK, so it can be removed unless I missed something.

@mrcnski

mrcnski commented Nov 28, 2022

but found that the memory-db crate is pretty much built around it

What do you mean? Why can just remove it? IIRC the malloc_tracker is there only as an optimization to allow for O(1) calculation. But the size_of is only used for information purposes only AFAIK, so can be removed unless I missed something.

@ordian Okay, just raised the PR. I was seeing the code for the first time and it looked like the mem tracking was all over it. But yeah, in the end it was unused.

paritytech/trie#172

Should I also bump the versions etc. in that PR?

@mrcnski

mrcnski commented Nov 28, 2022

Removing all references to that thing results in a million errors, because MemoryLruCache from memory-lru requires it. And if we were to keep ResidentSize here, there's a million types here we would need to estimate the size for.

You will need to implement some new trait, maybe directly this ResidentSize trait. However, for now it could probably be done local to the trait declaration?

To do this we would need to provide size estimates for all the types, right? Doing that seems non-trivial, as some of the types are e.g. Vecs of Vecs, or even more complex. (That rhymed!)
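To make the "Vecs of Vecs" pain concrete, here's a hypothetical estimator (the function name and shape are made up for the example): a flat Vec is O(1) from its capacity, but nesting forces an O(n) walk because each inner buffer has its own capacity.

```rust
use std::mem;

// Hypothetical estimator for a nested structure: outer bookkeeping
// plus a walk over every inner buffer's capacity.
fn estimate_nested(v: &[Vec<u8>]) -> usize {
    v.len() * mem::size_of::<Vec<u8>>()
        + v.iter().map(|inner| inner.capacity()).sum::<usize>()
}
```

And this is the easy case; deeper nesting or maps of tuples compound the problem.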

TBH, I feel like we should get rid of memory-lru usage for runtime-api cache. Instead use a regular lru cache with a reasonable enough number of entries we can keep in memory for each type.

This makes sense to me. I just need help picking the number of entries, then I'll be good to go. Maybe 1024 for all? Current sizes in bytes:

https://github.com/paritytech/polkadot/blob/master/node/core/runtime-api/src/cache.rs#L31-L50

@ordian
Member Author

ordian commented Nov 29, 2022

Maybe 1024 for all?

I'd prefer different sizes. For example, SessionInfo can be 10s or 100s of KiB, and we cache it in the RollingSessionWindow anyway. Maybe 128?

SessionIndex is tiny and caching 1000s of them is fine. I don't think we need to cache more than 1k items in general if we assume two forks and up to 500 blocks finality lag.
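As a back-of-the-envelope sketch of that sizing (the constant names here are made up for the example; the figures are the assumptions stated above):

```rust
// Assumed worst case: two concurrent forks, up to 500 blocks finality lag.
const ASSUMED_FORKS: usize = 2;
const MAX_FINALITY_LAG_BLOCKS: usize = 500;

// Upper bound on distinct items worth caching at once for cheap types.
const MAX_CACHED_ITEMS: usize = ASSUMED_FORKS * MAX_FINALITY_LAG_BLOCKS;

// SessionInfo is large (10s-100s of KiB each), so cap it much lower.
const SESSION_INFO_CACHE_ENTRIES: usize = 128;
```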

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-release-analysis-v0-9-36/1529/1


7 participants