Speed up big chainspec json (~1.5 GB) load #10137
Conversation
client/chain-spec/src/chain_spec.rs (Outdated)
.map_err(|e| format!("Error opening spec file `{}`: {}", path.display(), e))?;
// We read the entire file into memory first, as this is *a lot* faster than using
// `serde_json::from_reader`. See https://github.com/serde-rs/json/issues/160
let bytes = std::fs::read(&path).map_err(|e| format!("Error reading spec file: {}", e))?;
It'd most likely be better to `mmap` the file than to read it whole into memory (especially if it can be a few gigs in size). Could you try mmapping it through the `memmap2` crate (we already use it transitively as a dependency through `parity-db` anyway) instead? (It should be roughly as fast, but at a significantly lower memory usage.)
I mean, having multiple gigabytes in this file is rather rare. However, I'm fine with making this change if we are going to change this here anyway. @koute could you maybe push the required changes to this PR?
Yep, I know. Still, apparently some people do it, as evidenced by this PR. (:
Sure; done!
Ty, if you now approve your own changes @koute we should be ready with this PR :D
Does wrapping the
From this thread: serde-rs/json#160 it seems that With
It could even be faster in certain cases, since loading the file can then run in parallel with parsing it, but yeah, it highly depends on the situation.
bot merge
* Speed up chainspec json load
* Update client/chain-spec/src/chain_spec.rs
* Update client/chain-spec/src/chain_spec.rs
* Update client/chain-spec/src/chain_spec.rs
* Load the chainspec through `mmap`

Co-authored-by: icodezjb <icodezjb@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Jan Bujak <jan@parity.io>
We export our chainspec to `fork.json` with the `export-state` subcommand of Substrate; the resulting `fork.json` is a 1.5 GB JSON file.

In `ChainSpec::from_json_file`, loading `fork.json` with `serde_json::from_reader` takes ~15 minutes, while reading the file into memory and parsing it with `serde_json::from_slice` takes only ~2 s. See serde-rs/json#160.