-
Notifications
You must be signed in to change notification settings - Fork 1.7k
enable lto, panic = "abort" in release build #10356
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome but it increases the build-time significantly (roughly 2x) for me but definitely worth the effort, however I'm not really aware of the exact differences between panic=unwind
and panic=abort
I'll leave it to the pro's
# this branch
➜ parity-ethereum git:(optimize-release-build) ✗ size --format=SysV ./target/release/parity
size --format=SysV ./target/release/parity
./target/release/parity :
section size addr
.interp 28 736
.note.ABI-tag 32 764
.note.gnu.build-id 36 796
.gnu.hash 1004 832
.dynsym 9744 1840
.dynstr 9346 11584
.gnu.version 812 20930
.gnu.version_r 768 21744
.rela.dyn 1237392 22512
.rela.plt 4584 1259904
.init 23 1265664
.plt 3072 1265696
.plt.got 504 1268768
.text 25521701 1269312
.fini 9 26791016
.rodata 4913113 26791936
.eh_frame_hdr 168428 31705052
.eh_frame 1142048 31873480
.gcc_except_table 66692 33015528
.tdata 818 33089920
.tbss 680 33090752
.init_array 1064 33090752
.fini_array 8 33091816
.data.rel.ro 980640 33091840
.dynamic 624 34072480
.got 5616 34073104
.data 11408 34078720
.bss 31744 34090144
.comment 56 0
Total 34111994
# master
parity-ethereum git:(master) ✗ size --format=SysV ./target/release/parity
./target/release/parity :
section size addr
.interp 28 736
.note.ABI-tag 32 764
.note.gnu.build-id 36 796
.gnu.hash 1004 832
.dynsym 9960 1840
.dynstr 9537 11800
.gnu.version 830 21338
.gnu.version_r 800 22168
.rela.dyn 1708824 22968
.rela.plt 4608 1731792
.init 23 1736704
.plt 3088 1736736
.plt.got 504 1739824
.text 28453253 1740352
.fini 9 30193608
.rodata 8373294 30195712
.eh_frame_hdr 525348 38569008
.eh_frame 3148480 39094360
.gcc_except_table 1451248 42242840
.tdata 752 43701120
.tbss 672 43701888
.init_array 1064 43701888
.fini_array 8 43702952
.data.rel.ro 1221272 43702976
.dynamic 624 44924248
.got 49208 44924872
.data 11466 44974080
.bss 31760 44985568
.comment 56 0
Total 45007788
Don't we
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time when we disabled abort, besides building fails, one of the other reason is that we got bug reports that some state trie items are missing, or blocks are missing, which triggers the client to panic. One of the probable reason is that the client doesn't shutdown correctly (through another panic, for example), and left in the middle of an import. In that case, we can be in a situation where we commit some metadata, but not actually committed the block itself, or we may in the middle of committing the state trie. Using rewind would potentially allows the shutdown to be happen more gracefully in case a panic happened (while the panicking thread would fail, other threads can receive the shutdown signal and do stuff if needed), but I also haven't really done deep analysis of how much benefits we would gain with rewind.
We also discussed enabling lto last time. The reason why we didn't at that time was that devops worried that lto would make release build machine fail, because it can take a large amount of memory. But if devops confirm that this is fine now (if we have upgraded our machines!), then this shouldn't be an issue any more.
Regarding LTO:
Looks like the overall time for building + running tests hasn't changed much. Max RSS does not seem to be too large. Now on to
|
When I last checked (around 2017) Anyway, panic and abort in the middle of an import won't lead to main database corruption because transaction atomicity is guaranteed by rocksdb, and as far as I know all of the block import is done in a single transaction. |
Ok, it seems like we all agree on the LTO change, if someone from devops can confirm it won't kill our machines (@gabreal ?).
This matches what @debris said here:
If we agree on |
As far as I can see, there should be no problem of memory anymore, but be aware that the test job run time increases approx about 7min (for this job there is only one sample). @TriplEight might also like to know. |
tests are run in release mode afaik. |
Thanks for mentioning @gabreal .
Yes and here. And would be really good if we could avoid it to make this features work only for the release builds. Currently merging this will slow down our already slow test pipelines. Please read about
Regarding
That way we are refusing bug stack traces from users. |
That's why I suggested to only backport it to stable and beta, w/o merging into master.
catch panics != stack traces, we'll see stack traces |
That way it makes sense @ordian . |
Just had a talk with Pierre. He suggested that we should set the
that way we can get rid of |
It would make tests faster to compile but slower to run, I think the reason we're using |
I'll try this |
In fact I changed that a while ago to have the tests perform as close as possible as they are used in the wild. |
94e5824
to
3866c81
Compare
3866c81
to
e91ff1f
Compare
I've split the change into two commits, the second commit also removes the My suggestion is to backport the first commit to stable and both to beta. |
@gabreal @ordian a comparsion report: https://drive.google.com/open?id=159CBQGiKOPz9JBfda_n9bLFNF10qIif3Fyz5DoyWBjk |
* fix broken sync * correct seal fields * ethcore: fix comment * parity: remove duplicate params * clique: fix whitespaces * ethcore: fix goerli chain spec * refactor signer_snapshot into pending/finalized state * move close_block_extra_data after seal is applied * refactor most of the logic into the signer_snapshot * clique: refactor locking logic out of the consensus engine interface * Fix jsonspec and add an unittest * Replace space with tabs * Unbroke sync * Fix broken sync * 1/2 state tracking without votes * 2/2 implement vote tracking * ci: use travis for goerli * ci: setup a clique network * ci: sync a görli node * add clique deploy script * ci: fix paths in clique deploy script * ci: use docker compose * ci: fix travis job names * ci: fix build deps * ci: massively reduce tests * Revert "ci: massively reduce tests" This reverts commit 6369f0b. * ci: run cargo test directly * ci: separate build and test stages * ci: cache rust installation * ci: simplify ci stages * ci: make clique deploy script executable * ci: shutdown goerli sync after 20min * ci: remove slow sync stage * ci: use timeout to finish jobs * ci: fix build path * ci: use absolute paths to end this confusion * ci: add geth and parity to path * ci: be more verbose * ci: allow for more relaxed caching timeout * ci: update repositories for custom ppa * ci: fix typo in file name * ci: fix docker compose file * ci: add ethkey to docker * ci: make sure deploy script is up to date with upstream * ci: stop docker container after certain time * ci: force superuser to update permissions on docker files * ci: reduce run time of script to ~30 min * ci: remove duplicate caching in travis * remove trace statements * clique: add more validation involving the recent signer list * ethcore: enable constantinople for rinkeby * ethcore: fix whitespaces in rinkeby spec * ethcore: reformat goerli.json * Revert "ci: remove duplicate caching in travis" This reverts commit a562838. * tmp commit * another tmp commit * it builds! * add sealing capabilities * add seal_header hook to allow separation of block seal/importing code paths * clique: remove populate_from_parent. * add panic * make turn delay random * initialize OpenBlock properly in 'enact' * misc: remove duplicate lines * misc: fix license headers * misc: convert spaces to tabs * misc: fix tabs * Update Cargo.toml * Update Cargo.toml * Update Cargo.toml * clique: ensure validator restores state before trying to seal * clique: make 'state' return an Error. Make some error messages more clear * Fix compile error after rebase & toolchain upgrade * fix a bunch of import warnings * Refactor code * Fix permissions * Refactoring syncing * Implement full validator checks * Refactor util functions to seperate file * mining 1 * ethcore: add chainspec for kotti * ethcore: rename pre-goerli configs * ethcore: load kotti chain spec * cli: add kotti to params * Implement working local sealing * making sealing & syncing work together * Relax timestamp checking * ethcore: prepare for the real goerli to launch * Implement NOTURN wiggle properly & cleanupnup warnings * Implement vote casting * Update docs & skip signing if no signer * Optimize step-service interval * Record state on local sealed block * Fix script filemode * Cleaning up codebase * restore enact trace logging * Delete clique.sh and move sync.sh * remove travis.yml * Remove dead code * Cleanup compile warning * address review comments * adding more comments and removing unwrap() * ci: remove sync script * Address review comments * fix compile error * adding better debugging for timing * Implement an dedicated thread for sealing timing * fix(add helper for timestamp overflows) (#10330) * fix(add helper timestamp overflows) * fix(simplify code) * fix(make helper private) * snap: official image / test (#10168) * official image / test * fix / test * bit more necromancy * fix paths * add source bin/df /test * add source bin/df /test2 * something w paths /test * something w paths /test * add source-type /test * show paths /test * copy plugin /test * plugin -> nil * install rhash * no questions while installing rhash * publish snap only for release * fix(docker): fix not receives SIGINT (#10059) * fix(docker): fix not receives SIGINT * fix: update with reviews * update with review * update * update * Don't add discovery initiators to the node table (#10305) * Don't add discovery initiators to the node table * Use enums for tracking state of the nodes in discovery * Dont try to ping ourselves * Fix minor nits * Update timeouts when observing an outdated node * Extracted update_bucket_record from update_node * Fixed typo * Fix two final nits from @todr * change docker image based on debian instead of ubuntu due to the chan… (#10336) * change docker image based on debian instead of ubuntu due to the changes of the build container * role back docker build image and docker deploy image to ubuntu:xenial based (#10338) * Bundle protocol and packet_id together in chain sync (#10315) Define a new `enum` where devp2p subprotocol packet ids (currently eth and par) are defined. Additionally provide functionality to query id value and protocol of a given id object. * snap: prefix version and populate candidate channel (#10343) * snap: populate candidate releases with beta snaps to avoid stale channel * snap: prefix version with v* * addressing review comments * engine: fix copyright header * scripts: restore permissions on sign command * ethcore: enforce tabs * ethcore: enforce tabs * ethcore: enforce tabs * addressing comments * addressing comments * addressing more comments * addressing more comments * addressing more comments * addressing more comments * addressing more comments * json-spec: fix clique epoch to non-zero u64 * ci: enable travis for parity goerli * ci: don't separate build and test step * ci: don't run c++ tests on travis * ci: simplify cargo test to squeeze into travis timeout * ci: don't run tests on travis at all * style(fixes) * fix(add tests) * fix(recent_signer bug) * fix(complete all tests) * fix(nits) * fix(simplify asserts) * fix(cliqueState): simplify code * fix(nits) * docs(comments what's need to fixed) * fix(revert unintended changes) * fix(tests) * fix(logs): voting logs * fix(readability + more logs) * fix(sync) * docs(add missing licens header) * fix(log): info! -> trace! * docs(fix nits) + fix(remove assert) * perf(use counter instead of vec) * fix(remove needless block in match) * fix(faulty comment) * grumbles(docs for tests) * fix(nits) * fix(revert_vote): only remove vote when votes == 0 * fix(vote counter): checked arithmetics * fix(simplify tests) * fix(nits) * fix(clique): err types * fix(clique utils): make use of errors * fix(cleanup nits) * fix(clique sealing): don't read state no signer * fix(replace Vec<Signers> with BTreeSet<Signers>) * fix(tests): BTreeSet and more generic helpers * fix(nits) * fix(ethcore_block_seal): remove needless `Box` * fix(faulty log): info -> trace * fix(checked SystemTime): prevent SystemTime panics * style(chain cfg): space after `:` * style(fn enact): fix whitespace * docs(clique): StepService * docs(nit): fix faulty comment * docs(fix typo) * style(fix bad indentation) * fix(bad regex match) * grumble(on_seal_block): make `&mut` to avoid clone * docs(on_seal_block): fix faulty documentation * Delete .travis.yml * docs: remove eth hf references in spec * Update client.rs * fix(nits) * fix(clique step): `RwLock` -> `AtomicBool` * fix(clique): use `Duration::as_millis` * Clean up some Clique documentation
* Add trace information to eth_estimateGas * replace unwrap better version * change vm::Error formatter to more user-friendly * remove extra error format * use map_or instead sequence of map/unwrap_or
To my ears it sounds like we want Panic handling seems more contentious and it sounds like we need more testing and hand-wringing on that one. @ordian can you PR those two changes separately? The diff here is a little confusing with all the Clique code in. |
I'm not sure how to enable it this way, since we run tests with
Yeah, that's due to force rebase on master. |
Forgive me, but I'm slow. Can't we stop running tests that way and instead switch to:
…i.e. switch CI to run |
@dvdplm yes, this thread is hard.
AFAIU we can do that. |
@dvdplm, @ordian something went wrong, I suggest falling back. |
@TriplEight the links are the same there, not sure where to look. Can fix? :) |
@dvdplm I edited in the GitHub UI |
RUSTFLAGS=" -C link-arg=-s" cargo build --release
, -8MB thanks to LTO)Last time we disabled
panic = "abort"
in #8983 (cc @sorpaas) due to failing build.