steven_blocks: could not compile under Windows? memory allocation of 15728640 bytes failed, STATUS_STACK_BUFFER_OVERRUN #184
Comments
This looks similar to the issue I was hitting in #174. Unfortunately, steven_blocks requires a lot of memory to compile (I've seen it peak at 3.3 GB) because of the recursive macro expansion. @woodgear How much memory do you have? I test building on a 32 GB macOS system, and build on SourceHut (Linux) and AppVeyor (Windows), but I'm not sure of their memory limits. I agree this excessive memory requirement is kind of ridiculous (steven_blocks also takes by far the longest time to build). I would like to fix it but have yet to find a way to improve it; it may require rewriting the blocks macro system. |
15728640 bytes = 15 megabytes (15 × 1024 × 1024 bytes). I'm now hitting a similar problem on the same machine with Rust 1.40 (#253): not a memory allocation failure, but an unreasonably long build time (2.5+ hr, 25+ GB). This high memory usage in steven_blocks is a longstanding problem and certainly should be addressed. We may have to rewrite the whole blocks system to use fewer macros, or is there a way it could be optimized to build reasonably quickly? This will only get worse once more blocks are added, and we are behind. But steven_blocks has become quite complex and I'm not sure what to do here; I may need assistance from someone with more expertise with Rust macros… |
Updating to Rust version 1.40 #253 (comment): the bottleneck seems to be in release optimization. Testing to isolate the problem: on nightly with -Z time-passes, the build completes in about 2 minutes. With no optimization on stable, it also takes just under 2 minutes. Returning to nightly and effectively using
|
https://internals.rust-lang.org/t/rust-staticlibs-and-optimizing-for-size/5746/4 mentions https://bugzilla.mozilla.org/show_bug.cgi?id=1386371
Is this still true? How much would steven_blocks benefit from LTO, and is LTO or another optimization causing the 125x slowdown? https://doc.rust-lang.org/nightly/rustc/codegen-options/index.html#opt-level
Worth noting: steven itself has this optimization profile in Cargo.toml:
[profile.dev]
# Steven runs horrendously slow with no optimizations, and often freezes.
# However, building with full -O3 optimizations takes too long for a debug build.
# Use an -O1 optimization level strikes a good compromise between build and program performance.
opt-level = 1
Should steven_blocks have its Cargo.toml specify profile.release opt-level=1 (or 2)? Still waiting on the clean |
Completed in 5 hours 9 minutes this time:
15574.417 is spent on "codegen passes [2w3r8ppd44eh9kz5]". Other hits for 2w3r8ppd44eh9kz5:
15771.815 seconds = 4.38 hours on "codegen passes". This again points to codegen options. |
waiting 12+ minutes, 1.48 GB+ and growing. Killed. update: rerunning, 3.7+ hours, 20+ GB, before I killed it again. Whatever optimization pass is slow, it occurs during opt-level=2. Try this: https://users.rust-lang.org/t/improve-compile-time-and-executable-size-by-counting-lines-of-llvm-ir/14203 https://github.com/dtolnay/cargo-llvm-lines |
get_model, get_flat_offset, get_hierarchical_data, get_collision_boxes, get_model_variant are among the largest in terms of IR lines. All these functions
#[allow(unused_variables)]
pub fn get_model(&self) -> (String, String) {
    match *self {
        $(
            Block::$name {
                $($fname,)?
            } => {
                let parts = $model;
                (String::from(parts.0), String::from(parts.1))
            }
        )+
    }
}
Can attributes be used to control optimization? Maybe not, only code generation attributes are: inline, cold, no_builtins, target_feature. Leaning towards reducing the optimization level in the release profile. At least opt-level=1 is reasonably fast, only about 24 seconds slower than opt-level=0. |
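To make the shape of these generated accessors concrete, here is a minimal sketch of a macro that expands to one enum plus one `get_model` function containing a match arm per block. It is not the real define_blocks! macro: the reduced syntax, the two example blocks (`Air`, `Stone`), the `polished` field, and the model strings are all assumptions made up for illustration.

```rust
// Hypothetical, much-reduced stand-in for define_blocks!.
// Every block contributes one enum variant and one match arm, so the
// generated get_model body grows linearly with the number of blocks.
macro_rules! define_blocks {
    ($($name:ident { $($fname:ident : $ftype:ty),* } => $model:expr,)+) => {
        #[derive(Debug, Clone, PartialEq)]
        pub enum Block {
            $($name { $($fname: $ftype),* },)+
        }

        impl Block {
            #[allow(unused_variables)]
            pub fn get_model(&self) -> (String, String) {
                match *self {
                    $(
                        // Fields are bound by name so $model can refer to them.
                        Block::$name { $($fname),* } => {
                            let parts = $model;
                            (String::from(parts.0), String::from(parts.1))
                        }
                    )+
                }
            }
        }
    };
}

// Illustrative block list; the real crate defines several hundred blocks.
define_blocks! {
    Air {} => ("steven", "air"),
    Stone { polished: bool } => (
        "minecraft",
        if polished { "polished_stone" } else { "stone" }
    ),
}
```

With only two blocks the expansion is tiny, but each additional block adds another arm to every accessor generated this way, which would be consistent with these functions showing up as the largest in the cargo-llvm-lines output.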
steven_blocks builds fast in |
https://doc.rust-lang.org/cargo/reference/manifest.html#the-profile-sections
|
rust-lang/cargo#1359 added an option to optimize just the dependencies, through config profiles: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#config-profiles (.cargo/config), which take precedence over manifest files (Cargo.toml). However it is only on nightly. Looks like it will be stabilized in 1.41: https://github.com/rust-lang/cargo/pull/7591/files |
Added config profile, testing with |
https://users.rust-lang.org/t/5-hours-to-compile-macro-what-can-i-do/36508
Maybe there is a better way to fix this. |
* Remove all blocks
  rm -rf target/release/deps/steven_blocks-* ; time cargo build --release
  0.188s
* Restore all blocks
* Enable macro expansion tracing
  Requires nightly, `rustup default nightly`
  The define_blocks! macro expands in about 9 seconds:
  note: trace_macro
      --> blocks/src/lib.rs:486:1
       |
  486  | / define_blocks! {
  487  | |     Air {
  488  | |         props {},
  489  | |         material material::Material {
  ...  |
  5601 | |     }
  5602 | | }
       | |_^
       |
       = note: expanding `define_blocks! { Air [............]
  note: trace_macro
     --> blocks/src/lib.rs:930:17
      |
  930 | variant format!("extended={},facing={}", extended, facing.as_string()),
      |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
      = note: expanding `format! { "extended={},facing={}", extended, facing . as_string () }`
      = note: to `$crate :: fmt :: format ($crate :: __export :: format_args ! ("extended={},facing={}", extended, facing . as_string ()))`
  ^C
  Building [=====================================================> ] 317/320: steven_blocks
  real 0m9.936s
  user 0m5.460s
  sys 0m0.444s
* Remove block dependencies
* Remove only all blocks
* Restore some blocks, finishes in about 2s
  Removes 1000 lines from bottom
  real 1m43.476s
  user 4m14.319s
  sys 0m8.194s
* steven_blocks: update lazy_static to 1.4.0
* cargo update steven_blocks
* Revert "Restore some blocks, finishes in about 2s"
  This reverts commit 04aef6a.
* Revert "Remove only all blocks", -Z time-passes shows 85s
  This reverts commit 5e951f5.
  time cargo rustc -- -Z time-passes
  time: 13.535 item-bodies checking
  time: 26.370 metadata encoding and writing
  time: 28.031 MIR borrow checking
  time: 3.550 LLVM passes
  time: 85.748 total
* Revert "Remove block dependencies", finishes in ~2m
  real 1m52.989s
  user 2m18.885s
  sys 0m8.025s
  blocks $ cargo clean ; time cargo rustc -- -Z time-passes
  This reverts commit 32655a3.
* Restore all dependencies with git diff master src
* Revert "Enable macro expansion tracing", 1m49s
  This reverts commit 7461615.
* Revert cargo update, but fails: cannot find function `pthread_atfork` in crate `libc`
  error[E0425]: cannot find function `pthread_atfork` in crate `libc`
     --> .cargo/registry/src/github.com-1ecc6299db9ec823/rand-0.6.5/src/rngs/adapter/reseeding.rs:320:28
      |
  320 | unsafe { libc::pthread_atfork(None, None, Some(fork_handler)) };
      |                ^^^^^^^^^^^^^^ not found in `libc`
  Compiling quote v0.6.10
  Compiling syn v0.15.23
  error: aborting due to previous error
  For more information about this error, try `rustc --explain E0425`.
  error: could not compile `rand`.
  warning: build failed, waiting for other jobs to finish...
  error: build failed
  real 0m7.047s
  user 0m20.843s
  sys 0m2.494s
  steven_blocks v0.0.1 (steven/blocks)
  ├── cgmath v0.17.0
  │   ├── approx v0.3.2
  │   │   └── num-traits v0.2.6
  │   ├── num-traits v0.2.6 (*)
  │   ├── rand v0.6.5
  │   │   ├── libc v0.2.9
* Revert "Revert cargo update, but fails: cannot find function `pthread_atfork` in crate `libc`"
  This reverts commit 326dd2a.
* steven_blocks: reduce optimizations in release, closes #184
* Set top-level opt-level=1, compiles in ~5 min
* Revert "Set top-level opt-level=1, compiles in ~5 min"
  This reverts commit 391041e.
* Add config profile for steven_blocks
* Use profile-overrides feature instead of config profile
* Update to nightly Rust until 1.41 (1/30/2020)
* Remove comment
* Update builds.sr.ht to use +nightly
* .build.yml: install nightly
* Update to rustc 1.42.0-nightly (760ce94c6 2020-01-04)
* Replace into_iter() -> iter() for arrays, fixes 1.42-nightly warning
  warning: this method call currently resolves to `<&[T; N] as IntoIterator>::into_iter` (due to autoref coercions), but that might change in the future when `IntoIterator` impls for arrays are added.
     --> src/entity/player.rs:363:11
      |
  363 | ].into_iter().enumerate() {
      |   ^^^^^^^^^ help: use `.iter()` instead of `.into_iter()` to avoid ambiguity: `iter`
      |
      = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
      = note: for more information, see issue #66145 <rust-lang/rust#66145>
* Remove deprecated Error description, replaced by Display
  warning: use of deprecated item 'std::error::Error::description': use the Display impl or to_string()
     --> src/protocol/mod.rs:981:40
      |
  981 | Error::IOError(ref e) => e.description(),
      |                            ^^^^^^^^^^^
      |
      = note: `#[warn(deprecated)]` on by default
* Remove cargo-feature because profile-overrides is stable in nightly
  warning: the cargo feature `profile-overrides` is now stable and is no longer necessary to be listed in the manifest
From https://wiki.alopex.li/TheStateOfGGEZ2020:
|
Improves the fix for #184. Whereas #255 reduced optimizations, we now address the underlying compiler limitation by splitting the one massive lazy_static! initialization function into one function per block, in the block_registration_functions module.

Previous build time, with opt-level=1:
% time cargo build --release
Compiling steven_blocks v0.0.1
Finished release [optimized] target(s) in 21.24s
cargo build --release 31.80s user 0.71s system 152% cpu 21.276 total

With this change, opt-level=3 and the function splitting fix:
% time cargo build --release
Compiling steven_blocks v0.0.1
Finished release [optimized] target(s) in 30.80s
cargo build --release 40.26s user 0.86s system 133% cpu 30.850 total

Full optimizations are, as expected, slightly slower, but this is still much much _much_ faster than before this refactoring, when this crate could take up to an unbelievable 5 hours (and tens of GB of RAM). Long story short, we're now back to full optimizations and stable Rust.

Thanks to dtolnay on the Rust programming language forum for suggesting this technique: https://users.rust-lang.org/t/5-hours-to-compile-macro-what-can-i-do/36508/2
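The function-splitting idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual steven_blocks code: the `register_blocks!` macro, the three-variant `Block` enum, the numeric ids, and the `block_by_id` lookup are hypothetical, and `std::sync::OnceLock` is swapped in for lazy_static! so the sketch needs no external crates. Only the module name block_registration_functions comes from the description above.

```rust
use std::sync::OnceLock;

// Hypothetical, tiny block set standing in for the real one.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Block {
    Air,
    Stone,
    Dirt,
}

// Assumed macro name and syntax. The key point: each block gets its own
// small function, and the initializer only contains calls to them, so the
// optimizer never has to process one enormous function body.
macro_rules! register_blocks {
    ($($name:ident => $id:expr,)+) => {
        #[allow(non_snake_case)]
        mod block_registration_functions {
            use super::Block;
            $(
                pub fn $name(table: &mut Vec<Option<Block>>) {
                    let id: usize = $id;
                    if table.len() <= id {
                        table.resize(id + 1, None);
                    }
                    table[id] = Some(Block::$name);
                }
            )+
        }

        // Initializer: one short call per block instead of inlined bodies.
        fn build_table() -> Vec<Option<Block>> {
            let mut table = Vec::new();
            $( block_registration_functions::$name(&mut table); )+
            table
        }
    };
}

register_blocks! {
    Air => 0,
    Stone => 1,
    Dirt => 3,
}

// Lazily built lookup table; OnceLock plays the role lazy_static! has in
// the original crate.
pub fn block_by_id(id: usize) -> Option<Block> {
    static TABLE: OnceLock<Vec<Option<Block>>> = OnceLock::new();
    TABLE.get_or_init(build_table).get(id).copied().flatten()
}
```

In this sketch `block_by_id(1)` yields `Some(Block::Stone)` and an unregistered id such as 2 yields `None`. The total amount of generated code is about the same as with a single initializer; what changes is that it is spread over many small functions, which is the property the timings above suggest the optimizer handles far better.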
What should I do?