Too many memcpys #64301
Comments
#64302 shrinks the size of some of these types by 24 bytes. For a clean check build of |
memcpys of obligations, which are large memcpys
I'm going to modify this issue to be about all types that cause many calls to
|
|
This is an interesting case: rust/src/librustc_typeck/check/coercion.rs Lines 551 to 557 in eb48d6b
DHAT tells me that each time this executes (which is often) we do a The code for It calls @SimonSapin: does this make sense to you? Is there a way to only initialize 136*1 + 8 = 144 bytes, instead of 136*4 + 8 = 552 bytes? (BTW, #64302 shrinks |
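The size arithmetic in that question (136*1 + 8 = 144 versus 136*4 + 8 = 552) can be checked with a small sketch. Everything here is an assumption for illustration: `Obligation` is a hypothetical 136-byte stand-in, `InlineVec` is a simplified length-plus-inline-slots container rather than SmallVec's real layout, and the numbers assume a 64-bit target.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical stand-in for a 136-byte obligation (17 * 8 bytes).
#[allow(dead_code)]
pub struct Obligation([u64; 17]);

/// Simplified inline-capacity container: a length plus N inline slots.
/// This is NOT SmallVec's real layout; it only reproduces the arithmetic.
#[allow(dead_code)]
pub struct InlineVec<T, const N: usize> {
    len: usize,
    slots: [MaybeUninit<T>; N],
}

/// Returns the sizes, in bytes, of the 1-slot and 4-slot variants.
pub fn inline_vec_sizes() -> (usize, usize) {
    (
        size_of::<InlineVec<Obligation, 1>>(),
        size_of::<InlineVec<Obligation, 4>>(),
    )
}
```

Moving such a container by value moves the whole struct, unused slots included, which is why the inline capacity matters even when only one element is ever stored.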
@nnethercote does your profiling show "returning structs by value" to be a significant amount of time spent in memcpy? Since Rust doesn't have guaranteed RVO/NRVO (see #32966 for details), I've been trying to get a feel for what kind of performance impact implementing this in MIR would cause. This could perhaps be the cause of your mystery memcpy in #64301 (comment), as |
#64374 was one such case. In general, it doesn't seem like a big fraction of the |
Referencing https://github.com/kvark/copyless here, since it can be used to avoid some extra copies. |
A single 544-bytes This is not the first time I have heard of accidentally moving a large buffer with SmallVec; I feel this is a design flaw of this library. I’ve been pondering an alternative design (perhaps for a separate crate) where the inline buffer is borrowed:
    let mut buffer = std::mem::MaybeUninit::<[Foo; N]>::uninit();
    let queue = SmallSmallVec::new(&mut buffer);
This gives a |
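A minimal sketch of what a borrowed inline buffer could look like, assuming a fixed capacity and no heap fallback. `BorrowedVec` and its methods are hypothetical names for illustration only; this is not the API of smallvec or of any existing crate, and it only loosely corresponds to the `SmallSmallVec` idea in the comment.

```rust
use std::mem::MaybeUninit;

/// Hypothetical vector that borrows its storage instead of owning it.
/// Moving a `BorrowedVec` copies only a slice reference and a length,
/// never the buffer itself.
pub struct BorrowedVec<'a, T> {
    buf: &'a mut [MaybeUninit<T>],
    len: usize,
}

impl<'a, T> BorrowedVec<'a, T> {
    pub fn new(buf: &'a mut [MaybeUninit<T>]) -> Self {
        BorrowedVec { buf, len: 0 }
    }

    /// Appends `value`, or hands it back if the borrowed buffer is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == self.buf.len() {
            return Err(value);
        }
        self.buf[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: the first `len` slots were initialized by `push`.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr() as *const T, self.len) }
    }
}

impl<T> Drop for BorrowedVec<'_, T> {
    fn drop(&mut self) {
        for slot in &mut self.buf[..self.len] {
            // SAFETY: only the initialized prefix is dropped, exactly once.
            unsafe { slot.assume_init_drop() };
        }
    }
}
```

The caller would declare the storage on its own stack frame, for example with `std::array::from_fn(|_| MaybeUninit::uninit())`, and pass `&mut storage` to `BorrowedVec::new`. The trade-off is a lifetime parameter on the vector in exchange for cheap moves.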
#67250 is another tiny win from |
#67340 is another small win from |
At this point I have eliminated most of the worst cases and even those have produced only small wins. I will keep an eye out for additional cases in the future, but I don't think we need an issue open for it any more. |
Cachegrind profiles indicate that the Rust compiler often spends 3-6% of its executed instructions within memcpy (specifically __memcpy_avx_unaligned_erms on my Linux box), which is pretty incredible.

I have modified DHAT to track memcpy/memmove calls and have discovered that a lot are caused by obligation types, such as PendingPredicateObligations and PendingObligations, which are quite large (160 bytes and 136 bytes respectively on my Linux64 machine).

For example, for the keccak benchmark, 33% of the copied bytes occur in the swap call in the compress function:
rust/src/librustc_data_structures/obligation_forest/mod.rs
Lines 607 to 620 in a6624ed
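To illustrate why a swap-based compaction is copy-heavy when the elements are large, here is a simplified sketch. It is not the rustc code referenced above: `Node`, `compress`, and `bytes_written` are hypothetical, and the 160-byte node size is only chosen to match the figure quoted in this issue.

```rust
use std::mem::size_of;

/// Hypothetical 160-byte node: a flag plus 152 bytes of payload.
pub struct Node {
    pub live: bool,
    pub payload: [u64; 19],
}

/// Moves live nodes to the front by swapping them over dead ones, then
/// truncates the dead tail. Returns the number of swaps performed.
pub fn compress(nodes: &mut Vec<Node>) -> usize {
    let mut dead = 0;
    let mut swaps = 0;
    for i in 0..nodes.len() {
        if nodes[i].live {
            if dead > 0 {
                // Slots `i - dead .. i` hold dead nodes at this point.
                nodes.swap(i, i - dead);
                swaps += 1;
            }
        } else {
            dead += 1;
        }
    }
    nodes.truncate(nodes.len() - dead);
    swaps
}

/// Each swap overwrites two node-sized slots.
pub fn bytes_written(swaps: usize) -> usize {
    swaps * 2 * size_of::<Node>()
}
```

With elements this large, every live node that sits after a dead one costs a full-width exchange, so the bytes written grow with both the node size and the number of swaps.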
For serde, 11% of the copied bytes occur constructing this vector of obligations:
rust/src/librustc/ty/wf.rs
Lines 150 to 157 in a6624ed
and 5% occur appending to this vector of obligations:
rust/src/librustc/traits/project.rs
Lines 570 to 574 in ac21131
It also looks like some functions such as FulfillmentContext::register_predicate_obligation() might be passed a PredicateObligation by value (using a memcpy) rather than by reference, though I'm not sure about that.

I have some ideas to shrink these types a little, and improve how they're used, but these changes will be tinkering around the edges. It's possible that more fundamental changes to how the obligation system works could elicit bigger wins.