Cachegrind profiles indicate that the Rust compiler often spends 3-6% of its executed instructions within `memcpy` (specifically `__memcpy_avx_unaligned_erms` on my Linux box), which is pretty incredible.
I have modified DHAT to track `memcpy`/`memmove` calls and have discovered that a lot are caused by obligation types, such as `PendingPredicateObligation`s and `PendingObligation`s, which are quite large (160 bytes and 136 bytes respectively on my Linux64 machine).
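To make the size concern concrete, here is a minimal sketch using `std::mem::size_of`. `BigObligation` and `bytes_per_move` are hypothetical names, and the fields are chosen only so the total comes to 160 bytes; they do not mirror the real rustc definitions.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a large obligation-like type. The field names
/// and sizes are illustrative only and are not the real rustc layout.
#[allow(dead_code)]
pub struct BigObligation {
    cause: [u64; 12],     // 96 bytes
    param_env: [u64; 4],  // 32 bytes
    predicate: [u64; 3],  // 24 bytes
    recursion_depth: u64, // 8 bytes
}

/// Number of bytes that must be copied each time one `T` is moved by value.
pub fn bytes_per_move<T>() -> usize {
    size_of::<T>()
}
```

Under these assumptions, `bytes_per_move::<BigObligation>()` is 160, while `bytes_per_move::<Box<BigObligation>>()` is just the size of a pointer, which is one reason boxing or shrinking such a type can reduce `memcpy` traffic.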
For example, for the `keccak` benchmark, 33% of the copied bytes occur in the `swap` call in the `compress` function:
rust/src/librustc_data_structures/obligation_forest/mod.rs
Lines 607 to 620 in a6624ed
// Now move all popped nodes to the end. Try to keep the order.
//
// LOOP INVARIANT:
//     self.nodes[0..i - dead_nodes] are the first remaining nodes
//     self.nodes[i - dead_nodes..i] are all dead
//     self.nodes[i..] are unchanged
for i in 0..self.nodes.len() {
    match self.nodes[i].state.get() {
        NodeState::Pending | NodeState::Waiting => {
            if dead_nodes > 0 {
                self.nodes.swap(i, i - dead_nodes);
                node_rewrites[i] -= dead_nodes;
            }
        }
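The loop above can be sketched in isolation as follows. This is a simplified model, not the real `ObligationForest`: `Node`, `State`, `compress_with_swap`, and `compress_with_retain` are hypothetical names, and the `node_rewrites` bookkeeping is omitted. It shows why each surviving node that follows a dead one costs a full element swap, with `Vec::retain` included only as a point of comparison.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum State {
    Pending,
    Waiting,
    Done,
}

/// Hypothetical node with a bulky payload, standing in for a large obligation.
pub struct Node {
    pub id: u32,
    pub state: State,
    pub _payload: [u64; 16],
}

/// Swap-based compaction in the style of the excerpt: every live node that
/// follows at least one dead node is swapped towards the front, so each such
/// step shuffles two whole `Node` values. Returns the number of removed nodes.
pub fn compress_with_swap(nodes: &mut Vec<Node>) -> usize {
    let mut dead = 0;
    for i in 0..nodes.len() {
        match nodes[i].state {
            State::Pending | State::Waiting => {
                if dead > 0 {
                    nodes.swap(i, i - dead);
                }
            }
            State::Done => dead += 1,
        }
    }
    let live = nodes.len() - dead;
    nodes.truncate(live);
    dead
}

/// Comparison version: `Vec::retain` also keeps the relative order of the
/// surviving elements, but shifts each survivor once instead of swapping.
pub fn compress_with_retain(nodes: &mut Vec<Node>) -> usize {
    let before = nodes.len();
    nodes.retain(|n| n.state != State::Done);
    before - nodes.len()
}
```

Both functions leave the live nodes in their original relative order. Whether a `retain`-style pass would fit the real code is an open question, since the real loop also has to record index rewrites for each node.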
For `serde`, 11% of the copied bytes occur constructing this vector of obligations:
Lines 150 to 157 in a6624ed
self.out.iter()
    .inspect(|pred| assert!(!pred.has_escaping_bound_vars()))
    .flat_map(|pred| {
        let mut selcx = traits::SelectionContext::new(infcx);
        let pred = traits::normalize(&mut selcx, param_env, cause.clone(), pred);
        once(pred.value).chain(pred.obligations)
    })
    .collect()
and 5% occur appending to this vector of obligations:
rust/src/librustc/traits/project.rs
Lines 570 to 574 in ac21131
obligations.push(get_paranoid_cache_value_obligation(infcx,
                                                     param_env,
                                                     projection_ty,
                                                     cause,
                                                     depth));
It also looks like some functions such as `FulfillmentContext::register_predicate_obligation()` might be passed a `PredicateObligation` by value (using a `memcpy`) rather than by reference, though I'm not sure about that.
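The by-value versus by-reference distinction can be illustrated with a small sketch. `Obligation` here is a hypothetical 160-byte stand-in and the three functions are invented for illustration; whether the compiler actually emits a `memcpy` for the by-value call depends on optimization and inlining, so this shows only what each signature permits.

```rust
/// Hypothetical 160-byte stand-in for an obligation; not the real rustc type.
pub struct Obligation {
    pub words: [u64; 20],
}

/// Takes ownership: the caller's value is moved into the callee, which the
/// compiler may implement by copying all 160 bytes.
pub fn first_word_by_value(obligation: Obligation) -> u64 {
    obligation.words[0]
}

/// Borrows: only a pointer is passed, so the payload itself is not copied.
pub fn first_word_by_ref(obligation: &Obligation) -> u64 {
    obligation.words[0]
}

/// Takes ownership of a boxed value: only the pointer-sized `Box` moves.
pub fn first_word_boxed(obligation: Box<Obligation>) -> u64 {
    obligation.words[0]
}
```

A reference only helps when the callee does not need to keep the value. If the callee stores the obligation in a collection, the bytes have to land there at some point, and boxing is the usual way to make that move cheap.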
I have some ideas to shrink these types a little, and improve how they're used, but these changes will be tinkering around the edges. It's possible that more fundamental changes to how the obligation system works could elicit bigger wins.