Commit f1666de

Remove most #[inline] annotations
This commit goes through and deletes almost all `#[inline]` annotations in this crate. It looks like before this commit basically every single function was `#[inline]`, but this is generally not necessary for performance and can have a severe impact on compile times in both debug and release modes, most severely in release mode.

Some `#[inline]` annotations are definitely necessary, however. Most functions in this crate are already candidates for inlining because they're generic, but methods on non-generic types like `Group` and `BitMask` aren't candidates for inlining without `#[inline]`. Additionally LLVM is by no means perfect, so some `#[inline]` may still be necessary to get some further speedups.

The procedure used to generate this commit looked like:

* Remove all `#[inline]` annotations.
* Run `cargo bench`, comparing against the `master` branch, and add `#[inline]` to hot spots as necessary.
* A [PR] was made against rust-lang/rust to [evaluate the impact][run1] on the compiler for more performance data.
* Using this data, `perf diff` was used locally to determine further hot spots and more `#[inline]` annotations were added.
* A [second round of benchmarking][run2] was done.

The numbers are at the point where I think this should land in the crate and get published to move into the standard library. There are up to 20% wins in compile time for hashmap-heavy crates (like Cargo) and milder wins (up to 10%) for a number of other large crates. The regressions are all in the 1-3% range and are largely on benchmarks taking only a handful of milliseconds anyway, which I'd personally say is a worthwhile tradeoff.
For comparison, the benchmarks of this crate before and after this commit look like so:

name                         baseline ns/iter  new ns/iter  diff ns/iter  diff %    speedup
insert_ahash_highbits        7,137             9,044        1,907         26.72%    x 0.79
insert_ahash_random          7,575             9,789        2,214         29.23%    x 0.77
insert_ahash_serial          9,833             9,476        -357          -3.63%    x 1.04
insert_erase_ahash_highbits  15,824            19,164       3,340         21.11%    x 0.83
insert_erase_ahash_random    16,933            20,353       3,420         20.20%    x 0.83
insert_erase_ahash_serial    20,857            27,675       6,818         32.69%    x 0.75
insert_erase_std_highbits    35,117            38,385       3,268         9.31%     x 0.91
insert_erase_std_random      35,357            37,236       1,879         5.31%     x 0.95
insert_erase_std_serial      30,617            34,136       3,519         11.49%    x 0.90
insert_std_highbits          15,675            18,180       2,505         15.98%    x 0.86
insert_std_random            16,566            17,803       1,237         7.47%     x 0.93
insert_std_serial            14,612            16,025       1,413         9.67%     x 0.91
iter_ahash_highbits          1,715             1,640        -75           -4.37%    x 1.05
iter_ahash_random            1,721             1,634        -87           -5.06%    x 1.05
iter_ahash_serial            1,723             1,636        -87           -5.05%    x 1.05
iter_std_highbits            1,715             1,634        -81           -4.72%    x 1.05
iter_std_random              1,715             1,637        -78           -4.55%    x 1.05
iter_std_serial              1,722             1,637        -85           -4.94%    x 1.05
lookup_ahash_highbits        4,565             5,809        1,244         27.25%    x 0.79
lookup_ahash_random          4,632             4,047        -585          -12.63%   x 1.14
lookup_ahash_serial          4,612             4,906        294           6.37%     x 0.94
lookup_fail_ahash_highbits   4,206             3,976        -230          -5.47%    x 1.06
lookup_fail_ahash_random     4,327             4,211        -116          -2.68%    x 1.03
lookup_fail_ahash_serial     8,999             4,386        -4,613        -51.26%   x 2.05
lookup_fail_std_highbits     13,284            13,342       58            0.44%     x 1.00
lookup_fail_std_random       13,172            13,614       442           3.36%     x 0.97
lookup_fail_std_serial       11,240            11,539       299           2.66%     x 0.97
lookup_std_highbits          13,075            13,333       258           1.97%     x 0.98
lookup_std_random            13,257            13,193       -64           -0.48%    x 1.00
lookup_std_serial            10,782            10,917       135           1.25%     x 0.99

The summary of this from what I can tell is that the microbenchmarks are sort of all over the place, but they're neither consistently regressing nor improving, as expected.
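As a reading aid for the benchmark table, the `diff ns/iter`, `diff %` and `speedup` columns appear to be derived from the two timing columns as `new - baseline`, `(new - baseline) / baseline` and `baseline / new`. The sketch below checks that reading against the `insert_ahash_highbits` and `lookup_fail_ahash_serial` rows. The function names are made up for illustration, and the formulas are an inference from the numbers, not something the commit message states.

```rust
/// Difference in ns/iter; positive means the new build is slower.
fn diff_ns(baseline: u64, new: u64) -> i64 {
    new as i64 - baseline as i64
}

/// Difference relative to the baseline, as a percentage.
fn diff_percent(baseline: u64, new: u64) -> f64 {
    diff_ns(baseline, new) as f64 / baseline as f64 * 100.0
}

/// Speedup factor; values below 1.0 are slowdowns.
fn speedup(baseline: u64, new: u64) -> f64 {
    baseline as f64 / new as f64
}

fn main() {
    // (name, baseline ns/iter, new ns/iter), copied from two rows of the table.
    let rows = [
        ("insert_ahash_highbits", 7_137u64, 9_044u64),
        ("lookup_fail_ahash_serial", 8_999, 4_386),
    ];
    for (name, baseline, new) in rows.iter() {
        println!(
            "{} diff={} diff%={:.2} speedup=x {:.2}",
            name,
            diff_ns(*baseline, *new),
            diff_percent(*baseline, *new),
            speedup(*baseline, *new)
        );
    }
}
```

Under this reading, a positive `diff %` and a speedup below 1.0 both mean the post-commit build was slower on that benchmark.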
In general I would be surprised if there's much of a significant performance regression attributed to this commit, and `#[inline]` can always be selectively added back in easily without adding it to every function in the crate.

[PR]: rust-lang/rust#64846
[run1]: rust-lang/rust#64846 (comment)
[run2]: rust-lang/rust#64846 (comment)
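To illustrate the distinction the commit message draws between generic and non-generic code, here is a minimal sketch. `Mask` and `first_match` are hypothetical names, not items from this crate. A generic function is instantiated in the crate that calls it, so its body is available to the optimizer there. A non-generic method in a library is normally compiled once in that library and, without `#[inline]` (or link-time optimization), is typically not inlined across the crate boundary.

```rust
/// Hypothetical stand-in for a small non-generic helper type such as a bit mask.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask(pub u16);

impl Mask {
    /// Non-generic and tiny: the kind of method where keeping `#[inline]`
    /// lets callers in other crates see the body.
    #[inline]
    pub fn lowest_set_bit(self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            Some(self.0.trailing_zeros())
        }
    }
}

/// Generic: monomorphized in the calling crate, so it is already an
/// inlining candidate without any annotation.
pub fn first_match<T: PartialEq>(items: &[T], needle: &T) -> Option<usize> {
    items.iter().position(|item| item == needle)
}

fn main() {
    println!("{:?}", Mask(0b1000).lowest_set_bit());
    println!("{:?}", first_match(&[1, 2, 3], &2));
}
```

Whether a given annotation pays off is still an empirical question, which is why the commit relies on benchmarks and `perf diff` instead of annotating by rule.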
1 parent 7c93e56 commit f1666de

11 files changed, +0 -294 lines changed


src/external_trait_impls/rayon/map.rs

-17
@@ -22,7 +22,6 @@ pub struct ParIter<'a, K, V, S> {
 impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParIter<'a, K, V, S> {
     type Item = (&'a K, &'a V);

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -39,7 +38,6 @@ impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParIter<'a, K, V, S> {
 }

 impl<K, V, S> Clone for ParIter<'_, K, V, S> {
-    #[inline]
     fn clone(&self) -> Self {
         ParIter { map: self.map }
     }
@@ -65,7 +63,6 @@ pub struct ParKeys<'a, K, V, S> {
 impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParKeys<'a, K, V, S> {
     type Item = &'a K;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -79,7 +76,6 @@ impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParKeys<'a, K, V, S> {
 }

 impl<K, V, S> Clone for ParKeys<'_, K, V, S> {
-    #[inline]
     fn clone(&self) -> Self {
         ParKeys { map: self.map }
     }
@@ -105,7 +101,6 @@ pub struct ParValues<'a, K, V, S> {
 impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParValues<'a, K, V, S> {
     type Item = &'a V;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -119,7 +114,6 @@ impl<'a, K: Sync, V: Sync, S: Sync> ParallelIterator for ParValues<'a, K, V, S>
 }

 impl<K, V, S> Clone for ParValues<'_, K, V, S> {
-    #[inline]
     fn clone(&self) -> Self {
         ParValues { map: self.map }
     }
@@ -147,7 +141,6 @@ pub struct ParIterMut<'a, K, V, S> {
 impl<'a, K: Send + Sync, V: Send, S: Send> ParallelIterator for ParIterMut<'a, K, V, S> {
     type Item = (&'a K, &'a mut V);

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -185,7 +178,6 @@ pub struct ParValuesMut<'a, K, V, S> {
 impl<'a, K: Send, V: Send, S: Send> ParallelIterator for ParValuesMut<'a, K, V, S> {
     type Item = &'a mut V;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -220,7 +212,6 @@ pub struct IntoParIter<K, V, S> {
 impl<K: Send, V: Send, S: Send> ParallelIterator for IntoParIter<K, V, S> {
     type Item = (K, V);

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -249,7 +240,6 @@ pub struct ParDrain<'a, K, V, S> {
 impl<K: Send, V: Send, S: Send> ParallelIterator for ParDrain<'_, K, V, S> {
     type Item = (K, V);

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -268,28 +258,24 @@ impl<K: fmt::Debug + Eq + Hash, V: fmt::Debug, S: BuildHasher> fmt::Debug

 impl<K: Sync, V: Sync, S: Sync> HashMap<K, V, S> {
     /// Visits (potentially in parallel) immutably borrowed keys in an arbitrary order.
-    #[inline]
     pub fn par_keys(&self) -> ParKeys<'_, K, V, S> {
         ParKeys { map: self }
     }

     /// Visits (potentially in parallel) immutably borrowed values in an arbitrary order.
-    #[inline]
     pub fn par_values(&self) -> ParValues<'_, K, V, S> {
         ParValues { map: self }
     }
 }

 impl<K: Send, V: Send, S: Send> HashMap<K, V, S> {
     /// Visits (potentially in parallel) mutably borrowed values in an arbitrary order.
-    #[inline]
     pub fn par_values_mut(&mut self) -> ParValuesMut<'_, K, V, S> {
         ParValuesMut { map: self }
     }

     /// Consumes (potentially in parallel) all values in an arbitrary order,
     /// while preserving the map's allocated memory for reuse.
-    #[inline]
     pub fn par_drain(&mut self) -> ParDrain<'_, K, V, S> {
         ParDrain { map: self }
     }
@@ -317,7 +303,6 @@ impl<K: Send, V: Send, S: Send> IntoParallelIterator for HashMap<K, V, S> {
     type Item = (K, V);
     type Iter = IntoParIter<K, V, S>;

-    #[inline]
     fn into_par_iter(self) -> Self::Iter {
         IntoParIter { map: self }
     }
@@ -327,7 +312,6 @@ impl<'a, K: Sync, V: Sync, S: Sync> IntoParallelIterator for &'a HashMap<K, V, S
     type Item = (&'a K, &'a V);
     type Iter = ParIter<'a, K, V, S>;

-    #[inline]
     fn into_par_iter(self) -> Self::Iter {
         ParIter { map: self }
     }
@@ -337,7 +321,6 @@ impl<'a, K: Send + Sync, V: Send, S: Send> IntoParallelIterator for &'a mut Hash
     type Item = (&'a K, &'a mut V);
     type Iter = ParIterMut<'a, K, V, S>;

-    #[inline]
     fn into_par_iter(self) -> Self::Iter {
         ParIterMut { map: self }
     }

src/external_trait_impls/rayon/raw.rs

-11
@@ -18,7 +18,6 @@ pub struct RawParIter<T> {
 impl<T> ParallelIterator for RawParIter<T> {
     type Item = Bucket<T>;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -36,15 +35,13 @@ struct ParIterProducer<T> {
 impl<T> UnindexedProducer for ParIterProducer<T> {
     type Item = Bucket<T>;

-    #[inline]
     fn split(self) -> (Self, Option<Self>) {
         let (left, right) = self.iter.split();
         let left = ParIterProducer { iter: left };
         let right = right.map(|right| ParIterProducer { iter: right });
         (left, right)
     }

-    #[inline]
     fn fold_with<F>(self, folder: F) -> F
     where
         F: Folder<Self::Item>,
@@ -61,7 +58,6 @@ pub struct RawIntoParIter<T> {
 impl<T: Send> ParallelIterator for RawIntoParIter<T> {
     type Item = T;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -92,7 +88,6 @@ unsafe impl<T> Send for RawParDrain<'_, T> {}
 impl<T: Send> ParallelIterator for RawParDrain<'_, T> {
     type Item = T;

-    #[inline]
     fn drive_unindexed<C>(self, consumer: C) -> C::Result
     where
         C: UnindexedConsumer<Self::Item>,
@@ -123,7 +118,6 @@ struct ParDrainProducer<T> {
 impl<T: Send> UnindexedProducer for ParDrainProducer<T> {
     type Item = T;

-    #[inline]
     fn split(self) -> (Self, Option<Self>) {
         let (left, right) = self.iter.clone().split();
         mem::forget(self);
@@ -132,7 +126,6 @@ impl<T: Send> UnindexedProducer for ParDrainProducer<T> {
         (left, right)
     }

-    #[inline]
     fn fold_with<F>(mut self, mut folder: F) -> F
     where
         F: Folder<Self::Item>,
@@ -153,7 +146,6 @@ impl<T: Send> UnindexedProducer for ParDrainProducer<T> {
 }

 impl<T> Drop for ParDrainProducer<T> {
-    #[inline]
     fn drop(&mut self) {
         // Drop all remaining elements
         if mem::needs_drop::<T>() {
@@ -168,22 +160,19 @@ impl<T> Drop for ParDrainProducer<T> {

 impl<T> RawTable<T> {
     /// Returns a parallel iterator over the elements in a `RawTable`.
-    #[inline]
     pub fn par_iter(&self) -> RawParIter<T> {
         RawParIter {
             iter: unsafe { self.iter().iter },
         }
     }

     /// Returns a parallel iterator over the elements in a `RawTable`.
-    #[inline]
     pub fn into_par_iter(self) -> RawIntoParIter<T> {
         RawIntoParIter { table: self }
     }

     /// Returns a parallel iterator which consumes all elements of a `RawTable`
     /// without freeing its memory allocation.
-    #[inline]
     pub fn par_drain(&mut self) -> RawParDrain<'_, T> {
         RawParDrain {
             table: NonNull::from(self),

src/external_trait_impls/rayon/set.rs

-7
@@ -214,14 +214,12 @@ where
 {
     /// Visits (potentially in parallel) the values representing the difference,
     /// i.e. the values that are in `self` but not in `other`.
-    #[inline]
     pub fn par_difference<'a>(&'a self, other: &'a Self) -> ParDifference<'a, T, S> {
         ParDifference { a: self, b: other }
     }

     /// Visits (potentially in parallel) the values representing the symmetric
     /// difference, i.e. the values that are in `self` or in `other` but not in both.
-    #[inline]
     pub fn par_symmetric_difference<'a>(
         &'a self,
         other: &'a Self,
@@ -231,14 +229,12 @@ where

     /// Visits (potentially in parallel) the values representing the
     /// intersection, i.e. the values that are both in `self` and `other`.
-    #[inline]
     pub fn par_intersection<'a>(&'a self, other: &'a Self) -> ParIntersection<'a, T, S> {
         ParIntersection { a: self, b: other }
     }

     /// Visits (potentially in parallel) the values representing the union,
     /// i.e. all the values in `self` or `other`, without duplicates.
-    #[inline]
     pub fn par_union<'a>(&'a self, other: &'a Self) -> ParUnion<'a, T, S> {
         ParUnion { a: self, b: other }
     }
@@ -287,7 +283,6 @@ where
 {
     /// Consumes (potentially in parallel) all values in an arbitrary order,
     /// while preserving the set's allocated memory for reuse.
-    #[inline]
     pub fn par_drain(&mut self) -> ParDrain<'_, T, S> {
         ParDrain { set: self }
     }
@@ -297,7 +292,6 @@ impl<T: Send, S: Send> IntoParallelIterator for HashSet<T, S> {
     type Item = T;
     type Iter = IntoParIter<T, S>;

-    #[inline]
     fn into_par_iter(self) -> Self::Iter {
         IntoParIter { set: self }
     }
@@ -307,7 +301,6 @@ impl<'a, T: Sync, S: Sync> IntoParallelIterator for &'a HashSet<T, S> {
     type Item = &'a T;
     type Iter = ParIter<'a, T, S>;

-    #[inline]
     fn into_par_iter(self) -> Self::Iter {
         ParIter { set: self }
     }
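For readers unfamiliar with the four set operations documented in the `set.rs` doc comments, the sketch below shows the same semantics using the sequential methods on `std::collections::HashSet`. This is an analogy only: it does not use rayon or this crate's `par_*` methods, and the wrapper functions are hypothetical helpers added so the results can be sorted and compared.

```rust
use std::collections::HashSet;

/// Copies borrowed values into a sorted `Vec` so results are deterministic.
fn sorted<'a>(values: impl Iterator<Item = &'a i32>) -> Vec<i32> {
    let mut out: Vec<i32> = values.copied().collect();
    out.sort_unstable();
    out
}

/// Values in `a` but not in `b`.
fn difference(a: &HashSet<i32>, b: &HashSet<i32>) -> Vec<i32> {
    sorted(a.difference(b))
}

/// Values in `a` or in `b` but not in both.
fn symmetric_difference(a: &HashSet<i32>, b: &HashSet<i32>) -> Vec<i32> {
    sorted(a.symmetric_difference(b))
}

/// Values in both `a` and `b`.
fn intersection(a: &HashSet<i32>, b: &HashSet<i32>) -> Vec<i32> {
    sorted(a.intersection(b))
}

/// All values in `a` or `b`, without duplicates.
fn union(a: &HashSet<i32>, b: &HashSet<i32>) -> Vec<i32> {
    sorted(a.union(b))
}

fn main() {
    let a: HashSet<i32> = [1, 2, 3].iter().copied().collect();
    let b: HashSet<i32> = [2, 3, 4].iter().copied().collect();
    println!("difference: {:?}", difference(&a, &b));
    println!("symmetric difference: {:?}", symmetric_difference(&a, &b));
    println!("intersection: {:?}", intersection(&a, &b));
    println!("union: {:?}", union(&a, &b));
}
```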

src/external_trait_impls/serde.rs

-6
@@ -4,7 +4,6 @@ mod size_hint {
     /// This presumably exists to prevent denial of service attacks.
     ///
     /// Original discussion: https://github.com/serde-rs/serde/issues/1114.
-    #[inline]
     pub(super) fn cautious(hint: Option<usize>) -> usize {
         cmp::min(hint.unwrap_or(0), 4096)
     }
@@ -27,7 +26,6 @@ mod map {
         V: Serialize,
         H: BuildHasher,
     {
-        #[inline]
         fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
         where
             S: Serializer,
@@ -62,7 +60,6 @@ mod map {
                     formatter.write_str("a map")
                 }

-                #[inline]
                 fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
                 where
                     A: MapAccess<'de>,
@@ -104,7 +101,6 @@ mod set {
         T: Serialize + Eq + Hash,
         H: BuildHasher,
     {
-        #[inline]
         fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
         where
             S: Serializer,
@@ -137,7 +133,6 @@ mod set {
                     formatter.write_str("a sequence")
                 }

-                #[inline]
                 fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
                 where
                     A: SeqAccess<'de>,
@@ -178,7 +173,6 @@ mod set {
                     formatter.write_str("a sequence")
                 }

-                #[inline]
                 fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
                 where
                     A: SeqAccess<'de>,
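The `cautious` helper in the first `serde.rs` hunk caps a deserializer-supplied size hint at 4096 before it is used, which matches the denial-of-service concern mentioned in its doc comment. The standalone sketch below copies the function body from the diff and shows its effect on a few inputs. Using the result with `Vec::with_capacity` is an illustrative assumption about how such a hint might be consumed, not code from this crate.

```rust
use std::cmp;

/// Same body as the `cautious` helper in the diff: a missing hint becomes 0,
/// and any hint is clamped to at most 4096.
fn cautious(hint: Option<usize>) -> usize {
    cmp::min(hint.unwrap_or(0), 4096)
}

fn main() {
    // An untrusted input claims an enormous element count; only a bounded
    // amount is preallocated, and the collection can still grow on demand.
    let claimed = Some(usize::MAX);
    let buffer: Vec<u8> = Vec::with_capacity(cautious(claimed));
    println!("preallocated at least {} slots", buffer.capacity());

    println!("no hint -> {}", cautious(None));
    println!("small hint -> {}", cautious(Some(10)));
}
```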
