feat(span_concentrator): replace std map with hashbrown for better perf #1234

paullegranddc · 2025-09-19T16:50:54Z

Motivations

When the drop rate is high, creating the aggregation key and checking whether it already exists is the most expensive operation data-pipeline performs.

This operation is currently suboptimal: because of limitations in the std HashMap, key equality has to go through virtual dispatch.
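
To make the limitation concrete, here is a minimal sketch of the usual std-only workaround; it is an illustration, not the code touched by this PR. The std HashMap only supports lookups through Borrow, which must return a reference, so a composite key with borrowed fields can only be looked up through a trait object. OwnedKey, BorrowedKey, KeyView and record are hypothetical, simplified stand-ins for the real aggregation key types.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for the owned key stored in the map.
#[derive(Hash, PartialEq, Eq)]
struct OwnedKey {
    service: String,
    resource: String,
}

// Hypothetical borrowed flavour of the same key, built from a span without allocating.
// Same field order as `OwnedKey`, so the derived hashes agree.
#[derive(Hash, PartialEq, Eq)]
struct BorrowedKey<'a> {
    service: &'a str,
    resource: &'a str,
}

// Common view implemented by both key flavours.
trait KeyView {
    fn view(&self) -> BorrowedKey<'_>;
}

impl KeyView for OwnedKey {
    fn view(&self) -> BorrowedKey<'_> {
        BorrowedKey { service: &self.service, resource: &self.resource }
    }
}

impl KeyView for BorrowedKey<'_> {
    fn view(&self) -> BorrowedKey<'_> {
        BorrowedKey { service: self.service, resource: self.resource }
    }
}

// `Borrow` has to hand out a reference, so the owned key is exposed as a trait object.
impl<'a> Borrow<dyn KeyView + 'a> for OwnedKey {
    fn borrow(&self) -> &(dyn KeyView + 'a) {
        self
    }
}

// Hashing and equality on the trait object call `view()` through the vtable.
impl Hash for dyn KeyView + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.view().hash(state)
    }
}

impl PartialEq for dyn KeyView + '_ {
    fn eq(&self, other: &Self) -> bool {
        self.view() == other.view()
    }
}

impl Eq for dyn KeyView + '_ {}

// Increments the counter for `key`, allocating an owned key only on a miss.
fn record(map: &mut HashMap<OwnedKey, u64>, key: BorrowedKey<'_>) {
    // First probe: hash and equality go through `dyn KeyView`.
    if let Some(count) = map.get_mut(&key as &dyn KeyView) {
        *count += 1;
        return;
    }
    // Second probe: on a miss the map is searched again by `insert`.
    let owned = OwnedKey {
        service: key.service.to_owned(),
        resource: key.resource.to_owned(),
    };
    map.insert(owned, 1);
}
```

In this sketch every hit pays for the vtable calls, and every miss probes the map twice, which are the two costs the PR sets out to remove.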

Changing to hashbrown gives us:

A more efficient hashing function
We can implement the Equivalent trait instead of the Borrow trait to compare owned and borrowed keys, which performs better
We can drop the Cow and just use either String or &str
We can use the entry API on the hashmap, so we avoid a potential second lookup

These changes bring a 25% performance improvement on the span concentrator benchmarks locally (though only about 10% in CI).

Changes

Use hashbrown::HashMap instead of std::collections::HashMap in the stats collector
Replace AggregationKey<'static> with OwnedAggregationKey
Implement hashbrown::Equivalent instead of Borrow

pr-commenter · 2025-09-19T17:02:05Z

Benchmarks

Comparison

Benchmark execution time: 2025-09-25 16:00:21

Comparing candidate commit dc2c080 in PR branch paullgdc/data-pipeline/stats_collector_hmap with baseline commit f61c42a in branch main.

Found 1 performance improvement and 0 performance regressions. Performance is unchanged for 52 metrics; 2 metrics are unstable.

scenario:concentrator/add_spans_to_concentrator

🟩 execution_time [-1.209ms; -1.202ms] or [-10.298%; -10.245%]

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	60.380ms	60.832ms ± 2.225ms	60.546ms ± 0.056ms	60.601ms	60.793ms	68.840ms	83.143ms	37.32%	8.995	82.192	3.65%	0.157ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[60.523ms; 61.140ms] or [-0.507%; +0.507%]	None	None	None

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.192µs	3.234µs ± 1.412µs	3.026µs ± 0.024µs	3.048µs	3.685µs	13.879µs	14.782µs	388.50%	7.317	54.826	43.54%	0.100µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[3.039µs; 3.430µs] or [-6.049%; +6.049%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.337µs	2.393µs ± 0.015µs	2.393µs ± 0.005µs	2.398µs	2.417µs	2.425µs	2.441µs	2.00%	-1.021	3.383	0.63%	0.001µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.391µs; 2.395µs] or [-0.087%; +0.087%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching serializing traces from their internal representation to msgpack	execution_time	14.368ms	14.434ms ± 0.037ms	14.428ms ± 0.020ms	14.451ms	14.495ms	14.557ms	14.623ms	1.35%	1.432	4.311	0.25%	0.003ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching serializing traces from their internal representation to msgpack	execution_time	[14.429ms; 14.439ms] or [-0.035%; +0.035%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
receiver_entry_point/report/2597	execution_time	6.176ms	6.236ms ± 0.041ms	6.225ms ± 0.017ms	6.247ms	6.329ms	6.380ms	6.429ms	3.28%	1.913	4.449	0.66%	0.003ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
receiver_entry_point/report/2597	execution_time	[6.230ms; 6.241ms] or [-0.091%; +0.091%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
concentrator/add_spans_to_concentrator	execution_time	10.500ms	10.531ms ± 0.016ms	10.529ms ± 0.009ms	10.539ms	10.560ms	10.574ms	10.618ms	0.85%	1.204	3.871	0.15%	0.001ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
concentrator/add_spans_to_concentrator	execution_time	[10.529ms; 10.533ms] or [-0.021%; +0.021%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	17.934µs	26.383µs ± 10.079µs	18.197µs ± 0.153µs	35.062µs	43.608µs	48.484µs	71.559µs	293.25%	0.892	0.567	38.11%	0.713µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[24.986µs; 27.780µs] or [-5.295%; +5.295%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	3.894µs	3.912µs ± 0.003µs	3.912µs ± 0.002µs	3.914µs	3.916µs	3.917µs	3.923µs	0.29%	-0.913	9.542	0.07%	0.000µs	1	200
credit_card/is_card_number/	throughput	254896896.349op/s	255632034.475op/s ± 179428.019op/s	255642953.275op/s ± 113531.276op/s	255754219.896op/s	255846630.822op/s	255901902.286op/s	256832941.471op/s	0.47%	0.936	9.676	0.07%	12687.477op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	76.624µs	78.407µs ± 0.748µs	78.423µs ± 0.499µs	78.871µs	79.587µs	80.223µs	80.484µs	2.63%	0.065	-0.189	0.95%	0.053µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	12424879.229op/s	12755164.242op/s ± 121615.078op/s	12751401.348op/s ± 81217.349op/s	12840800.725op/s	12971841.072op/s	13011077.660op/s	13050733.298op/s	2.35%	-0.014	-0.218	0.95%	8599.485op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	69.918µs	70.948µs ± 0.574µs	70.873µs ± 0.346µs	71.287µs	71.937µs	72.495µs	72.636µs	2.49%	0.539	-0.051	0.81%	0.041µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	13767354.556op/s	14095803.182op/s ± 113596.036op/s	14109723.756op/s ± 69216.840op/s	14174161.341op/s	14279920.265op/s	14301717.270op/s	14302501.974op/s	1.37%	-0.499	-0.105	0.80%	8032.453op/s	1	200
credit_card/is_card_number/37828224631	execution_time	3.893µs	3.911µs ± 0.002µs	3.911µs ± 0.001µs	3.912µs	3.915µs	3.917µs	3.917µs	0.17%	-1.647	16.133	0.06%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	255275162.356op/s	255693481.023op/s ± 153368.779op/s	255714278.717op/s ± 80420.238op/s	255781310.583op/s	255865305.457op/s	255918060.609op/s	256868416.239op/s	0.45%	1.675	16.361	0.06%	10844.810op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	66.719µs	67.970µs ± 0.644µs	67.912µs ± 0.441µs	68.367µs	69.067µs	69.447µs	70.038µs	3.13%	0.365	-0.197	0.95%	0.046µs	1	200
credit_card/is_card_number/378282246310005	throughput	14277946.862op/s	14713674.559op/s ± 139040.788op/s	14725032.669op/s ± 95073.698op/s	14812040.056op/s	14936253.330op/s	14976449.141op/s	14988290.398op/s	1.79%	-0.318	-0.261	0.94%	9831.668op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	52.143µs	52.221µs ± 0.030µs	52.225µs ± 0.016µs	52.238µs	52.266µs	52.298µs	52.323µs	0.19%	0.174	0.613	0.06%	0.002µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	19112156.705op/s	19149492.707op/s ± 11072.381op/s	19147967.314op/s ± 5873.928op/s	19156175.937op/s	19168336.879op/s	19172751.706op/s	19178182.899op/s	0.16%	-0.169	0.606	0.06%	782.936op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	6.026µs	6.039µs ± 0.016µs	6.035µs ± 0.003µs	6.038µs	6.065µs	6.114µs	6.166µs	2.16%	4.882	28.283	0.27%	0.001µs	1	200
credit_card/is_card_number/x371413321323331	throughput	162181091.975op/s	165604086.035op/s ± 434719.734op/s	165691234.934op/s ± 89500.287op/s	165784143.797op/s	165861988.282op/s	165910789.825op/s	165946347.353op/s	0.15%	-4.831	27.693	0.26%	30739.327op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	3.892µs	3.911µs ± 0.003µs	3.911µs ± 0.002µs	3.913µs	3.917µs	3.919µs	3.919µs	0.19%	-0.944	8.225	0.08%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	255177040.174op/s	255660537.948op/s ± 197790.133op/s	255669485.858op/s ± 125839.121op/s	255794362.545op/s	255898990.894op/s	255954997.749op/s	256965219.652op/s	0.51%	0.966	8.374	0.08%	13985.874op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	64.106µs	64.404µs ± 0.153µs	64.379µs ± 0.101µs	64.509µs	64.685µs	64.814µs	64.845µs	0.72%	0.577	0.049	0.24%	0.011µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	15421497.706op/s	15527142.413op/s ± 36775.170op/s	15533101.266op/s ± 24295.591op/s	15552606.176op/s	15581235.675op/s	15595812.422op/s	15599263.715op/s	0.43%	-0.564	0.031	0.24%	2600.397op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	57.779µs	57.987µs ± 0.144µs	57.948µs ± 0.065µs	58.054µs	58.256µs	58.435µs	58.665µs	1.24%	1.701	4.053	0.25%	0.010µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	17045843.071op/s	17245385.207op/s ± 42778.997op/s	17256857.002op/s ± 19498.177op/s	17272523.739op/s	17294725.973op/s	17300613.191op/s	17307244.813op/s	0.29%	-1.678	3.929	0.25%	3024.932op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	3.893µs	3.913µs ± 0.003µs	3.913µs ± 0.002µs	3.914µs	3.917µs	3.919µs	3.922µs	0.24%	-1.124	8.481	0.07%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	254977038.106op/s	255588473.129op/s ± 189233.994op/s	255587081.627op/s ± 110862.712op/s	255705446.536op/s	255826897.834op/s	255912377.591op/s	256841093.611op/s	0.49%	1.145	8.619	0.07%	13380.864op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	54.576µs	54.854µs ± 0.221µs	54.786µs ± 0.120µs	54.968µs	55.305µs	55.446µs	55.632µs	1.55%	1.162	0.791	0.40%	0.016µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	17975241.315op/s	18230615.668op/s ± 73245.247op/s	18253003.082op/s ± 40215.889op/s	18287990.734op/s	18306133.654op/s	18318305.851op/s	18323043.059op/s	0.38%	-1.144	0.735	0.40%	5179.221op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	52.143µs	52.203µs ± 0.036µs	52.197µs ± 0.017µs	52.215µs	52.275µs	52.310µs	52.376µs	0.34%	1.523	3.275	0.07%	0.003µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	19092706.626op/s	19155871.578op/s ± 13235.196op/s	19158061.089op/s ± 6392.457op/s	19164291.655op/s	19171815.514op/s	19174461.845op/s	19178037.312op/s	0.10%	-1.517	3.246	0.07%	935.870op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	6.027µs	6.036µs ± 0.010µs	6.034µs ± 0.002µs	6.036µs	6.051µs	6.073µs	6.108µs	1.23%	4.028	19.968	0.16%	0.001µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	163723189.834op/s	165679025.728op/s ± 264144.097op/s	165738542.499op/s ± 68532.351op/s	165798820.242op/s	165866290.088op/s	165908853.041op/s	165930626.978op/s	0.12%	-4.001	19.672	0.16%	18677.808op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[3.911µs; 3.912µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/	throughput	[255607167.477op/s; 255656901.473op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[78.303µs; 78.510µs] or [-0.132%; +0.132%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[12738309.562op/s; 12772018.923op/s] or [-0.132%; +0.132%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[70.868µs; 71.027µs] or [-0.112%; +0.112%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[14080059.864op/s; 14111546.500op/s] or [-0.112%; +0.112%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[3.911µs; 3.911µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[255672225.586op/s; 255714736.461op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[67.881µs; 68.059µs] or [-0.131%; +0.131%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[14694404.843op/s; 14732944.275op/s] or [-0.131%; +0.131%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[52.217µs; 52.225µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[19147958.181op/s; 19151027.232op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[6.036µs; 6.041µs] or [-0.037%; +0.037%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[165543838.061op/s; 165664334.009op/s] or [-0.036%; +0.036%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[3.911µs; 3.912µs] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[255633126.138op/s; 255687949.758op/s] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[64.383µs; 64.425µs] or [-0.033%; +0.033%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[15522045.728op/s; 15532239.098op/s] or [-0.033%; +0.033%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[57.967µs; 58.007µs] or [-0.035%; +0.035%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[17239456.449op/s; 17251313.964op/s] or [-0.034%; +0.034%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[3.912µs; 3.913µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[255562247.117op/s; 255614699.140op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[54.823µs; 54.884µs] or [-0.056%; +0.056%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[18220464.581op/s; 18240766.754op/s] or [-0.056%; +0.056%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[52.198µs; 52.208µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[19154037.307op/s; 19157705.849op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[6.034µs; 6.037µs] or [-0.022%; +0.022%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[165642417.897op/s; 165715633.560op/s] or [-0.022%; +0.022%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	241.161ns	253.062ns ± 13.012ns	246.562ns ± 4.684ns	255.621ns	284.045ns	286.692ns	287.075ns	16.43%	1.343	0.615	5.13%	0.920ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[251.259ns; 254.866ns] or [-0.713%; +0.713%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	186.468µs	187.106µs ± 0.692µs	186.822µs ± 0.166µs	187.101µs	188.311µs	189.381µs	190.325µs	1.88%	1.985	4.054	0.37%	0.049µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	5254166.357op/s	5344639.212op/s ± 19634.319op/s	5352691.879op/s ± 4761.617op/s	5356468.636op/s	5360047.560op/s	5361757.879op/s	5362838.239op/s	0.19%	-1.962	3.914	0.37%	1388.356op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	17.572µs	17.655µs ± 0.039µs	17.652µs ± 0.025µs	17.678µs	17.728µs	17.755µs	17.792µs	0.79%	0.506	0.329	0.22%	0.003µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	56204693.748op/s	56641371.636op/s ± 124114.371op/s	56651285.891op/s ± 78685.128op/s	56728797.701op/s	56834866.316op/s	56879698.349op/s	56908772.392op/s	0.45%	-0.493	0.306	0.22%	8776.211op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	9.943µs	10.044µs ± 0.043µs	10.045µs ± 0.027µs	10.064µs	10.109µs	10.146µs	10.305µs	2.59%	1.263	5.786	0.43%	0.003µs	1	200
normalization/normalize_name/normalize_name/good	throughput	97039986.356op/s	99562777.680op/s ± 425471.113op/s	99554343.498op/s ± 263408.647op/s	99850564.589op/s	100172361.415op/s	100271457.310op/s	100577865.555op/s	1.03%	-1.186	5.295	0.43%	30085.351op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[187.010µs; 187.202µs] or [-0.051%; +0.051%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[5341918.085op/s; 5347360.340op/s] or [-0.051%; +0.051%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[17.650µs; 17.660µs] or [-0.030%; +0.030%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[56624170.578op/s; 56658572.694op/s] or [-0.030%; +0.030%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[10.038µs; 10.050µs] or [-0.060%; +0.060%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[99503811.476op/s; 99621743.884op/s] or [-0.059%; +0.059%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	534.168µs	535.113µs ± 0.465µs	535.052µs ± 0.298µs	535.361µs	536.056µs	536.308µs	536.950µs	0.35%	0.857	0.887	0.09%	0.033µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1862371.410op/s	1868766.630op/s ± 1622.577op/s	1868976.810op/s ± 1041.208op/s	1869980.117op/s	1870936.299op/s	1871428.350op/s	1872068.973op/s	0.17%	-0.851	0.872	0.09%	114.734op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	379.760µs	380.556µs ± 0.480µs	380.464µs ± 0.260µs	380.786µs	381.159µs	381.587µs	384.699µs	1.11%	3.933	28.996	0.13%	0.034µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2599433.414op/s	2627740.343op/s ± 3296.630op/s	2628371.222op/s ± 1797.190op/s	2629592.462op/s	2631209.238op/s	2632446.287op/s	2633239.636op/s	0.19%	-3.875	28.351	0.13%	233.107op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	194.434µs	194.904µs ± 0.208µs	194.901µs ± 0.164µs	195.067µs	195.239µs	195.411µs	195.434µs	0.27%	0.116	-0.550	0.11%	0.015µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5116812.108op/s	5130748.755op/s ± 5464.893op/s	5130816.929op/s ± 4319.941op/s	5135060.711op/s	5139379.534op/s	5140878.985op/s	5143124.353op/s	0.24%	-0.112	-0.552	0.11%	386.426op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	36.415µs	36.558µs ± 0.050µs	36.558µs ± 0.031µs	36.588µs	36.632µs	36.680µs	36.764µs	0.56%	0.234	1.197	0.14%	0.004µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	27200241.613op/s	27353886.153op/s ± 37362.218op/s	27353434.913op/s ± 23395.936op/s	27377752.926op/s	27418353.113op/s	27439355.214op/s	27461232.709op/s	0.39%	-0.221	1.174	0.14%	2641.908op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	44.898µs	45.093µs ± 0.145µs	45.088µs ± 0.131µs	45.210µs	45.337µs	45.389µs	45.418µs	0.73%	0.229	-1.099	0.32%	0.010µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	22017793.802op/s	22176544.599op/s ± 71398.130op/s	22179038.821op/s ± 64345.664op/s	22244813.078op/s	22270178.440op/s	22271787.398op/s	22272662.929op/s	0.42%	-0.221	-1.107	0.32%	5048.610op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[535.048µs; 535.177µs] or [-0.012%; +0.012%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1868541.757op/s; 1868991.504op/s] or [-0.012%; +0.012%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[380.489µs; 380.622µs] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2627283.462op/s; 2628197.224op/s] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[194.875µs; 194.932µs] or [-0.015%; +0.015%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5129991.373op/s; 5131506.137op/s] or [-0.015%; +0.015%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[36.551µs; 36.565µs] or [-0.019%; +0.019%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[27348708.109op/s; 27359064.197op/s] or [-0.019%; +0.019%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[45.073µs; 45.113µs] or [-0.045%; +0.045%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[22166649.505op/s; 22186439.694op/s] or [-0.045%; +0.045%]	None	None	None

Group 12

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	87.016µs	87.270µs ± 0.129µs	87.264µs ± 0.055µs	87.315µs	87.404µs	87.565µs	88.491µs	1.41%	5.000	42.121	0.15%	0.009µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[87.252µs; 87.288µs] or [-0.021%; +0.021%]	None	None	None

Group 13

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	33.567µs	34.055µs ± 0.902µs	33.640µs ± 0.043µs	33.738µs	35.965µs	36.015µs	37.764µs	12.26%	1.816	1.718	2.64%	0.064µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[33.930µs; 34.180µs] or [-0.367%; +0.367%]	None	None	None

Group 14

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	156.176µs	156.831µs ± 0.308µs	156.817µs ± 0.173µs	156.987µs	157.339µs	157.836µs	158.427µs	1.03%	1.244	4.267	0.20%	0.022µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[156.788µs; 156.874µs] or [-0.027%; +0.027%]	None	None	None

Group 15

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`dc2c080`	1758815247	paullgdc/data-pipeline/stats_collector_hmap

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
ip_address/quantize_peer_ip_address_benchmark	execution_time	4.980µs	5.069µs ± 0.044µs	5.067µs ± 0.024µs	5.086µs	5.149µs	5.154µs	5.156µs	1.75%	0.087	-0.767	0.88%	0.003µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark	execution_time	[5.063µs; 5.075µs] or [-0.122%; +0.122%]	None	None	None

Baseline

Omitted due to size.

codecov-commenter · 2025-09-19T17:05:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.60%. Comparing base (f61c42a) to head (dc2c080).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1234      +/-   ##
==========================================
- Coverage   71.65%   71.60%   -0.06%     
==========================================
  Files         355      355              
  Lines       56317    56309       -8     
==========================================
- Hits        40354    40320      -34     
- Misses      15963    15989      +26

Components	Coverage Δ
datadog-crashtracker	`49.30% <ø> (-0.03%)`	⬇️
datadog-crashtracker-ffi	`5.93% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`90.45% <100.00%> (+0.03%)`	⬆️
data-pipeline-ffi	`88.19% <ø> (ø)`
ddcommon	`84.29% <ø> (ø)`
ddcommon-ffi	`73.84% <ø> (ø)`
ddtelemetry	`59.98% <ø> (ø)`
ddtelemetry-ffi	`21.24% <ø> (ø)`
dogstatsd-client	`83.26% <ø> (ø)`
datadog-ipc	`82.49% <ø> (ø)`
datadog-profiling	`76.90% <ø> (ø)`
datadog-profiling-ffi	`62.12% <ø> (ø)`
datadog-sidecar	`36.35% <ø> (-0.73%)`	⬇️
datdog-sidecar-ffi	`7.85% <ø> (-3.52%)`	⬇️
spawn-worker	`55.35% <ø> (ø)`
tinybytes	`92.22% <ø> (ø)`
datadog-trace-normalization	`98.24% <ø> (ø)`
datadog-trace-obfuscation	`94.17% <ø> (ø)`
datadog-trace-protobuf	`59.65% <ø> (ø)`
datadog-trace-utils	`89.77% <ø> (ø)`
datadog-tracer-flare	`54.52% <ø> (ø)`
datadog-log	`76.31% <ø> (ø)`


VianneyRuhlmann

LGTM

VianneyRuhlmann · 2025-09-23T14:57:41Z

data-pipeline/src/span_concentrator/aggregation.rs

-                .into_iter()
-                .map(|(key, value)| (Cow::from(key.into_owned()), Cow::from(value.into_owned())))
-                .collect(),
+            peer_tags,


Very minor nit: this should be:

peer_tags, is_trace_root: span.parent_id == 0,

to match the field order in the struct.

paullegranddc · 2025-09-25T15:50:32Z

/merge

dd-devflow-routing-codex · 2025-09-25T15:50:40Z

View all feedbacks in Devflow UI.

2025-09-25 15:50:40 UTC ℹ️ Start processing command /merge

2025-09-25 15:50:47 UTC ℹ️ MergeQueue: waiting for PR to be ready

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.

2025-09-25 16:28:23 UTC ℹ️ MergeQueue: merge request added to the queue

The expected merge time in main is approximately 45m (p90).

2025-09-25 16:58:59 UTC ℹ️ MergeQueue: This merge request was merged

paullegranddc requested review from a team as code owners September 19, 2025 16:50

github-actions bot added the data-pipeline label Sep 19, 2025

VianneyRuhlmann approved these changes Sep 23, 2025

Merge branch 'main' into paullgdc/data-pipeline/stats_collector_hmap

dc2c080

paullegranddc force-pushed the paullgdc/data-pipeline/stats_collector_hmap branch from f96a538 to dc2c080 Compare September 25, 2025 15:47

dd-devflow bot added mergequeue-status: waiting mergequeue-status: queued mergequeue-status: in_progress and removed mergequeue-status: waiting mergequeue-status: queued labels Sep 25, 2025

dd-mergequeue bot merged commit 8719ad8 into main Sep 25, 2025
37 checks passed

dd-devflow bot removed the mergequeue-status: in_progress label Sep 25, 2025

dd-mergequeue bot deleted the paullgdc/data-pipeline/stats_collector_hmap branch September 25, 2025 16:58

dd-devflow bot added the mergequeue-status: done label Sep 25, 2025
