[Data-Pipeline] [APMSP-1240] Add concentrator #570

VianneyRuhlmann · 2024-08-02T13:08:35Z

What does this PR do?

Implement the stats concentrator from the trace agent, to allow stats computation in the data pipeline.

Motivation

This is required to allow stats computation in libdatadog.
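As a rough illustration of what a stats concentrator does, the sketch below groups spans into fixed-size time buckets keyed by the span end time and accumulates counters per aggregation key. It is a simplified sketch, not the code added in this PR: `MiniConcentrator`, `SpanSummary`, `GroupedStats`, the `(service, resource)` key and the flush rule are hypothetical names and choices made for the example, and latency sketches (ddsketch) are left out.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down view of a span (not `pb::Span`).
pub struct SpanSummary {
    pub service: String,
    pub resource: String,
    pub start_ns: u64,
    pub duration_ns: u64,
    pub is_error: bool,
}

/// Counters accumulated for one aggregation key inside one bucket.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct GroupedStats {
    pub hits: u64,
    pub errors: u64,
    pub duration_ns: u64,
}

/// Assumed aggregation key for this sketch: (service, resource).
type Key = (String, String);

pub struct MiniConcentrator {
    bucket_size_ns: u64,
    /// Bucket start timestamp -> per-key stats.
    buckets: HashMap<u64, HashMap<Key, GroupedStats>>,
}

impl MiniConcentrator {
    pub fn new(bucket_size_ns: u64) -> Self {
        assert!(bucket_size_ns > 0, "bucket size must be non-zero");
        Self { bucket_size_ns, buckets: HashMap::new() }
    }

    /// Round a timestamp down to the start of its bucket.
    fn align(&self, timestamp_ns: u64) -> u64 {
        timestamp_ns - timestamp_ns % self.bucket_size_ns
    }

    /// Add a span to the bucket that contains its end time.
    pub fn add_span(&mut self, span: &SpanSummary) {
        let bucket_start = self.align(span.start_ns + span.duration_ns);
        let stats = self
            .buckets
            .entry(bucket_start)
            .or_default()
            .entry((span.service.clone(), span.resource.clone()))
            .or_default();
        stats.hits += 1;
        stats.errors += u64::from(span.is_error);
        stats.duration_ns += span.duration_ns;
    }

    /// Remove and return every bucket that starts before the bucket
    /// containing `now_ns`, oldest first.
    pub fn flush(&mut self, now_ns: u64) -> Vec<(u64, HashMap<Key, GroupedStats>)> {
        let cutoff = self.align(now_ns);
        let mut ready: Vec<u64> =
            self.buckets.keys().copied().filter(|start| *start < cutoff).collect();
        ready.sort_unstable();
        ready
            .into_iter()
            .filter_map(|start| self.buckets.remove(&start).map(|bucket| (start, bucket)))
            .collect()
    }
}
```

Keying buckets by aligned end time means a flush only has to compare bucket start timestamps against the current time, without inspecting individual spans.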

pr-commenter · 2024-08-02T13:13:00Z

Benchmarks

Comparison

Benchmark execution time: 2024-08-22 13:04:11

Comparing candidate commit b80115d in PR branch vianney/data-pipeline/add-stats-bucket with baseline commit a063681 in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 49 metrics, 2 unstable metrics.

scenario:benching deserializing traces from msgpack to their internal representation

🟥 execution_time [+106.827ns; +130.862ns] or [+7.888%; +9.663%]

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	73.599µs	74.004µs ± 0.244µs	74.040µs ± 0.111µs	74.122µs	74.180µs	74.579µs	76.203µs	2.92%	4.222	34.507	0.33%	0.017µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[73.970µs; 74.038µs] or [-0.046%; +0.046%]	None	None	None

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	1.613µs	1.613µs ± 0.000µs	1.613µs ± 0.000µs	1.614µs	1.614µs	1.614µs	1.615µs	0.11%	0.667	0.427	0.03%	0.000µs	1	200
credit_card/is_card_number/	throughput	619115987.889op/s	619773819.814op/s ± 172032.673op/s	619797432.167op/s ± 132781.038op/s	619921683.995op/s	619994578.024op/s	620026254.716op/s	620058703.960op/s	0.04%	-0.665	0.421	0.03%	12164.547op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	104.132µs	104.961µs ± 0.399µs	104.921µs ± 0.173µs	105.090µs	105.395µs	106.256µs	108.129µs	3.06%	3.390	21.487	0.38%	0.028µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	9248223.474op/s	9527449.046op/s ± 35728.755op/s	9530936.452op/s ± 15733.523op/s	9546656.392op/s	9569683.390op/s	9591877.401op/s	9603201.707op/s	0.76%	-3.257	20.204	0.37%	2526.405op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	96.120µs	97.420µs ± 0.638µs	97.255µs ± 0.217µs	97.511µs	98.771µs	99.569µs	100.671µs	3.51%	2.127	5.516	0.65%	0.045µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	9933306.047op/s	10265306.580op/s ± 66289.493op/s	10282234.870op/s ± 22893.528op/s	10303053.326op/s	10330014.117op/s	10341100.622op/s	10403637.920op/s	1.18%	-2.069	5.210	0.64%	4687.375op/s	1	200
credit_card/is_card_number/37828224631	execution_time	1.613µs	1.614µs ± 0.003µs	1.614µs ± 0.000µs	1.614µs	1.615µs	1.615µs	1.656µs	2.62%	13.140	177.908	0.19%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	603895800.206op/s	619610426.220op/s ± 1143629.041op/s	619687743.565op/s ± 149690.095op/s	619864295.425op/s	619968122.958op/s	620015229.096op/s	620100004.382op/s	0.07%	-13.099	177.119	0.18%	80866.785op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	94.226µs	95.473µs ± 0.637µs	95.436µs ± 0.430µs	95.848µs	96.595µs	97.062µs	98.336µs	3.04%	0.775	1.332	0.67%	0.045µs	1	200
credit_card/is_card_number/378282246310005	throughput	10169190.143op/s	10474577.715op/s ± 69515.375op/s	10478171.376op/s ± 47356.557op/s	10527907.645op/s	10567863.400op/s	10594468.604op/s	10612769.163op/s	1.28%	-0.721	1.145	0.66%	4915.479op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	94.092µs	94.619µs ± 0.329µs	94.529µs ± 0.180µs	94.774µs	95.124µs	95.323µs	97.194µs	2.82%	2.807	17.371	0.35%	0.023µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	10288691.961op/s	10568869.414op/s ± 36390.209op/s	10578745.720op/s ± 20166.834op/s	10595974.325op/s	10604456.152op/s	10610562.956op/s	10627871.593op/s	0.46%	-2.691	16.157	0.34%	2573.176op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	22.437µs	22.801µs ± 0.236µs	22.758µs ± 0.151µs	22.972µs	23.225µs	23.391µs	23.544µs	3.45%	0.629	-0.209	1.03%	0.017µs	1	200
credit_card/is_card_number/x371413321323331	throughput	42474097.815op/s	43863304.646op/s ± 451167.780op/s	43940681.112op/s ± 294007.391op/s	44196684.647op/s	44482224.919op/s	44560201.542op/s	44570224.755op/s	1.43%	-0.586	-0.286	1.03%	31902.380op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	1.613µs	1.614µs ± 0.001µs	1.613µs ± 0.000µs	1.614µs	1.614µs	1.615µs	1.617µs	0.20%	1.758	6.238	0.03%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	618511084.750op/s	619747102.386op/s ± 209659.925op/s	619777997.928op/s ± 140312.793op/s	619911909.602op/s	619979905.769op/s	619994572.288op/s	620025317.200op/s	0.04%	-1.753	6.201	0.03%	14825.195op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	85.920µs	86.670µs ± 0.371µs	86.620µs ± 0.229µs	86.874µs	87.303µs	87.859µs	88.547µs	2.23%	1.053	2.908	0.43%	0.026µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	11293430.787op/s	11538260.732op/s ± 49226.109op/s	11544719.270op/s ± 30481.909op/s	11570655.660op/s	11609433.617op/s	11620379.723op/s	11638758.229op/s	0.81%	-1.005	2.685	0.43%	3480.812op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	79.196µs	80.191µs ± 0.587µs	80.130µs ± 0.398µs	80.604µs	81.092µs	81.372µs	83.564µs	4.29%	1.200	4.314	0.73%	0.042µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	11966841.319op/s	12470934.508op/s ± 90558.210op/s	12479709.295op/s ± 62268.111op/s	12537550.592op/s	12594542.508op/s	12620404.298op/s	12626863.359op/s	1.18%	-1.098	3.672	0.72%	6403.432op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	1.613µs	1.614µs ± 0.000µs	1.613µs ± 0.000µs	1.614µs	1.614µs	1.615µs	1.615µs	0.11%	0.518	0.070	0.03%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	619069786.957op/s	619755601.322op/s ± 181047.181op/s	619776697.366op/s ± 143560.306op/s	619910149.541op/s	619992652.032op/s	620090057.729op/s	620137444.954op/s	0.06%	-0.517	0.066	0.03%	12801.969op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	77.070µs	78.334µs ± 0.473µs	78.275µs ± 0.297µs	78.611µs	79.190µs	79.526µs	80.294µs	2.58%	0.584	1.180	0.60%	0.033µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	12454237.973op/s	12766339.431op/s ± 76818.441op/s	12775546.501op/s ± 48514.648op/s	12819920.559op/s	12866112.585op/s	12936445.848op/s	12975273.892op/s	1.56%	-0.533	1.079	0.60%	5431.884op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	94.096µs	94.693µs ± 0.360µs	94.667µs ± 0.256µs	94.924µs	95.315µs	95.526µs	95.865µs	1.27%	0.416	-0.281	0.38%	0.025µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	10431361.396op/s	10560627.141op/s ± 40123.155op/s	10563398.353op/s ± 28553.170op/s	10591845.868op/s	10622308.920op/s	10626684.498op/s	10627456.033op/s	0.61%	-0.398	-0.311	0.38%	2837.136op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	22.442µs	22.824µs ± 0.220µs	22.786µs ± 0.140µs	22.941µs	23.208µs	23.472µs	23.677µs	3.91%	0.779	0.707	0.96%	0.016µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	42235324.259op/s	43818057.362op/s ± 418925.207op/s	43886366.716op/s ± 271386.563op/s	44119031.136op/s	44405682.328op/s	44526083.447op/s	44559543.629op/s	1.53%	-0.719	0.546	0.95%	29622.485op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[1.613µs; 1.614µs] or [-0.004%; +0.004%]	None	None	None
credit_card/is_card_number/	throughput	[619749977.740op/s; 619797661.888op/s] or [-0.004%; +0.004%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[104.906µs; 105.017µs] or [-0.053%; +0.053%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[9522497.384op/s; 9532400.708op/s] or [-0.052%; +0.052%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[97.331µs; 97.508µs] or [-0.091%; +0.091%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[10256119.494op/s; 10274493.666op/s] or [-0.089%; +0.089%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[1.613µs; 1.614µs] or [-0.026%; +0.026%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[619451930.234op/s; 619768922.206op/s] or [-0.026%; +0.026%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[95.385µs; 95.562µs] or [-0.092%; +0.092%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[10464943.553op/s; 10484211.877op/s] or [-0.092%; +0.092%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[94.573µs; 94.664µs] or [-0.048%; +0.048%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[10563826.081op/s; 10573912.747op/s] or [-0.048%; +0.048%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[22.768µs; 22.833µs] or [-0.143%; +0.143%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[43800777.131op/s; 43925832.161op/s] or [-0.143%; +0.143%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[1.613µs; 1.614µs] or [-0.005%; +0.005%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[619718045.536op/s; 619776159.235op/s] or [-0.005%; +0.005%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[86.618µs; 86.721µs] or [-0.059%; +0.059%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[11531438.467op/s; 11545082.997op/s] or [-0.059%; +0.059%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[80.109µs; 80.272µs] or [-0.101%; +0.101%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[12458384.011op/s; 12483485.005op/s] or [-0.101%; +0.101%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[1.613µs; 1.614µs] or [-0.004%; +0.004%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[619730509.924op/s; 619780692.720op/s] or [-0.004%; +0.004%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[78.268µs; 78.399µs] or [-0.084%; +0.084%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[12755693.134op/s; 12776985.729op/s] or [-0.083%; +0.083%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[94.643µs; 94.743µs] or [-0.053%; +0.053%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[10555066.458op/s; 10566187.824op/s] or [-0.053%; +0.053%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[22.793µs; 22.854µs] or [-0.133%; +0.133%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[43759998.357op/s; 43876116.367op/s] or [-0.133%; +0.133%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	1.260µs	1.473µs ± 0.059µs	1.481µs ± 0.036µs	1.513µs	1.553µs	1.574µs	1.576µs	6.42%	-1.143	2.733	4.02%	0.004µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[1.465µs; 1.481µs] or [-0.558%; +0.558%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	38.503µs	38.959µs ± 0.798µs	38.594µs ± 0.041µs	38.658µs	40.628µs	40.721µs	42.275µs	9.54%	1.798	1.713	2.04%	0.056µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[38.848µs; 39.070µs] or [-0.284%; +0.284%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	300.119ns	312.835ns ± 13.113ns	305.839ns ± 4.660ns	319.026ns	340.562ns	341.041ns	342.824ns	12.09%	1.003	-0.412	4.18%	0.927ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[311.018ns; 314.652ns] or [-0.581%; +0.581%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.721µs	2.747µs ± 0.010µs	2.748µs ± 0.007µs	2.753µs	2.761µs	2.771µs	2.785µs	1.38%	0.101	0.415	0.37%	0.001µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.745µs; 2.748µs] or [-0.051%; +0.051%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	17.192µs	23.626µs ± 16.041µs	17.364µs ± 0.049µs	17.432µs	55.276µs	60.443µs	139.268µs	702.06%	3.815	19.757	67.73%	1.134µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[21.402µs; 25.849µs] or [-9.410%; +9.410%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	136.553µs	137.103µs ± 0.214µs	137.077µs ± 0.112µs	137.202µs	137.449µs	137.870µs	138.137µs	0.77%	1.229	3.917	0.16%	0.015µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[137.073µs; 137.132µs] or [-0.022%; +0.022%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	311.265µs	314.124µs ± 0.951µs	314.152µs ± 0.665µs	314.815µs	315.594µs	315.788µs	315.991µs	0.59%	-0.242	-0.368	0.30%	0.067µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	3164646.324op/s	3183487.912op/s ± 9646.542op/s	3183173.453op/s ± 6739.484op/s	3189845.791op/s	3200455.488op/s	3205707.479op/s	3212698.562op/s	0.93%	0.257	-0.351	0.30%	682.113op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	27.958µs	28.036µs ± 0.044µs	28.033µs ± 0.035µs	28.066µs	28.118µs	28.147µs	28.187µs	0.55%	0.582	-0.045	0.16%	0.003µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	35477856.019op/s	35669024.234op/s ± 55507.702op/s	35672151.660op/s ± 44927.528op/s	35718066.655op/s	35743940.175op/s	35752977.017op/s	35767642.390op/s	0.27%	-0.574	-0.062	0.16%	3924.987op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	16.714µs	16.750µs ± 0.023µs	16.740µs ± 0.016µs	16.768µs	16.783µs	16.804µs	16.884µs	0.86%	1.237	3.849	0.14%	0.002µs	1	200
normalization/normalize_name/normalize_name/good	throughput	59228176.550op/s	59701684.371op/s ± 83308.654op/s	59735765.466op/s ± 55678.942op/s	59771008.655op/s	59793821.279op/s	59801310.046op/s	59829536.524op/s	0.16%	-1.219	3.722	0.14%	5890.811op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[313.992µs; 314.256µs] or [-0.042%; +0.042%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[3182150.994op/s; 3184824.830op/s] or [-0.042%; +0.042%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[28.030µs; 28.042µs] or [-0.022%; +0.022%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[35661331.400op/s; 35676717.067op/s] or [-0.022%; +0.022%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[16.747µs; 16.753µs] or [-0.019%; +0.019%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[59690138.593op/s; 59713230.149op/s] or [-0.019%; +0.019%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.417µs	2.913µs ± 1.401µs	2.750µs ± 0.023µs	2.772µs	3.032µs	12.768µs	15.793µs	474.22%	7.961	63.259	47.98%	0.099µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[2.719µs; 3.107µs] or [-6.666%; +6.666%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`b80115d`	1724331255	vianney/data-pipeline/add-stats-bucket

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	618.435µs	619.165µs ± 0.438µs	619.096µs ± 0.263µs	619.417µs	619.884µs	620.526µs	621.828µs	0.44%	1.732	6.668	0.07%	0.031µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1608161.915op/s	1615080.149op/s ± 1141.617op/s	1615259.681op/s ± 686.184op/s	1615876.308op/s	1616529.832op/s	1616685.042op/s	1616984.111op/s	0.11%	-1.721	6.584	0.07%	80.724op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	386.765µs	389.091µs ± 1.043µs	389.114µs ± 0.709µs	389.663µs	390.796µs	391.739µs	393.035µs	1.01%	0.373	0.470	0.27%	0.074µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2544305.318op/s	2570108.657op/s ± 6881.899op/s	2569937.791op/s ± 4681.540op/s	2574957.154op/s	2580396.659op/s	2583566.154op/s	2585546.630op/s	0.61%	-0.355	0.434	0.27%	486.624op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	190.734µs	191.303µs ± 0.200µs	191.280µs ± 0.122µs	191.423µs	191.647µs	191.825µs	191.985µs	0.37%	0.380	0.531	0.10%	0.014µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5208753.484op/s	5227310.164op/s ± 5475.229op/s	5227951.417op/s ± 3324.599op/s	5230865.068op/s	5235660.819op/s	5238480.169op/s	5242911.483op/s	0.29%	-0.373	0.524	0.10%	387.157op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	44.859µs	45.026µs ± 0.098µs	44.993µs ± 0.072µs	45.114µs	45.187µs	45.228µs	45.270µs	0.61%	0.392	-1.082	0.22%	0.007µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	22089802.621op/s	22209399.382op/s ± 48500.019op/s	22225570.197op/s ± 35604.574op/s	22250650.962op/s	22273667.677op/s	22284691.274op/s	22292312.082op/s	0.30%	-0.387	-1.087	0.22%	3429.469op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	49.087µs	49.989µs ± 0.390µs	50.134µs ± 0.158µs	50.262µs	50.403µs	50.494µs	51.073µs	1.87%	-0.848	-0.224	0.78%	0.028µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	19579959.775op/s	20005462.341op/s ± 157099.578op/s	19946626.538op/s ± 62573.642op/s	20067848.141op/s	20317649.202op/s	20349202.689op/s	20372113.230op/s	2.13%	0.872	-0.222	0.78%	11108.618op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[619.104µs; 619.225µs] or [-0.010%; +0.010%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1614921.932op/s; 1615238.366op/s] or [-0.010%; +0.010%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[388.947µs; 389.236µs] or [-0.037%; +0.037%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2569154.892op/s; 2571062.422op/s] or [-0.037%; +0.037%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[191.275µs; 191.331µs] or [-0.015%; +0.015%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5226551.350op/s; 5228068.978op/s] or [-0.015%; +0.015%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[45.013µs; 45.040µs] or [-0.030%; +0.030%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[22202677.746op/s; 22216121.019op/s] or [-0.030%; +0.030%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[49.935µs; 50.043µs] or [-0.108%; +0.108%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[19983689.850op/s; 20027234.832op/s] or [-0.109%; +0.109%]	None	None	None

Baseline

Omitted due to size.

codecov-commenter · 2024-08-02T13:17:26Z

Codecov Report

Attention: Patch coverage is 99.78918% with 3 lines in your changes missing coverage. Please review.

Project coverage is 72.90%. Comparing base (a063681) to head (b80115d).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #570      +/-   ##
==========================================
+ Coverage   71.73%   72.90%   +1.17%     
==========================================
  Files         238      241       +3     
  Lines       32941    34364    +1423     
==========================================
+ Hits        23631    25054    +1423     
  Misses       9310     9310

Components	Coverage Δ
crashtracker	`20.66% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`90.12% <99.78%> (+40.12%)`	⬆️
data-pipeline-ffi	`0.00% <ø> (ø)`
ddcommon	`82.11% <ø> (ø)`
ddcommon-ffi	`68.11% <ø> (ø)`
ddtelemetry	`59.02% <ø> (ø)`
ipc	`84.29% <ø> (ø)`
profiling	`84.26% <ø> (ø)`
profiling-ffi	`77.42% <ø> (ø)`
serverless	`0.00% <ø> (ø)`
sidecar	`40.23% <ø> (ø)`
sidecar-ffi	`0.00% <ø> (ø)`
spawn-worker	`54.87% <ø> (ø)`
trace-mini-agent	`70.88% <ø> (ø)`
trace-normalization	`98.25% <ø> (ø)`
trace-obfuscation	`95.73% <ø> (ø)`
trace-protobuf	`77.67% <ø> (ø)`
trace-utils	`93.00% <100.00%> (+0.03%)`	⬆️

ekump · 2024-08-05T19:55:12Z

trace-utils/src/trace_utils.rs

@@ -410,6 +412,13 @@ pub fn compute_top_level_span(trace: &mut [pb::Span]) {
 }
 }

+pub fn has_top_level(span: &pb::Span) -> bool {


These public functions should have unit tests
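A unit test along these lines could cover the helper. This is only a sketch: `Span` here is a stand-in for `pb::Span` with just a `metrics` map, the `_top_level` metric key is an assumption borrowed from the trace agent, and the body of `has_top_level` is a guess at the behaviour rather than the code in this PR.

```rust
use std::collections::HashMap;

/// Stand-in for `pb::Span`; only the field this sketch needs.
#[derive(Default)]
pub struct Span {
    pub metrics: HashMap<String, f64>,
}

/// Assumed metric key; the real constant lives in trace-utils.
const TOP_LEVEL_KEY: &str = "_top_level";

/// Hypothetical re-implementation: a span is top level when the
/// metric is present and equal to 1.
pub fn has_top_level(span: &Span) -> bool {
    span.metrics.get(TOP_LEVEL_KEY) == Some(&1.0)
}

/// Test helper: build a span carrying a single metric.
pub fn span_with_metric(key: &str, value: f64) -> Span {
    let mut span = Span::default();
    span.metrics.insert(key.to_string(), value);
    span
}

#[test]
fn has_top_level_is_true_when_metric_is_one() {
    assert!(has_top_level(&span_with_metric(TOP_LEVEL_KEY, 1.0)));
}

#[test]
fn has_top_level_is_false_otherwise() {
    // Metric present but not equal to 1.
    assert!(!has_top_level(&span_with_metric(TOP_LEVEL_KEY, 0.0)));
    // Unrelated metric only.
    assert!(!has_top_level(&span_with_metric("other", 1.0)));
    // No metrics at all.
    assert!(!has_top_level(&Span::default()));
}
```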

ekump · 2024-08-05T20:10:31Z

data-pipeline/src/concentrator/mod.rs

+ }
+
+ pub fn add_span(&mut self, span: &pb::Span) -> Result<()> {
+ if !(trace_utils::has_top_level(span)


minor: I think this if statement is complex enough that it should be moved into a separate function for readability / maintainability purposes. Something along the lines of:

fn should_ignore_span(span: &Span, compute_stats_by_span_kind: bool) -> bool {
    !(trace_utils::has_top_level(span)
        || trace_utils::is_measured(span)
        || (compute_stats_by_span_kind && compute_stats_for_span_kind(span)))
        || trace_utils::is_partial_snapshot(span)
}
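For reference, a self-contained version of that predicate might look like the sketch below. The `Span` type is a stand-in for `pb::Span`, and the helper bodies (the metric keys `_top_level`, `_dd.measured`, `_dd.partial_version`, and the list of span kinds) are assumptions modelled on the trace agent, not the implementations in `trace_utils`.

```rust
use std::collections::HashMap;

/// Stand-in for `pb::Span` with only the maps this sketch reads.
#[derive(Default)]
pub struct Span {
    pub meta: HashMap<String, String>,
    pub metrics: HashMap<String, f64>,
}

fn metric_is_one(span: &Span, key: &str) -> bool {
    span.metrics.get(key) == Some(&1.0)
}

// The four helpers below are hypothetical stand-ins for the real ones.
pub fn has_top_level(span: &Span) -> bool {
    metric_is_one(span, "_top_level")
}

pub fn is_measured(span: &Span) -> bool {
    metric_is_one(span, "_dd.measured")
}

pub fn is_partial_snapshot(span: &Span) -> bool {
    span.metrics
        .get("_dd.partial_version")
        .map_or(false, |version| *version >= 0.0)
}

pub fn compute_stats_for_span_kind(span: &Span) -> bool {
    span.meta.get("span.kind").map_or(false, |kind| {
        matches!(
            kind.to_lowercase().as_str(),
            "server" | "consumer" | "client" | "producer"
        )
    })
}

/// A span is ignored when it is not eligible for stats (not top level,
/// not measured, and not selected by span kind) or when it is a partial
/// snapshot.
pub fn should_ignore_span(span: &Span, compute_stats_by_span_kind: bool) -> bool {
    !(has_top_level(span)
        || is_measured(span)
        || (compute_stats_by_span_kind && compute_stats_for_span_kind(span)))
        || is_partial_snapshot(span)
}
```

Taking `compute_stats_by_span_kind` as a parameter keeps the function free of `self`, so it can be tested without building a concentrator.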

bantonsson

Looks good in general.

bantonsson · 2024-08-20T12:37:45Z

data-pipeline/src/concentrator/aggregation.rs

+ pub is_synthetics_request: bool,
+ pub peer_tags: Vec<Tag>,
+ pub is_trace_root: bool,
+}


This is not for this PR, but maybe the AggregationKey should cache a precomputed hash value (since it's immutable anyway), and implement Hash by itself.
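One way this idea could be sketched is shown below. The names `KeyFields`, `CachedKey` and `count_hits` are hypothetical, `peer_tags` uses plain string pairs in place of `Tag`, and `resource_name` is an invented extra field; only `is_synthetics_request`, `peer_tags` and `is_trace_root` come from the diff above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the aggregation key fields.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct KeyFields {
    pub resource_name: String, // hypothetical field
    pub is_synthetics_request: bool,
    pub peer_tags: Vec<(String, String)>, // stand-in for Vec<Tag>
    pub is_trace_root: bool,
}

/// Immutable key that hashes its fields once, at construction.
#[derive(Debug, Clone)]
pub struct CachedKey {
    fields: KeyFields,
    cached_hash: u64,
}

impl CachedKey {
    pub fn new(fields: KeyFields) -> Self {
        let mut hasher = DefaultHasher::new();
        fields.hash(&mut hasher);
        let cached_hash = hasher.finish();
        Self { fields, cached_hash }
    }

    pub fn cached_hash(&self) -> u64 {
        self.cached_hash
    }
}

impl PartialEq for CachedKey {
    fn eq(&self, other: &Self) -> bool {
        // Cheap rejection on the hash, then a full field comparison.
        self.cached_hash == other.cached_hash && self.fields == other.fields
    }
}

impl Eq for CachedKey {}

impl Hash for CachedKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the precomputed value to the map's hasher.
        state.write_u64(self.cached_hash);
    }
}

/// Usage example: count how often each key occurs.
pub fn count_hits(keys: Vec<CachedKey>) -> HashMap<CachedKey, u64> {
    let mut counts = HashMap::new();
    for key in keys {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}
```

Equal fields always produce the same cached value, so `Hash` stays consistent with `Eq`. The sketch relies on the fields being private and never mutated after `new`.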

VianneyRuhlmann requested review from a team as code owners August 2, 2024 13:08

github-actions bot added mini-agent common data-pipeline labels Aug 2, 2024

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from 401f75c to fabcd74 Compare August 5, 2024 16:28

VianneyRuhlmann requested a review from a team as a code owner August 5, 2024 16:31

VianneyRuhlmann requested a review from omerli August 5, 2024 16:31

VianneyRuhlmann added 3 commits August 5, 2024 18:45

Add helpers to trace utils

60a97a6

Move concentrator

409ec48

Fix clippy warnings

6778da6

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from 33cd1cf to 6778da6 Compare August 5, 2024 16:47

VianneyRuhlmann requested review from a team as code owners August 5, 2024 16:47

VianneyRuhlmann marked this pull request as draft August 5, 2024 16:49

VianneyRuhlmann changed the base branch from vianney/data-pipeline/stats-computation to main August 5, 2024 16:49

VianneyRuhlmann removed request for a team and omerli August 5, 2024 16:50

Fix trace utils

b6066cd

github-actions bot removed the common label Aug 5, 2024

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from 89bc718 to c750b0d Compare August 5, 2024 17:02

Add ddsketch to datapipeline

1a295e5

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from c750b0d to 1a295e5 Compare August 5, 2024 17:07

VianneyRuhlmann changed the title ~~Add concentrator~~ [Data-Pipeline] Add concentrator Aug 5, 2024

VianneyRuhlmann marked this pull request as ready for review August 5, 2024 17:34

ekump reviewed Aug 5, 2024

VianneyRuhlmann added 2 commits August 6, 2024 11:59

Use helper function to ignore span

bb8ea40

Add test for has_top_level

ad4d617

VianneyRuhlmann changed the title ~~[Data-Pipeline] Add concentrator~~ [Data-Pipeline] [APMSP-1240] Add concentrator Aug 6, 2024

VianneyRuhlmann added 2 commits August 8, 2024 17:21

Merge branch 'main' into vianney/data-pipeline/add-stats-bucket

785f6d5

Merge branch 'main' into vianney/data-pipeline/add-stats-bucket

1295857

bantonsson approved these changes Aug 20, 2024

Make struct fields private

60be546

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from c6b0466 to 60be546 Compare August 21, 2024 11:42

VianneyRuhlmann added 2 commits August 21, 2024 14:33

Change visibility to super

827c562

Move tests to a separate module

d198856

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch 2 times, most recently from 1133a4e to ea7bee7 Compare August 22, 2024 11:40

Rename Concentrator to SpanConcentrator

a517d6b

Change concentrator name to be consistent with the trace agent

VianneyRuhlmann force-pushed the vianney/data-pipeline/add-stats-bucket branch from ea7bee7 to a517d6b Compare August 22, 2024 11:54

VianneyRuhlmann added 2 commits August 22, 2024 14:31

Merge branch 'main' into vianney/data-pipeline/add-stats-bucket

716a81b

Merge branch 'main' into vianney/data-pipeline/add-stats-bucket

b80115d

VianneyRuhlmann merged commit 9e3cb7c into main Aug 22, 2024
34 checks passed

VianneyRuhlmann deleted the vianney/data-pipeline/add-stats-bucket branch August 22, 2024 13:18
