FileStream optimizations #49975

Merged
merged 12 commits into from
Mar 23, 2021

jozkee (Member) commented Mar 22, 2021

This PR merges the optimizations made in #49145 and #49638 and rebases them on top of the changes recently introduced by @adamsitnik in #48813.

Fixes #49541.
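
The issue this PR closes is titled "Avoid expensive GetFileInformationByHandleEx syscall if possible". As a rough illustration of that general idea only, here is a minimal Python sketch (not the PR's C# code) that caches a file's length instead of asking the OS on every call. The class `CachedLengthFile`, its `exclusive` flag, and the `stat_calls` counter are hypothetical names introduced for this example, and `os.fstat` merely stands in for the length query. The sketch assumes caching is only safe when no one else can change the file.

```python
import os
import tempfile


class CachedLengthFile:
    """Hypothetical wrapper that caches the file length (illustration only)."""

    def __init__(self, path, exclusive):
        self._fd = os.open(path, os.O_RDONLY)
        # Assumption: the caller knows whether anyone else may write to the file.
        self._exclusive = exclusive
        self._length = None   # cached length, filled on first query
        self.stat_calls = 0   # number of real length queries, for the demo

    def length(self):
        # Serve from the cache only when the file cannot change underneath us.
        if self._exclusive and self._length is not None:
            return self._length
        self.stat_calls += 1
        size = os.fstat(self._fd).st_size
        if self._exclusive:
            self._length = size
        return size

    def close(self):
        os.close(self._fd)


def demo():
    """Query the length three times and report how many real queries happened."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1024)
        path = tmp.name
    try:
        f = CachedLengthFile(path, exclusive=True)
        sizes = [f.length() for _ in range(3)]
        calls = f.stat_calls
        f.close()
        return sizes, calls
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

With `exclusive=False` the wrapper falls back to querying the OS every time, which is the conservative choice when another process could grow or truncate the file.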

I had to make one adjustment (64e5174) to satisfy the tests added by #49754; otherwise, everything remains as it was in the old PRs.

Perf results:
There is an allocation increase of 16 B, which I suspect is caused by the newly added fields _share and _length.

BenchmarkDotNet=v0.12.1.1521-nightly, OS=Windows 10.0.19042.868 (20H2/October2020Update)
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.100-preview.2.21153.28
  [Host]     : .NET 6.0.0 (6.0.21.12307), X64 RyuJIT
  Job-IRUTZB : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT
  Job-LEWZXC : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1  
| Method | Toolchain | fileSize | userBufferSize | options | Mean | Error | StdDev | Median | Min | Max | Ratio | Allocated |
|--------|-----------|----------|----------------|---------|-----:|------:|-------:|-------:|----:|----:|------:|----------:|
| Read | base | 1024 | 1024 | None | 42.15 μs | 0.409 μs | 0.342 μs | 42.05 μs | 41.67 μs | 42.89 μs | 1.00 | 4,328 B |
| Read | feature | 1024 | 1024 | None | 43.11 μs | 0.918 μs | 1.057 μs | 43.01 μs | 41.61 μs | 45.08 μs | 1.01 | 4,344 B |
| Write | base | 1024 | 1024 | None | 1,034.05 μs | 21.974 μs | 22.565 μs | 1,038.34 μs | 976.98 μs | 1,073.31 μs | 1.00 | 4,328 B |
| Write | feature | 1024 | 1024 | None | 1,042.44 μs | 39.041 μs | 44.960 μs | 1,045.14 μs | 965.48 μs | 1,130.16 μs | 1.01 | 4,344 B |
| ReadAsync | base | 1024 | 1024 | None | 86.86 μs | 2.158 μs | 2.485 μs | 86.53 μs | 83.21 μs | 92.46 μs | 1.00 | 5,072 B |
| ReadAsync | feature | 1024 | 1024 | None | 86.38 μs | 1.321 μs | 1.236 μs | 85.92 μs | 84.78 μs | 88.24 μs | 0.99 | 5,088 B |
| WriteAsync | base | 1024 | 1024 | None | 1,062.25 μs | 44.082 μs | 50.765 μs | 1,068.88 μs | 987.20 μs | 1,143.93 μs | 1.00 | 4,416 B |
| WriteAsync | feature | 1024 | 1024 | None | 1,038.11 μs | 21.939 μs | 25.265 μs | 1,042.38 μs | 996.82 μs | 1,093.33 μs | 0.98 | 4,432 B |
| ReadAsync | base | 1024 | 1024 | Asynchronous | 108.90 μs | 2.082 μs | 2.045 μs | 108.18 μs | 106.64 μs | 113.34 μs | 1.00 | 5,240 B |
| ReadAsync | feature | 1024 | 1024 | Asynchronous | 102.23 μs | 1.785 μs | 1.670 μs | 101.36 μs | 100.33 μs | 105.17 μs | 0.94 | 5,256 B |
| WriteAsync | base | 1024 | 1024 | Asynchronous | 1,173.33 μs | 44.106 μs | 49.024 μs | 1,158.02 μs | 1,111.86 μs | 1,297.30 μs | 1.00 | 4,826 B |
| WriteAsync | feature | 1024 | 1024 | Asynchronous | 1,120.89 μs | 32.816 μs | 36.475 μs | 1,114.59 μs | 1,056.56 μs | 1,177.69 μs | 0.96 | 4,838 B |
| OpenClose | base | 1024 | ? | None | 35.24 μs | 0.641 μs | 0.569 μs | 35.30 μs | 33.88 μs | 36.25 μs | 1.00 | 208 B |
| OpenClose | feature | 1024 | ? | None | 33.30 μs | 0.582 μs | 0.516 μs | 33.18 μs | 32.78 μs | 34.36 μs | 0.94 | 224 B |
| LockUnlock | base | 1024 | ? | None | 47.73 μs | 1.236 μs | 1.374 μs | 48.01 μs | 45.64 μs | 50.13 μs | 1.00 | 208 B |
| LockUnlock | feature | 1024 | ? | None | 46.02 μs | 0.694 μs | 0.649 μs | 46.05 μs | 44.95 μs | 47.27 μs | 0.96 | 224 B |
| SeekForward | base | 1024 | ? | None | 679.71 μs | 12.643 μs | 11.826 μs | 677.27 μs | 663.22 μs | 699.84 μs | 1.00 | 208 B |
| SeekForward | feature | 1024 | ? | None | 43.52 μs | 0.798 μs | 0.746 μs | 43.33 μs | 42.57 μs | 44.90 μs | 0.06 | 224 B |
| SeekBackward | base | 1024 | ? | None | 2,409.29 μs | 34.724 μs | 30.782 μs | 2,413.36 μs | 2,364.14 μs | 2,463.39 μs | 1.00 | 208 B |
| SeekBackward | feature | 1024 | ? | None | 1,837.29 μs | 33.029 μs | 30.895 μs | 1,826.06 μs | 1,801.63 μs | 1,891.95 μs | 0.76 | 224 B |
| ReadByte | base | 1024 | ? | None | 43.99 μs | 0.570 μs | 0.505 μs | 43.83 μs | 43.33 μs | 45.06 μs | 1.00 | 4,328 B |
| ReadByte | feature | 1024 | ? | None | 44.32 μs | 0.822 μs | 0.729 μs | 44.12 μs | 43.43 μs | 45.92 μs | 1.01 | 4,344 B |
| WriteByte | base | 1024 | ? | None | 1,107.18 μs | 27.937 μs | 29.892 μs | 1,116.16 μs | 1,051.06 μs | 1,141.02 μs | 1.00 | 4,328 B |
| WriteByte | feature | 1024 | ? | None | 1,147.78 μs | 50.184 μs | 57.792 μs | 1,134.50 μs | 1,058.34 μs | 1,282.50 μs | 1.04 | 4,344 B |
| Flush | base | 1024 | ? | None | 4,671.56 μs | 268.251 μs | 308.919 μs | 4,777.24 μs | 4,159.09 μs | 5,111.56 μs | 1.00 | 4,329 B |
| Flush | feature | 1024 | ? | None | 4,739.84 μs | 299.293 μs | 344.666 μs | 4,916.42 μs | 4,233.60 μs | 5,190.47 μs | 1.02 | 4,345 B |
| FlushAsync | base | 1024 | ? | None | 6,215.18 μs | 323.164 μs | 372.156 μs | 6,392.51 μs | 5,676.52 μs | 6,679.63 μs | 1.00 | 275,049 B |
| FlushAsync | feature | 1024 | ? | None | 6,268.75 μs | 355.372 μs | 409.247 μs | 6,446.29 μs | 5,754.60 μs | 6,804.16 μs | 1.01 | 275,065 B |
| CopyToFile | base | 1024 | ? | None | 1,106.93 μs | 39.110 μs | 43.471 μs | 1,107.62 μs | 1,047.41 μs | 1,196.00 μs | 1.00 | 4,538 B |
| CopyToFile | feature | 1024 | ? | None | 1,150.55 μs | 34.885 μs | 38.775 μs | 1,156.79 μs | 1,088.84 μs | 1,213.33 μs | 1.04 | 4,569 B |
| CopyToFileAsync | base | 1024 | ? | None | 1,267.53 μs | 45.676 μs | 52.601 μs | 1,266.82 μs | 1,192.55 μs | 1,349.69 μs | 1.00 | 5,562 B |
| CopyToFileAsync | feature | 1024 | ? | None | 1,262.80 μs | 39.062 μs | 41.796 μs | 1,269.05 μs | 1,188.39 μs | 1,347.70 μs | 1.00 | 5,596 B |
| OpenClose | base | 1024 | ? | Asynchronous | 34.54 μs | 0.473 μs | 0.419 μs | 34.43 μs | 33.91 μs | 35.57 μs | 1.00 | 256 B |
| OpenClose | feature | 1024 | ? | Asynchronous | 34.97 μs | 0.652 μs | 0.610 μs | 34.79 μs | 34.12 μs | 35.99 μs | 1.01 | 272 B |
| LockUnlock | base | 1024 | ? | Asynchronous | 49.49 μs | 0.932 μs | 0.915 μs | 49.15 μs | 48.32 μs | 51.42 μs | 1.00 | 256 B |
| LockUnlock | feature | 1024 | ? | Asynchronous | 50.07 μs | 0.985 μs | 0.873 μs | 49.91 μs | 49.18 μs | 51.92 μs | 1.01 | 272 B |
| SeekForward | base | 1024 | ? | Asynchronous | 2,621.18 μs | 33.964 μs | 31.770 μs | 2,631.60 μs | 2,576.36 μs | 2,670.52 μs | 1.00 | 256 B |
| SeekForward | feature | 1024 | ? | Asynchronous | 44.94 μs | 0.763 μs | 0.714 μs | 44.62 μs | 44.00 μs | 46.49 μs | 0.02 | 272 B |
| SeekBackward | base | 1024 | ? | Asynchronous | 5,290.45 μs | 91.791 μs | 85.861 μs | 5,277.68 μs | 5,169.49 μs | 5,421.50 μs | 1.00 | 257 B |
| SeekBackward | feature | 1024 | ? | Asynchronous | 2,678.27 μs | 46.203 μs | 43.218 μs | 2,659.78 μs | 2,621.72 μs | 2,748.75 μs | 0.51 | 272 B |
| ReadByte | base | 1024 | ? | Asynchronous | 67.08 μs | 1.265 μs | 1.406 μs | 66.66 μs | 65.48 μs | 69.48 μs | 1.00 | 4,698 B |
| ReadByte | feature | 1024 | ? | Asynchronous | 61.55 μs | 1.344 μs | 1.547 μs | 61.35 μs | 59.63 μs | 64.36 μs | 0.92 | 4,715 B |
| WriteByte | base | 1024 | ? | Asynchronous | 1,232.27 μs | 34.232 μs | 39.421 μs | 1,229.81 μs | 1,177.33 μs | 1,311.58 μs | 1.00 | 4,727 B |
| WriteByte | feature | 1024 | ? | Asynchronous | 1,199.37 μs | 43.552 μs | 50.155 μs | 1,195.56 μs | 1,127.21 μs | 1,292.33 μs | 0.97 | 4,755 B |
| Flush | base | 1024 | ? | Asynchronous | 47,875.35 μs | 2,339.326 μs | 2,693.971 μs | 47,131.61 μs | 44,045.60 μs | 53,199.18 μs | 1.00 | 155,908 B |
| Flush | feature | 1024 | ? | Asynchronous | 14,945.85 μs | 494.672 μs | 569.665 μs | 14,670.86 μs | 14,347.10 μs | 16,130.53 μs | 0.31 | 152,444 B |
| FlushAsync | base | 1024 | ? | Asynchronous | 65,640.53 μs | 3,328.732 μs | 3,699.877 μs | 66,847.73 μs | 57,764.97 μs | 69,690.27 μs | 1.00 | 307,926 B |
| FlushAsync | feature | 1024 | ? | Asynchronous | 17,211.85 μs | 575.562 μs | 662.818 μs | 17,156.76 μs | 16,383.39 μs | 18,551.23 μs | 0.26 | 307,947 B |
| CopyToFileAsync | base | 1024 | ? | Asynchronous | 1,444.81 μs | 48.168 μs | 55.470 μs | 1,452.91 μs | 1,310.17 μs | 1,533.31 μs | 1.00 | 6,219 B |
| CopyToFileAsync | feature | 1024 | ? | Asynchronous | 1,838.93 μs | 251.001 μs | 278.987 μs | 1,855.54 μs | 1,358.64 μs | 2,233.14 μs | 1.27 | 6,191 B |
| Read | base | 1048576 | 512 | None | 1,054.38 μs | 137.667 μs | 158.537 μs | 984.19 μs | 896.97 μs | 1,362.93 μs | 1.00 | 4,328 B |
| Read | feature | 1048576 | 512 | None | 821.17 μs | 16.453 μs | 18.287 μs | 816.62 μs | 796.64 μs | 864.23 μs | 0.79 | 4,344 B |
| Write | base | 1048576 | 512 | None | 6,409.50 μs | 261.978 μs | 280.314 μs | 6,381.70 μs | 6,035.65 μs | 7,055.64 μs | 1.00 | 4,329 B |
| Write | feature | 1048576 | 512 | None | 6,052.51 μs | 206.714 μs | 238.052 μs | 6,025.02 μs | 5,631.29 μs | 6,402.86 μs | 0.94 | 4,345 B |
| ReadAsync | base | 1048576 | 512 | None | 1,660.33 μs | 103.631 μs | 119.342 μs | 1,696.76 μs | 1,479.36 μs | 1,877.67 μs | 1.00 | 86,671 B |
| ReadAsync | feature | 1048576 | 512 | None | 1,493.91 μs | 33.582 μs | 38.673 μs | 1,480.58 μs | 1,439.02 μs | 1,580.41 μs | 0.90 | 86,686 B |
| WriteAsync | base | 1048576 | 512 | None | 8,570.25 μs | 503.775 μs | 580.147 μs | 8,503.81 μs | 7,755.54 μs | 9,810.12 μs | 1.00 | 78,194 B |
| WriteAsync | feature | 1048576 | 512 | None | 7,507.00 μs | 271.475 μs | 312.631 μs | 7,496.33 μs | 7,008.36 μs | 8,183.34 μs | 0.88 | 78,313 B |
| ReadAsync | base | 1048576 | 512 | Asynchronous | 7,345.60 μs | 298.758 μs | 344.050 μs | 7,448.88 μs | 6,638.23 μs | 7,753.77 μs | 1.00 | 94,982 B |
| ReadAsync | feature | 1048576 | 512 | Asynchronous | 4,035.77 μs | 104.221 μs | 120.022 μs | 4,043.89 μs | 3,822.51 μs | 4,289.82 μs | 0.55 | 95,010 B |
| WriteAsync | base | 1048576 | 512 | Asynchronous | 21,952.38 μs | 931.408 μs | 996.596 μs | 21,620.60 μs | 20,673.96 μs | 24,621.32 μs | 1.00 | 86,598 B |
| WriteAsync | feature | 1048576 | 512 | Asynchronous | 11,898.93 μs | 409.308 μs | 471.360 μs | 11,744.84 μs | 11,238.01 μs | 12,783.82 μs | 0.54 | 86,691 B |
| Read | base | 1048576 | 4096 | None | 779.65 μs | 11.151 μs | 10.431 μs | 778.89 μs | 764.15 μs | 795.34 μs | 1.00 | 208 B |
| Read | feature | 1048576 | 4096 | None | 769.84 μs | 13.442 μs | 12.574 μs | 763.42 μs | 757.29 μs | 789.30 μs | 0.99 | 224 B |
| Write | base | 1048576 | 4096 | None | 6,123.14 μs | 182.392 μs | 202.728 μs | 6,067.18 μs | 5,808.37 μs | 6,607.42 μs | 1.00 | 209 B |
| Write | feature | 1048576 | 4096 | None | 6,263.92 μs | 334.956 μs | 372.302 μs | 6,149.22 μs | 5,788.45 μs | 7,113.08 μs | 1.02 | 225 B |
| ReadAsync | base | 1048576 | 4096 | None | 1,199.60 μs | 21.477 μs | 20.090 μs | 1,201.32 μs | 1,170.83 μs | 1,245.46 μs | 1.00 | 29,304 B |
| ReadAsync | feature | 1048576 | 4096 | None | 1,249.15 μs | 23.808 μs | 22.270 μs | 1,254.30 μs | 1,208.98 μs | 1,280.63 μs | 1.04 | 29,320 B |
| WriteAsync | base | 1048576 | 4096 | None | 7,255.20 μs | 260.501 μs | 299.993 μs | 7,161.12 μs | 6,891.52 μs | 7,986.36 μs | 1.00 | 55,975 B |
| WriteAsync | feature | 1048576 | 4096 | None | 7,308.10 μs | 239.714 μs | 266.442 μs | 7,328.21 μs | 6,883.72 μs | 7,776.77 μs | 1.01 | 55,993 B |
| ReadAsync | base | 1048576 | 4096 | Asynchronous | 7,090.25 μs | 147.726 μs | 170.122 μs | 7,053.45 μs | 6,865.73 μs | 7,429.79 μs | 1.00 | 80,509 B |
| ReadAsync | feature | 1048576 | 4096 | Asynchronous | 3,594.29 μs | 62.789 μs | 58.733 μs | 3,591.26 μs | 3,510.61 μs | 3,689.29 μs | 0.51 | 80,481 B |
| WriteAsync | base | 1048576 | 4096 | Asynchronous | 22,194.85 μs | 1,208.960 μs | 1,343.756 μs | 22,192.11 μs | 20,459.74 μs | 25,899.69 μs | 1.00 | 85,789 B |
| WriteAsync | feature | 1048576 | 4096 | Asynchronous | 11,185.35 μs | 364.893 μs | 390.431 μs | 11,060.87 μs | 10,781.64 μs | 12,278.44 μs | 0.50 | 85,880 B |
| Read_NoBuffering | base | 1048576 | 16384 | None | 250.69 μs | 4.669 μs | 4.368 μs | 248.68 μs | 245.63 μs | 257.51 μs | 1.00 | 144 B |
| Read_NoBuffering | feature | 1048576 | 16384 | None | 249.80 μs | 3.842 μs | 4.270 μs | 247.96 μs | 245.10 μs | 257.09 μs | 1.00 | 160 B |
| Write_NoBuffering | base | 1048576 | 16384 | None | 23,490.05 μs | 445.841 μs | 417.040 μs | 23,318.99 μs | 23,041.92 μs | 24,362.22 μs | 1.00 | 149 B |
| Write_NoBuffering | feature | 1048576 | 16384 | None | 23,552.72 μs | 390.004 μs | 364.810 μs | 23,326.66 μs | 23,137.19 μs | 24,126.65 μs | 1.00 | 394 B |
| ReadAsync_NoBuffering | base | 1048576 | 16384 | None | 460.01 μs | 4.112 μs | 3.645 μs | 460.09 μs | 454.56 μs | 467.22 μs | 1.00 | 7,648 B |
| ReadAsync_NoBuffering | feature | 1048576 | 16384 | None | 451.71 μs | 3.358 μs | 2.977 μs | 451.50 μs | 445.98 μs | 456.70 μs | 0.98 | 7,664 B |
| WriteAsync_NoBuffering | base | 1048576 | 16384 | None | 24,472.46 μs | 310.763 μs | 290.688 μs | 24,480.76 μs | 24,053.21 μs | 25,004.18 μs | 1.00 | 7,738 B |
| WriteAsync_NoBuffering | feature | 1048576 | 16384 | None | 24,494.74 μs | 407.382 μs | 381.065 μs | 24,415.37 μs | 23,963.72 μs | 25,341.92 μs | 1.00 | 8,206 B |
| ReadAsync_NoBuffering | base | 1048576 | 16384 | Asynchronous | 1,864.75 μs | 36.850 μs | 40.959 μs | 1,859.62 μs | 1,803.44 μs | 1,947.62 μs | 1.00 | 20,408 B |
| ReadAsync_NoBuffering | feature | 1048576 | 16384 | Asynchronous | 1,085.50 μs | 26.299 μs | 29.232 μs | 1,089.04 μs | 1,017.47 μs | 1,131.86 μs | 0.58 | 20,424 B |
| WriteAsync_NoBuffering | base | 1048576 | 16384 | Asynchronous | 29,825.94 μs | 564.912 μs | 580.123 μs | 29,780.46 μs | 28,886.20 μs | 31,051.36 μs | 1.00 | 20,426 B |
| WriteAsync_NoBuffering | feature | 1048576 | 16384 | Asynchronous | 26,490.92 μs | 524.760 μs | 490.861 μs | 26,350.99 μs | 25,964.88 μs | 27,401.54 μs | 0.89 | 20,429 B |
| CopyToFile | base | 1048576 | ? | None | 5,078.46 μs | 123.656 μs | 142.403 μs | 5,054.93 μs | 4,830.58 μs | 5,308.51 μs | 1.00 | 423 B |
| CopyToFile | feature | 1048576 | ? | None | 5,106.71 μs | 121.515 μs | 139.937 μs | 5,112.16 μs | 4,889.01 μs | 5,405.29 μs | 1.01 | 455 B |
| CopyToFileAsync | base | 1048576 | ? | None | 5,499.19 μs | 101.362 μs | 99.551 μs | 5,493.06 μs | 5,361.38 μs | 5,695.96 μs | 1.00 | 3,217 B |
| CopyToFileAsync | feature | 1048576 | ? | None | 5,472.62 μs | 87.302 μs | 77.391 μs | 5,464.24 μs | 5,362.25 μs | 5,589.45 μs | 0.99 | 3,259 B |
| CopyToFileAsync | base | 1048576 | ? | Asynchronous | 7,040.58 μs | 137.673 μs | 153.023 μs | 7,057.62 μs | 6,797.71 μs | 7,368.77 μs | 1.00 | 4,238 B |
| CopyToFileAsync | feature | 1048576 | ? | Asynchronous | 6,715.75 μs | 130.221 μs | 127.895 μs | 6,702.73 μs | 6,519.45 μs | 6,932.49 μs | 0.95 | 4,256 B |
| Read | base | 104857600 | 4096 | None | 92,104.99 μs | 1,296.098 μs | 1,212.370 μs | 91,663.05 μs | 90,452.25 μs | 94,162.40 μs | 1.00 | 220 B |
| Read | feature | 104857600 | 4096 | None | 91,536.81 μs | 1,149.948 μs | 1,019.399 μs | 91,276.55 μs | 90,328.73 μs | 93,809.38 μs | 0.99 | 236 B |
| Write | base | 104857600 | 4096 | None | 141,539.82 μs | 3,725.871 μs | 3,986.640 μs | 140,613.80 μs | 135,893.60 μs | 149,638.80 μs | 1.00 | 256 B |
| Write | feature | 104857600 | 4096 | None | 142,984.74 μs | 2,718.416 μs | 2,542.808 μs | 142,343.80 μs | 140,064.00 μs | 149,534.30 μs | 1.01 | 272 B |
| ReadAsync | base | 104857600 | 4096 | None | 126,035.11 μs | 2,331.713 μs | 2,181.086 μs | 126,048.35 μs | 122,238.45 μs | 128,763.55 μs | 1.00 | 2,869,000 B |
| ReadAsync | feature | 104857600 | 4096 | None | 124,737.77 μs | 2,624.516 μs | 3,022.397 μs | 124,263.02 μs | 120,696.70 μs | 131,166.20 μs | 0.99 | 2,869,104 B |
| WriteAsync | base | 104857600 | 4096 | None | 245,432.67 μs | 20,305.908 μs | 21,727.094 μs | 235,492.40 μs | 222,986.10 μs | 291,460.00 μs | 1.00 | 5,126,400 B |
| WriteAsync | feature | 104857600 | 4096 | None | 212,691.08 μs | 5,451.753 μs | 5,833.315 μs | 211,909.40 μs | 203,383.80 μs | 225,193.70 μs | 0.87 | 5,124,816 B |
| ReadAsync | base | 104857600 | 4096 | Asynchronous | 705,778.75 μs | 13,299.438 μs | 11,789.605 μs | 702,970.95 μs | 691,019.90 μs | 733,272.60 μs | 1.00 | 7,988,656 B |
| ReadAsync | feature | 104857600 | 4096 | Asynchronous | 365,151.64 μs | 5,520.016 μs | 5,163.426 μs | 365,748.10 μs | 356,890.40 μs | 376,450.80 μs | 0.52 | 7,988,592 B |
| WriteAsync | base | 104857600 | 4096 | Asynchronous | 2,421,993.96 μs | 228,717.280 μs | 254,218.651 μs | 2,369,430.40 μs | 2,111,048.40 μs | 2,937,009.00 μs | 1.00 | 8,094,752 B |
| WriteAsync | feature | 104857600 | 4096 | Asynchronous | 498,878.68 μs | 12,640.036 μs | 12,980.387 μs | 497,878.40 μs | 482,248.80 μs | 527,246.50 μs | 0.21 | 8,096,136 B |
| Read_NoBuffering | base | 104857600 | 16384 | None | 36,768.47 μs | 958.519 μs | 1,103.832 μs | 36,831.48 μs | 34,881.78 μs | 38,585.25 μs | 1.00 | 152 B |
| Read_NoBuffering | feature | 104857600 | 16384 | None | 35,123.74 μs | 697.593 μs | 716.377 μs | 34,966.59 μs | 33,963.21 μs | 36,597.23 μs | 0.95 | 167 B |
| Write_NoBuffering | base | 104857600 | 16384 | None | 60,559.25 μs | 1,765.154 μs | 1,888.695 μs | 59,959.00 μs | 57,659.65 μs | 64,166.00 μs | 1.00 | 728 B |
| Write_NoBuffering | feature | 104857600 | 16384 | None | 62,577.38 μs | 3,531.058 μs | 4,066.372 μs | 61,590.12 μs | 57,044.93 μs | 68,937.48 μs | 1.03 | 744 B |
| ReadAsync_NoBuffering | base | 104857600 | 16384 | None | 48,480.72 μs | 1,039.224 μs | 1,196.772 μs | 48,352.14 μs | 45,839.95 μs | 50,579.90 μs | 1.00 | 717,292 B |
| ReadAsync_NoBuffering | feature | 104857600 | 16384 | None | 48,983.83 μs | 453.513 μs | 424.216 μs | 49,049.00 μs | 48,355.55 μs | 49,586.40 μs | 1.01 | 717,308 B |
| WriteAsync_NoBuffering | base | 104857600 | 16384 | None | 69,978.50 μs | 2,263.067 μs | 2,421.457 μs | 69,716.90 μs | 66,493.05 μs | 74,234.95 μs | 1.00 | 717,866 B |
| WriteAsync_NoBuffering | feature | 104857600 | 16384 | None | 72,189.72 μs | 2,454.820 μs | 2,626.630 μs | 72,206.40 μs | 68,511.23 μs | 78,521.40 μs | 1.03 | 717,882 B |
| ReadAsync_NoBuffering | base | 104857600 | 16384 | Asynchronous | 195,393.84 μs | 5,908.521 μs | 6,567.306 μs | 193,028.80 μs | 187,306.60 μs | 210,169.60 μs | 1.00 | 1,997,288 B |
| ReadAsync_NoBuffering | feature | 104857600 | 16384 | Asynchronous | 106,702.51 μs | 2,112.871 μs | 2,169.763 μs | 106,223.95 μs | 103,839.30 μs | 110,466.70 μs | 0.55 | 1,997,952 B |
| WriteAsync_NoBuffering | base | 104857600 | 16384 | Asynchronous | 583,151.32 μs | 79,464.383 μs | 81,604.071 μs | 563,621.40 μs | 495,295.70 μs | 779,063.40 μs | 1.00 | 1,998,896 B |
| WriteAsync_NoBuffering | feature | 104857600 | 16384 | Asynchronous | 152,822.71 μs | 5,842.803 μs | 6,251.734 μs | 150,639.35 μs | 145,130.50 μs | 166,902.10 μs | 0.27 | 1,997,304 B |
| CopyToFile | base | 104857600 | ? | None | 57,322.51 μs | 1,303.006 μs | 1,448.288 μs | 56,689.82 μs | 55,256.95 μs | 60,040.78 μs | 1.00 | 1,072 B |
| CopyToFile | feature | 104857600 | ? | None | 57,167.15 μs | 810.539 μs | 718.521 μs | 56,955.45 μs | 56,365.05 μs | 58,591.82 μs | 0.99 | 994 B |
| CopyToFileAsync | base | 104857600 | ? | None | 65,886.27 μs | 1,733.748 μs | 1,996.587 μs | 65,805.48 μs | 62,745.60 μs | 70,337.43 μs | 1.00 | 181,298 B |
| CopyToFileAsync | feature | 104857600 | ? | None | 65,663.41 μs | 1,048.440 μs | 1,076.671 μs | 65,624.70 μs | 63,740.40 μs | 68,090.98 μs | 1.00 | 181,168 B |
| CopyToFileAsync | base | 104857600 | ? | Asynchronous | 151,415.33 μs | 9,692.878 μs | 11,162.333 μs | 151,676.40 μs | 136,253.30 μs | 171,368.00 μs | 1.00 | 251,656 B |
| CopyToFileAsync | feature | 104857600 | ? | Asynchronous | 103,349.35 μs | 2,303.080 μs | 2,559.867 μs | 103,683.20 μs | 99,412.30 μs | 109,105.25 μs | 0.68 | 251,512 B |
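
The 16 B allocation increase mentioned before the results can be checked against the Allocated column. The Python sketch below does that arithmetic for a handful of base/feature pairs copied from the results; the helper names `parse_bytes` and `allocation_deltas` are made up for this example, and the selection of rows is mine, not the author's.

```python
def parse_bytes(cell):
    """Turn an 'Allocated' cell such as '4,328 B' into an integer byte count."""
    return int(cell.split()[0].replace(",", ""))


# (method, fileSize, options) -> (base, feature), copied from the results above.
ALLOCATED = {
    ("Read", 1024, "None"): ("4,328 B", "4,344 B"),
    ("ReadAsync", 1024, "None"): ("5,072 B", "5,088 B"),
    ("OpenClose", 1024, "None"): ("208 B", "224 B"),
    ("OpenClose", 1024, "Asynchronous"): ("256 B", "272 B"),
    ("Read_NoBuffering", 1048576, "None"): ("144 B", "160 B"),
}


def allocation_deltas(rows):
    """Return feature-minus-base allocation, in bytes, for each benchmark key."""
    return {
        key: parse_bytes(feature) - parse_bytes(base)
        for key, (base, feature) in rows.items()
    }


if __name__ == "__main__":
    for key, delta in allocation_deltas(ALLOCATED).items():
        print(key, f"+{delta} B")
```

Not every row differs by exactly 16 B (for example, Write_NoBuffering at 1048576 goes from 149 B to 394 B), so the check supports the suspicion for the typical case rather than proving it for all benchmarks.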

@jozkee jozkee added this to the 6.0.0 milestone Mar 22, 2021
@jozkee jozkee self-assigned this Mar 22, 2021
@adamsitnik adamsitnik added the tenet-performance Performance related issue label Mar 22, 2021
adamsitnik (Member) left a comment

Overall looks good to me, but I've found a few minor things that should be improved before we merge the PR.

@jozkee could you please provide the benchmark numbers, including the new NoBuffering benchmarks from #1724?

jozkee and others added 2 commits March 22, 2021 15:06
@jozkee jozkee requested a review from adamsitnik March 23, 2021 03:44
adamsitnik (Member) left a comment

@jozkee the perf results look amazing <3 (up to two times faster ReadAsync and up to five times faster WriteAsync!!)
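
The "up to two times" and "up to five times" figures can be reproduced from the posted means as base divided by feature. The short Python sketch below does this for three Asynchronous rows taken from the results; the `speedup` helper and the choice of rows are assumptions made for illustration.

```python
# Mean times in microseconds, copied from the posted results: (base, feature).
MEANS_US = {
    "ReadAsync 1048576/4096 Asynchronous": (7090.25, 3594.29),
    "ReadAsync 104857600/4096 Asynchronous": (705778.75, 365151.64),
    "WriteAsync 104857600/4096 Asynchronous": (2421993.96, 498878.68),
}


def speedup(base, feature):
    """How many times faster the feature build is than the base build."""
    return base / feature


if __name__ == "__main__":
    for name, (base, feature) in MEANS_US.items():
        print(f"{name}: {speedup(base, feature):.2f}x")
```

Both ReadAsync rows come out just under 2x and the WriteAsync row just under 5x, which matches the wording of the comment.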

Could you please re-run the following benchmarks:

Faster Read and Write sync - I am surprised that we have gained something here, as we have more or less not touched the sync code path besides using a different set of parameters for ReadFile and WriteFile. (It's great to have the gain, but it would be good to understand it.)

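| Method | Toolchain | fileSize | userBufferSize | options | Mean | Error | StdDev | Median | Min | Max | Ratio | Allocated |
|--------|-----------|----------|----------------|---------|-----:|------:|-------:|-------:|----:|----:|------:|----------:|
| Read | base | 1048576 | 512 | None | 1,054.38 μs | 137.667 μs | 158.537 μs | 984.19 μs | 896.97 μs | 1,362.93 μs | 1.00 | 4,328 B |
| Read | feature | 1048576 | 512 | None | 821.17 μs | 16.453 μs | 18.287 μs | 816.62 μs | 796.64 μs | 864.23 μs | 0.79 | 4,344 B |
| Write | base | 1048576 | 512 | None | 6,409.50 μs | 261.978 μs | 280.314 μs | 6,381.70 μs | 6,035.65 μs | 7,055.64 μs | 1.00 | 4,329 B |
| Write | feature | 1048576 | 512 | None | 6,052.51 μs | 206.714 μs | 238.052 μs | 6,025.02 μs | 5,631.29 μs | 6,402.86 μs | 0.94 | 4,345 B |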

Why has copying small files regressed?

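| Method | Toolchain | fileSize | userBufferSize | options | Mean | Error | StdDev | Median | Min | Max | Ratio | Allocated |
|--------|-----------|----------|----------------|---------|-----:|------:|-------:|-------:|----:|----:|------:|----------:|
| CopyToFileAsync | base | 1024 | ? | Asynchronous | 1,444.81 μs | 48.168 μs | 55.470 μs | 1,452.91 μs | 1,310.17 μs | 1,533.31 μs | 1.00 | 6,219 B |
| CopyToFileAsync | feature | 1024 | ? | Asynchronous | 1,838.93 μs | 251.001 μs | 278.987 μs | 1,855.54 μs | 1,358.64 μs | 2,233.14 μs | 1.27 | 6,191 B |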

Since most of the results look great and the one tiny regression may be an outlier, I am going to squash it right now (the sooner we do that, the sooner we can test whether our changes have broken something in other repos).

jozkee (Member, Author) commented Mar 24, 2021

> Could you please re-run the following benchmarks:

@adamsitnik I ran the suggested benchmarks again and the results show neither a regression nor an improvement vs base (main); it seems to me that the benchmarks are a bit unstable.

Read/Write sync:

| Method |           Toolchain | fileSize | userBufferSize | options |       Mean |     Error |    StdDev |     Median |        Min |        Max | Ratio | Allocated |
|------- |-------------------- |--------- |--------------- |-------- |-----------:|----------:|----------:|-----------:|-----------:|-----------:|------:|----------:|
|   Read |   \base\CoreRun.exe |  1048576 |            512 |    None |   841.1 us |   7.18 us |   6.00 us |   840.9 us |   833.2 us |   854.1 us |  1.00 |      4 KB |
|   Read |\feature\CoreRun.exe |  1048576 |            512 |    None |   842.1 us |   8.90 us |   8.32 us |   843.6 us |   826.5 us |   857.3 us |  1.00 |      4 KB |
|        |                     |          |                |         |            |           |           |            |            |            |       |           |
|  Write |   \base\CoreRun.exe |  1048576 |            512 |    None | 6,825.0 us | 350.07 us | 403.15 us | 6,736.7 us | 6,310.4 us | 7,601.0 us |  1.00 |      4 KB |
|  Write |\feature\CoreRun.exe |  1048576 |            512 |    None | 6,834.6 us | 203.73 us | 226.44 us | 6,827.1 us | 6,507.2 us | 7,296.0 us |  1.01 |      4 KB |

CopyToFileAsync 1024 file size:

|          Method |           Toolchain |  fileSize |      options |       Mean |      Error |     StdDev |     Median |        Min |        Max | Ratio | Allocated |
|---------------- |-------------------- |---------- |------------- |-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|------:|----------:|
| CopyToFileAsync |   \base\CoreRun.exe |      1024 |         None |   1.561 ms |  0.0748 ms |  0.0861 ms |   1.543 ms |   1.427 ms |   1.742 ms |  1.00 |      5 KB |
| CopyToFileAsync |\feature\CoreRun.exe |      1024 |         None |   1.586 ms |  0.0464 ms |  0.0534 ms |   1.576 ms |   1.486 ms |   1.696 ms |  1.02 |      5 KB |
|                 |                     |           |              |            |            |            |            |            |            |       |           |
| CopyToFileAsync |   \base\CoreRun.exe |      1024 | Asynchronous |   1.847 ms |  0.1436 ms |  0.1596 ms |   1.853 ms |   1.597 ms |   2.099 ms |  1.00 |      6 KB |
| CopyToFileAsync |\feature\CoreRun.exe |      1024 | Asynchronous |   1.747 ms |  0.0747 ms |  0.0831 ms |   1.768 ms |   1.626 ms |   1.951 ms |  0.95 |      6 KB |
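
One rough way to read "no regression nor improvement" from these reruns is to check whether the Mean ± Error intervals of base and feature overlap. BenchmarkDotNet documents the Error column as half of the 99.9% confidence interval. The Python sketch below applies that check to the rerun numbers and, for contrast, to the first-run CopyToFileAsync pair; the helper names are made up for this example, and interval overlap is a heuristic, not a formal significance test.

```python
def interval(mean, error):
    """Confidence interval implied by a Mean and an Error (half-width) cell."""
    return (mean - error, mean + error)


def overlaps(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# (mean, error) for base and feature, copied from the rerun tables above.
RERUN = {
    "Read (us)": ((841.1, 7.18), (842.1, 8.90)),
    "Write (us)": ((6825.0, 350.07), (6834.6, 203.73)),
    "CopyToFileAsync None (ms)": ((1.561, 0.0748), (1.586, 0.0464)),
    "CopyToFileAsync Asynchronous (ms)": ((1.847, 0.1436), (1.747, 0.0747)),
}

# First-run CopyToFileAsync (1024, Asynchronous) pair, in microseconds.
FIRST_RUN_COPY = ((1444.81, 48.168), (1838.93, 251.001))


def indistinguishable(pair):
    """Heuristic: base and feature are 'the same' if their intervals overlap."""
    base, feature = pair
    return overlaps(interval(*base), interval(*feature))


if __name__ == "__main__":
    for name, pair in RERUN.items():
        print(name, "overlap" if indistinguishable(pair) else "distinct")
    print("first run CopyToFileAsync",
          "overlap" if indistinguishable(FIRST_RUN_COPY) else "distinct")
```

All four rerun pairs overlap, while the first-run CopyToFileAsync pair does not, which is consistent with that earlier measurement having been an unstable outlier.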

@jozkee jozkee deleted the fs_perf branch March 24, 2021 19:29
@adamsitnik adamsitnik added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Apr 7, 2021
@ghost ghost added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Apr 7, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 7, 2021
jozkee (Member, Author) commented Oct 15, 2021

Breaking change doc created in dotnet/docs#24060.

@jozkee jozkee removed the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Oct 15, 2021
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. tenet-performance Performance related issue
Development

Successfully merging this pull request may close these issues.

Avoid expensive GetFileInformationByHandleEx syscall if possible
2 participants