Performance: Fixes Reduces AuthorizationHelper memory allocations #1712

Merged
merged 11 commits into from
Jul 28, 2020

kirankumarkolli
Member

@kirankumarkolli kirankumarkolli commented Jul 22, 2020

Includes a performance micro benchmark that exercises auth exclusively.
Changes include:

  1. Caching the HMACSHA256
  2. Contract adjusted to avoid leaking bytes out
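
The first change, caching the keyed HMAC, can be illustrated with a language-neutral sketch. The PR itself targets C#, so the Python below is not the SDK's code: it only shows the general idea of keying the HMAC once and reusing the keyed state for each signature. `SignatureGenerator` and its payload format are hypothetical names introduced for illustration.

```python
import base64
import hashlib
import hmac


class SignatureGenerator:
    """Hypothetical helper: keys the HMAC once and reuses it for every signature."""

    def __init__(self, base64_key: str) -> None:
        key = base64.b64decode(base64_key)
        # Keyed once at construction; this object is only ever copied, never updated.
        self._keyed = hmac.new(key, digestmod=hashlib.sha256)

    def sign(self, payload: str) -> str:
        # Copy the already-keyed state instead of re-deriving it from the key.
        mac = self._keyed.copy()
        mac.update(payload.encode("utf-8"))
        return base64.b64encode(mac.digest()).decode("ascii")
```

Copying the keyed state keeps `sign` safe to call repeatedly, because no call mutates the cached object.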

After the changes
RPS: 2-3% UP
Latency: 3-6% UP
GEN0: 30% DOWN
Allocations: 28% DOWN

Method Mean Error StdDev Q3 P80 P85 P90 P95 P100 Op/s Gen 0 Allocated
CreateSignatureGeneration 3.579 us 0.0336 us 0.0280 us 3.603 us 3.605 us 3.612 us 3.612 us 3.614 us 3.618 us 279,387.0 0.0458 896 B
ReadSignatureGeneration 3.542 us 0.0490 us 0.0458 us 3.576 us 3.577 us 3.580 us 3.598 us 3.617 us 3.633 us 282,301.5 0.0458 904 B

Before changes:
Branch without Auth changes & perf benchmark: users/kirankk/auth_baseline
NOTE: Please change the target to netcoreapp3.1

| OS | Allocations %Change | GC %Change | RPS %Change |
|---|---|---|---|
| WINDOWS | ~174B Less | 11% Less GC | 7-9% less |
| LINUX | ~174B Less | 10% Less GC | 13-15% less |

Also, latency is higher with the new optimization.

  • Are the higher latency and lower RPS the result of more CPU work?
  • If so, is it worth trading them for fewer allocations?

Op/s (RPS) : Operations per second
Gen 0 : GC Generation 0 collects per 1000 operations
Gen 1 : GC Generation 1 collects per 1000 operations
Gen 2 : GC Generation 2 collects per 1000 operations
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.19041
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.302
[Host] : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT
DefaultJob : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT

Method Mean Error StdDev Median Q3 P80 P85 P90 P95 P100 Op/s Gen 0 Gen 1 Gen 2 Allocated
CreateSignatureGeneration 4.627 us 0.2003 us 0.5683 us 4.338 us 4.835 us 5.073 us 5.226 us 5.402 us 6.054 us 6.398 us 216,126.1 0.3815 - - 1.57 KB
ReadSignatureGeneration 4.428 us 0.0907 us 0.2276 us 4.361 us 4.516 us 4.559 us 4.652 us 4.811 us 4.956 us 5.058 us 225,846.0 0.3891 - - 1.6 KB
Method Mean Error StdDev Median Q3 P80 P85 P90 P95 P100 Op/s Gen 0 Gen 1 Gen 2 Allocated
CreateSignatureGeneration 4.967 us 0.0997 us 0.2610 us 4.884 us 5.122 us 5.157 us 5.240 us 5.336 us 5.488 us 5.765 us 201,334.0 0.3433 - - 1.41 KB
ReadSignatureGeneration 4.847 us 0.0940 us 0.1045 us 4.851 us 4.927 us 4.930 us 4.940 us 4.962 us 5.004 us 5.036 us 206,299.5 0.3433 - - 1.43 KB

BenchmarkDotNet=v0.12.0, OS=ubuntu 18.04
Intel Xeon Platinum 8171M CPU 2.60GHz, 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.301
[Host] : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT
DefaultJob : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT

Method Mean Error StdDev Q3 P80 P85 P90 P95 P100 Op/s Completed Work Items Lock Contentions Gen 0 Gen 1 Gen 2 Allocated
CreateSignatureGeneration 5.302 us 0.0665 us 0.0555 us 5.344 us 5.345 us 5.353 us 5.371 us 5.396 us 5.425 us 188,590.9 0.0000 - 0.0763 - - 1.48 KB
ReadSignatureGeneration 5.369 us 0.0743 us 0.0695 us 5.412 us 5.420 us 5.445 us 5.462 us 5.479 us 5.499 us 186,251.0 0.0000 - 0.0763 - - 1.48 KB
Method Mean Error StdDev Q3 P80 P85 P90 P95 P100 Op/s Completed Work Items Lock Contentions Gen 0 Gen 1 Gen 2 Allocated
CreateSignatureGeneration 6.078 us 0.0925 us 0.0772 us 6.145 us 6.148 us 6.159 us 6.165 us 6.169 us 6.172 us 164,537.6 0.0000 - 0.0687 - - 1.3 KB
ReadSignatureGeneration 6.317 us 0.0777 us 0.0726 us 6.371 us 6.375 us 6.388 us 6.396 us 6.420 us 6.465 us 158,313.6 0.0000 - 0.0687 - - 1.33 KB
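
The percentage deltas can be re-derived from the raw BenchmarkDotNet rows. The sketch below assumes, without confirmation from the thread, that in each pair of tables the first is the baseline and the second is the optimized build. The numbers are copied from the CreateSignatureGeneration rows above; `pct_change` and `summarize` are hypothetical helpers.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative means a drop)."""
    return (after - before) / before * 100.0


# CreateSignatureGeneration rows copied from the tables above.
# ASSUMPTION: first table of each pair = baseline, second = optimized.
RUNS = {
    "Windows": {
        "before": {"ops": 216_126.1, "gen0": 0.3815, "alloc_kb": 1.57},
        "after": {"ops": 201_334.0, "gen0": 0.3433, "alloc_kb": 1.41},
    },
    "Linux": {
        "before": {"ops": 188_590.9, "gen0": 0.0763, "alloc_kb": 1.48},
        "after": {"ops": 164_537.6, "gen0": 0.0687, "alloc_kb": 1.30},
    },
}


def summarize(runs: dict) -> dict:
    """Percent change per OS and metric, rounded to one decimal place."""
    return {
        os_name: {
            metric: round(pct_change(pair["before"][metric], pair["after"][metric]), 1)
            for metric in pair["before"]
        }
        for os_name, pair in runs.items()
    }


if __name__ == "__main__":
    for os_name, changes in summarize(RUNS).items():
        print(os_name, changes)
```

Under that assumption, allocations fall by roughly 10-12% and Gen 0 collections by about 10%, while Op/s also falls, which is the trade-off the two questions above raise.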

@github-actions github-actions bot left a comment

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors) Description"

Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.

@kirankumarkolli
Member Author

AUTH EXCLUDED

Method Median P90 P95 Gen 0 Gen 1 Gen 2 Allocated
CreateItem 993.2 us 1,301.7 us 1,376.3 us - - - 38.7 KB
DeleteItemExists 800.5 us 1,081.3 us 1,126.7 us - - - 38.71 KB
DeleteItemNotExists 1,422.2 us 2,265.5 us 2,697.1 us - - - 55.7 KB
ReadFeed 1,006.0 us 1,725.8 us 2,062.9 us - - - 52.02 KB
ReadItemExists 1,491.2 us 3,168.5 us 3,419.7 us - - - 43.68 KB
ReadItemNotExists 1,668.4 us 3,304.9 us 3,599.1 us - - - 60.65 KB
UpdateItem 1,473.7 us 2,930.9 us 3,454.5 us - - - 38.93 KB
UpsertItem 1,395.8 us 2,971.6 us 3,205.8 us - - - 38.84 KB

AUTH BEFORE PERF CHANGES

Method Median P90 P95 Gen 0 Gen 1 Gen 2 Allocated
CreateItem 966.3 us 1,559.0 us 1,696.5 us - - - 40.57 KB
DeleteItemExists 828.2 us 1,233.3 us 1,369.3 us - - - 40.67 KB
DeleteItemNotExists 924.1 us 1,434.4 us 1,563.2 us - - - 57.62 KB
ReadFeed 1,013.1 us 1,598.4 us 1,852.3 us - - - 53.64 KB
ReadItemExists 953.9 us 2,114.5 us 2,307.2 us - - - 45.61 KB
ReadItemNotExists 1,015.2 us 1,794.7 us 2,146.4 us - - - 62.56 KB
UpdateItem 1,005.8 us 1,598.0 us 1,846.0 us - - - 40.88 KB
UpsertItem 829.0 us 1,656.9 us 1,905.7 us - - - 40.73 KB

AUTH AFTER PERF CHANGES

Method Median P90 P95 Gen 0 Gen 1 Gen 2 Allocated
CreateItem 879.2 us 1,583.2 us 1,792.8 us - - - 45.02 KB
DeleteItemExists 832.0 us 1,389.4 us 1,689.5 us - - - 44.5 KB
DeleteItemNotExists 1,567.2 us 1,976.8 us 2,191.8 us - - - 61.48 KB
ReadFeed 1,416.6 us 3,028.2 us 3,230.3 us - - - 57.52 KB
ReadItemExists 1,501.0 us 3,445.5 us 3,695.7 us - - - 49.45 KB
ReadItemNotExists 993.8 us 1,595.4 us 1,697.7 us - - - 66.45 KB
UpdateItem 1,164.2 us 3,025.6 us 3,383.5 us - - - 44.69 KB
UpsertItem 1,007.4 us 1,721.6 us 2,084.4 us - - - 44.63 KB

@kirankumarkolli kirankumarkolli changed the title from "[Internal] MicroBenchmark: Include auth into micro benchmark" to "[Internal] MicroBenchmark: Adds Include auth into micro benchmark" Jul 22, 2020
@github-actions github-actions bot dismissed stale reviews from themself July 22, 2020 12:39

All good!

@kirankumarkolli
Member Author

NOTE: Runs are from a dev machine, so latencies might be unreliable. Let's focus on allocations and GCs.

Surprisingly, allocations are higher with the new implementation.

Host context

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.19041
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.302
[Host] : .NET Core 3.0.3 (CoreCLR 4.700.20.6603, CoreFX 4.700.20.6701), X64 RyuJIT

@kirankumarkolli
Member Author

kirankumarkolli commented Jul 22, 2020

LONG Job run details

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.19041
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.302
[Host] : .NET Core 3.0.3 (CoreCLR 4.700.20.6603, CoreFX 4.700.20.6701), X64 RyuJIT
LongRun : .NET Core 3.0.3 (CoreCLR 4.700.20.6603, CoreFX 4.700.20.6701), X64 RyuJIT

Job=LongRun IterationCount=100 LaunchCount=3
WarmupCount=15

AUTH BEFORE PERF CHANGES

Method Median P90 P95 P100 Gen 0 Gen 1 Gen 2 Allocated
CreateItem 839.9 us 1,299.0 us 1,579.2 us 2,028.6 us - - - 40.55 KB
DeleteItemExists 837.2 us 1,579.1 us 1,780.3 us 2,267.2 us - - - 40.63 KB
DeleteItemNotExists 860.9 us 1,248.8 us 1,347.8 us 2,112.5 us - - - 57.58 KB
ReadFeed 898.3 us 1,805.2 us 2,096.3 us 2,539.3 us - - - 53.62 KB
ReadItemExists 924.4 us 1,813.8 us 2,104.7 us 2,803.6 us - - - 45.63 KB
ReadItemNotExists 917.1 us 1,359.9 us 1,470.6 us 1,954.0 us - - - 62.56 KB
UpdateItem 762.6 us 1,183.4 us 1,532.9 us 1,958.7 us - - - 40.87 KB
UpsertItem 806.9 us 1,184.7 us 1,286.7 us 1,813.2 us - - - 40.71 KB

AUTH AFTER PERF CHANGES

Method Median P90 P95 P100 Gen 0 Gen 1 Gen 2 Allocated
CreateItem 1,305.8 us 2,676.5 us 3,137.0 us 4,100.0 us - - - 44.29 KB
DeleteItemExists 1,404.5 us 2,970.9 us 3,461.6 us 4,130.9 us - - - 44.5 KB
DeleteItemNotExists 968.8 us 1,857.0 us 2,157.2 us 2,761.7 us - - - 61.46 KB
ReadFeed 962.0 us 1,838.8 us 2,047.7 us 2,638.8 us - - - 57.48 KB
ReadItemExists 829.8 us 1,862.6 us 2,119.4 us 2,532.4 us - - - 49.45 KB
ReadItemNotExists 914.2 us 1,428.9 us 1,768.4 us 2,422.3 us - - - 66.41 KB
UpdateItem 793.4 us 1,332.0 us 1,565.9 us 2,167.4 us - - - 44.7 KB
UpsertItem 759.5 us 1,243.7 us 1,401.4 us 1,745.7 us - - - 44.55 KB
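
To put a number on the allocation difference in the LongRun tables, the "Allocated" column can be subtracted row by row. This is a small sketch with the values copied from the two tables above. `allocation_deltas` is a hypothetical helper, and nothing here explains why the end-to-end numbers move in the opposite direction from the micro benchmark.

```python
# "Allocated" column (KB per operation), copied from the LongRun tables above.
BEFORE_KB = {
    "CreateItem": 40.55, "DeleteItemExists": 40.63, "DeleteItemNotExists": 57.58,
    "ReadFeed": 53.62, "ReadItemExists": 45.63, "ReadItemNotExists": 62.56,
    "UpdateItem": 40.87, "UpsertItem": 40.71,
}
AFTER_KB = {
    "CreateItem": 44.29, "DeleteItemExists": 44.50, "DeleteItemNotExists": 61.46,
    "ReadFeed": 57.48, "ReadItemExists": 49.45, "ReadItemNotExists": 66.41,
    "UpdateItem": 44.70, "UpsertItem": 44.55,
}


def allocation_deltas(before: dict, after: dict) -> dict:
    """Per-method change in allocated KB (after minus before), rounded to 2 decimals."""
    return {method: round(after[method] - before[method], 2) for method in before}


if __name__ == "__main__":
    for method, delta in allocation_deltas(BEFORE_KB, AFTER_KB).items():
        print(f"{method}: {delta:+.2f} KB")
```

With these inputs, every operation allocates between 3.7 KB and 3.9 KB more after the change.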

@kirankumarkolli
Member Author

@j82w can you please take a look at the PR and check whether the benchmark changes are right?

@visridha any idea why? Also, any thoughts on how to measure the impact of these changes?

@visridha
Contributor

Hey Kiran - I am unable to reproduce the same difference - can we chat and figure out how your results differ from what I'm seeing?

@visridha
Contributor

Latest run:

Method Mean Error StdDev Q3 P80 P85 P90 P95 P100 Op/s Completed Work Items Lock Contentions Gen 0 Gen 1 Gen 2 Allocated
CreateSignatureGeneration 3.219 us 0.0883 us 0.2604 us 3.376 us 3.428 us 3.461 us 3.585 us 3.694 us 3.858 us 310,645.4 0.0000 - 0.0534 - - 896 B
ReadSignatureGeneration 3.192 us 0.0820 us 0.2419 us 3.392 us 3.427 us 3.452 us 3.540 us 3.582 us 3.797 us 313,263.5 0.0000 - 0.0534 - - 896 B

@kirankumarkolli kirankumarkolli force-pushed the users/kirankk/auth_micro_benchmark branch from 8d22dc7 to 28f5cc2 July 28, 2020 12:56
@kirankumarkolli kirankumarkolli changed the title from "[Internal] MicroBenchmark: Adds Include auth into micro benchmark" to "Performance: Fixes Reduces AuthorizationHelper memory allocations" Jul 28, 2020
@kirankumarkolli kirankumarkolli merged commit f01974e into master Jul 28, 2020
@kirankumarkolli kirankumarkolli deleted the users/kirankk/auth_micro_benchmark branch July 28, 2020 15:33
@ghost

ghost commented Dec 15, 2021

Closing due to inactivity, please feel free to re-open.
