Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Time taken for "top" listing for a large (34MB) profile drops by 15%:

```
name    old time/op  new time/op  delta
Top-12  13.2s ± 3%   11.2s ± 2%   -14.72%  (p=0.008 n=5+5)
```

Furthermore, the time taken to merge/diff 34MB profiles drops by 53%:

```
Merge/2-12  7.74s ± 2%  3.63s ± 2%  -53.09%  (p=0.008 n=5+5)
```

Details follow:

The cost of a trivial merge was very high (4s for 34MB profile). We now just skip such a merge and save the 4s.

* Only create a Sample the first time a sample key is seen.
* Faster ID to *Location mapping by creating a dense array that handles small IDs (this is almost always true).
* Faster sampleKey generation during merging by emitting binary encoding of numbers and using a strings.Builder instead of repeated fmt.Sprintf.

The preceding changes drop the cost of merging two copies of the same 34MB profile by 53%:

```
name        old time/op  new time/op  delta
Merge/2-12  7.74s ± 2%   3.63s ± 2%   -53.09%  (p=0.008 n=5+5)
```

* Use temporary storage when decoding to reduce allocations.
* Pre-allocate space for all locations in one shot when creating a Profile.

The preceding speed up decoding by 13% and encoding by 7%:

```
name      old time/op  new time/op  delta
Parse-12  2.00s ± 4%   1.74s ± 3%   -12.99%  (p=0.008 n=5+5)
Write-12  679ms ± 2%   629ms ± 1%   -7.44%   (p=0.008 n=5+5)
```

When used in interactive mode, each command needs to make a fresh copy of the profile since a command may mutate the profile. This used to be done by serializing/compressing/decompressing/deserializing the profile per command. We now store the original data in serialized uncompressed form so that we just need to deserialize the profile per command. This change can be seen in the improvement in the time needed to generate the "top" output:

```
name    old time/op  new time/op  delta
Top-12  13.2s ± 3%   12.4s ± 0%   -5.84%  (p=0.008 n=5+5)
```

* Avoid filtering cost when there are no filters to apply.
* Avoid location munging when there are no tag roots or leaves to add.
* Faster stack entry pruning by caching the result of demangling and regexp matching for a given function name.

```
name    old time/op  new time/op  delta
Top-12  13.2s ± 3%   12.3s ± 2%   -6.33%  (p=0.008 n=5+5)
```

* Added benchmarks for profile parsing, serializing, merging.
* Added benchmarks for a few web interface endpoints.
* Added a large profile (1.2MB) to proftest/testdata. This profile is from a synthetic program that contains ~23K functions that are exercised by a combination of stack traces so that we end up with a larger profile than typical. Note that the benchmarks above are from an even larger profile (34MB) from a real system, but that profile is too big to be added to the repository.
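The dense-array ID to *Location mapping mentioned in the list above could look roughly like the following. This is a sketch under assumptions: the `Location` type is a stand-in, and `locationIDMap` with its `get` and `set` methods are illustrative names, not confirmed to match the code in this commit.

```go
package main

import "fmt"

// Location is a stand-in for the profile's location type.
type Location struct{ ID uint64 }

// locationIDMap (hypothetical) serves small IDs from a slice and falls back
// to a map for the rare large ones.
type locationIDMap struct {
	dense  []*Location
	sparse map[uint64]*Location
}

// makeLocationIDMap sizes the dense part for n expected locations.
func makeLocationIDMap(n int) locationIDMap {
	return locationIDMap{
		dense:  make([]*Location, n),
		sparse: map[uint64]*Location{},
	}
}

// get returns the location for id, or nil if none was stored.
func (m locationIDMap) get(id uint64) *Location {
	if id < uint64(len(m.dense)) {
		return m.dense[id] // fast path: a bounds check and a slice index
	}
	return m.sparse[id]
}

// set stores loc under id in whichever part covers that ID.
func (m locationIDMap) set(id uint64, loc *Location) {
	if id < uint64(len(m.dense)) {
		m.dense[id] = loc
		return
	}
	m.sparse[id] = loc
}

func main() {
	m := makeLocationIDMap(4)
	m.set(1, &Location{ID: 1})
	m.set(1000, &Location{ID: 1000})
	fmt.Println(m.get(1).ID, m.get(1000).ID, m.get(2) == nil)
}
```

Because IDs in a profile are usually small and close together, most lookups take the slice path and avoid hashing entirely.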
Showing 17 changed files with 447 additions and 95 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,15 @@ |
| | | # Description: |
| | | # Auto-imported from github.com/google/pprof/internal/proftest |
| | | |
| | | licenses(["notice"]) |
| | | |
| | | package( |
| | | default_applicable_licenses = ["//third_party/golang/pprof:license"], |
| | | default_visibility = ["//third_party/golang/pprof/internal:friends"], |
| | | ) |
| | | |
| | | go_library( |
| | | name = "proftest", |
| | | srcs = ["proftest.go"], |
| | | embedsrcs = ["testdata/large.cpu"], |
| | | ) |
Binary file not shown.