add sorted data to Histogram #197

Merged
merged 1 commit into from
Mar 23, 2023


FengLiMS
Contributor

@FengLiMS FengLiMS commented Mar 20, 2023

Add sorted data to Histogram to reduce the time spent creating the latency histogram.

sample:

C:\test\diskspd.exe -L -b4k -o1024 -t4 -d1800 -Suw #1  #2  #3  #4  #5  #6  #7  #8  #9  #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22

diskspd 2.1: 3493 seconds (including 1800 sec test time) for a total of 432982069 reads, using max 22 GB RAM

With this update:
diskspd new: 2432 seconds (including 1800 sec test time) for a total of 465295722 reads, using max 12 GB RAM

@FengLiMS FengLiMS marked this pull request as ready for review March 20, 2023 22:55
@FengLiMS FengLiMS changed the title add sorted data to Historgram add sorted data to Histogram Mar 21, 2023
@dl2n
Member

dl2n commented Mar 23, 2023

Thanks Feng. I'll iterate a bit more on this along the lines we spoke about. I think we can simplify a bit by assuming that once we begin asking questions of the data (mean, percentiles, etc.) we should always sort-and-seal, breaking the seal if more data is added (add/merge). I may also look at a single-sweep percentile gatherer in hopes of removing even more time.
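
For readers following along, the sort-and-seal idea could look roughly like the sketch below. This is only an illustration under stated assumptions, not the DISKSPD `Histogram` class: the name `SealedSamples`, its methods, the flat `std::vector` storage, and the nearest-rank percentile rule are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of "sort-and-seal"; not the DISKSPD Histogram.
class SealedSamples {
public:
    // Adding data breaks the seal, so the next query re-sorts once.
    void Add(std::uint64_t value) {
        _samples.push_back(value);
        _sealed = false;
    }

    // Merging also breaks the seal. The copy keeps a self-merge safe.
    void Merge(const SealedSamples& other) {
        const std::vector<std::uint64_t> incoming = other._samples;
        _samples.insert(_samples.end(), incoming.begin(), incoming.end());
        _sealed = false;
    }

    // Every query seals first; repeated queries reuse the sorted data.
    std::uint64_t GetMin() { Seal(); return _samples.front(); }
    std::uint64_t GetMax() { Seal(); return _samples.back(); }

    // Nearest-rank percentile, p in [0, 100] (an assumed definition).
    std::uint64_t GetPercentile(double p) {
        assert(p >= 0.0 && p <= 100.0);
        Seal();
        const double n = static_cast<double>(_samples.size());
        std::size_t rank = static_cast<std::size_t>(std::ceil(p * n / 100.0));
        if (rank == 0) rank = 1;                          // p == 0 maps to the minimum
        if (rank > _samples.size()) rank = _samples.size();
        return _samples[rank - 1];
    }

    // Number of sorts performed so far, exposed only to show the saving.
    std::size_t SortCount() const { return _sortCount; }

private:
    void Seal() {
        assert(!_samples.empty());                        // queries need at least one sample
        if (!_sealed) {
            std::sort(_samples.begin(), _samples.end());
            _sealed = true;
            ++_sortCount;
        }
    }

    std::vector<std::uint64_t> _samples;
    bool _sealed = false;
    std::size_t _sortCount = 0;
};
```

With this shape, any number of min/max/percentile queries after the last insert cost a single sort, and a later `Add` or `Merge` simply defers the next sort until the following query.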

@dl2n dl2n merged commit 3862695 into microsoft:dev Mar 23, 2023
dl2n added a commit that referenced this pull request Jun 13, 2024
* add sorted data to Historgram (#197)

Co-authored-by: Feng Li <fengli@microsoft.com>

* multiple updates from msft internal
2.1.2 prerelease

flush stdout to force XML/text results through prior to process exit
fix issues identified by vs2019 build loop - 64->32 downcast, minor printf issues, _snprintf -> _snprintf_s
timestamp prefixes for all verbose output
dump specific time intervals for warmup/measured/cooldown to validate v. expected @ actual time of use/measurement
only emit warmup/cooldown verbose output notes if there is nonzero warmup/cooldown specified in the profile
minor cleanup of load thread verbose output so that everything has a consistent thread N: prefix
histogram processing speedups, about 60% for XML (all percentiles)
implicitly seal/sort histogram on first read operation, reset on subsequent inserts; save repeated re-sorting
save histogram percentile iterator so that ascending percentile queries do not restart from lowest sample
min/max in terms of begin/rbegin iterator
show latency histogram bucket counts in XML results; potential work to moderate histogram size
unit test coverage
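
The saved percentile iterator mentioned in the notes above can be pictured with a small sketch. It assumes the histogram is available as (value, count) buckets already sorted by value; the `PercentileCursor` name, its members, and the nearest-rank rule are hypothetical and are not taken from the DISKSPD sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration of a saved percentile position; not DISKSPD code.
class PercentileCursor {
public:
    using Bucket = std::pair<std::uint64_t, std::uint64_t>;   // (value, count)

    // Assumes buckets are sorted ascending by value with nonzero counts.
    explicit PercentileCursor(std::vector<Bucket> buckets)
        : _buckets(std::move(buckets)) {
        for (const Bucket& b : _buckets) _total += b.second;
    }

    // Nearest-rank percentile, p in [0, 100]. Ascending queries continue
    // from the saved position; a lower query restarts from the first bucket.
    std::uint64_t Get(double p) {
        assert(!_buckets.empty());
        assert(p >= 0.0 && p <= 100.0);
        std::uint64_t target = static_cast<std::uint64_t>(
            std::ceil(p * static_cast<double>(_total) / 100.0));
        if (target == 0) target = 1;
        if (target > _total) target = _total;

        if (target < _lastTarget) {       // descending query: start over
            _index = 0;
            _seen = 0;
        }
        _lastTarget = target;

        // _seen counts samples in buckets strictly before _index.
        while (_seen + _buckets[_index].second < target) {
            _seen += _buckets[_index].second;
            ++_index;
            ++_steps;
        }
        return _buckets[_index].first;
    }

    // Total bucket advances so far, exposed only to show the saving.
    std::size_t Steps() const { return _steps; }

private:
    std::vector<Bucket> _buckets;
    std::uint64_t _total = 0;
    std::uint64_t _seen = 0;
    std::uint64_t _lastTarget = 0;
    std::size_t _index = 0;
    std::size_t _steps = 0;
};
```

When a report asks for a long ascending list of percentiles, the cursor walks the buckets at most once in total rather than once per percentile.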

* DISKSPD 2.2

---------

Co-authored-by: FengLiMS <128095530+FengLiMS@users.noreply.github.com>
Co-authored-by: Feng Li <fengli@microsoft.com>
Co-authored-by: Dan Lovinger <danlo@ntdev.microsoft.com>
2 participants