-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing Coordinated omission + replacing histogram with HdrHistogram #214
Conversation
throttle method in client.
Add option to randomize delay(default=true, as before)
Remove needless synchronization by using CHM in Measurements.
Add option for controlling the feature(default on).
Add a joined histogram measuremement to compare old/new histogram outputs.
- Output old histogram or HdrHistogram or both - Potentially output HdrHistogram to log file on status interval.
- Make status interval configurable
This is a great addition to YCSB. The usefulness of the current Histogram implementation is dubious. When I was evaluating performance values, I would only rarely look at the histogram results. This would require some adjustments and re-runs in order to capture the correct "window" of data. This PR sounds like it would solve that problem and help provide more insight into the performance results. Thanks! |
Glad you approve, not sure what you mean by "window" of data? is that the bucket sizing for original histogram? |
The bucket sizing, exactly. We would often find that the histogram data would not be properly represented because we had a bulk of data ending up in the last "overflow" bucket. This would change based on our test parameters, specifically when we changed threadcount. We would then try re-runs with the number of buckets large enough to fit the current run and then wondered if we were adding undue complexity to the client, thus skewing the results. Needless to say, we just did this a few times, enough to confirm expected results, and then gave up on it. |
HdrHistogram completely solves that problem, as well as a few others :)
The bucket sizing, exactly. We would often find that the histogram data would not be properly represented because we had a bulk of data ending up in the last "overflow" bucket. This would change based on our test parameters, specifically when we changed threadcount. We would then try re-runs with the number of buckets large enough to fit the current run and then wondered if we were adding undue complexity to the client, thus skewing the results. Needless to say, we just did this a few times, enough to confirm expected results, and then gave up on it.— |
Some feedback:
Am I reading this correctly that this defaults to the same latency measuring behavior as before? |
It's been so long I can't remember why I added the dependency packaging... I'll look into it and you can exclude it from the merge if you don't think it's important. If you think it's a good idea maybe open a separate issue for it and make this PR fix it. It does fix more than just one isolated thing and I realize not everyone likes to work that way. |
OK, having looked at the changes I think my change was the simplest way to deliver the dependency to the release package. The assembly configuration picks up the jars from the target folders of all modules and does not bring in modules dependencies. Removing the all in one change will break the HdrHistogram dependency for the end package. |
Please note that given the slow response to PRs on this repository I have joined efforts with DB vendors and am trying to encourage collaboration on a fork: |
The problem with grabbing top level dependencies is that not all of our modules use the same versions of things. Instead, we should update the distribution module to pull in the core dependencies and add them to the classpath. |
Are you sure this is something that needs to be done as part of this PR? I'm not sure this is the place to enforce new policy for new problems. Alternatively you can manually merge and fix the issue as you see fit? I'm saying this not to spite, but as a maintainer of an open-source library. |
Is there any particular action pending on this merge? I see the dependency issue as a continuation of #250 but if that is not the case I'm not sure what you need from me to resolve this PR |
Here are the outstanding review feedback pieces, the first two are blockers:
I've been trying to find someone to pick things up, since your previous comment indicated that you didn't have time to take care of the needed changes. If you do now, that'd be great. Eventually I'll end up taking care of things myself, but my availability to handle PRs that need work is limited. |
|
That plan sounds great. I'm hesitant to make the latency change the default until we've had some time to have folks knock it around (so maybe we can plan for making it the default in 0.3.0 or 0.4.0). The rest sounds great. Let me know how things are going. I'm budgeting some dedicated time this weekend and need to decide if I can spend it on #250. |
Did not end up having time for it. Life got in the way... |
The 0.2.0 RC will branch on June 15th. (The next release will branch on July 15th).
You can ignore this one and forget about the dependency thing. I got far enough on #250 to see that it's probably only luck that made the jackson dependency work before. |
Adding the changes notes here:
To provide a minimal correction patch the following were implemented:
This branch supports the following new options:
Example parameters: Further changes made:
Further suggestions:
|
[core] use HdrHistogram and fix coordinated omission also closes #10
I merged this manually in 4d68068. not sure why it didn't show up. Filed a follow on to clean up the CHANGES.md stuff. Thanks for working on this, it's a great addition. |
AWESOME!!!! |
👍 - thanks! |
[core] use HdrHistogram and fix coordinated omission also closes brianfrankcooper#10
[core] use HdrHistogram and fix coordinated omission also closes brianfrankcooper#10
Hi,
I'm submitting this PR as a halfway point to an agreed upon PR. There are a few changes, some are cosmetic (which I should have perhaps avoided in the name of having a cleaner PR for you to look at), but the 2 items listed in the title are of importance:
Your feedback is most welcome.
Thanks,
Nitsan
PS: I would like to cover this PR in a blog post in the near future and would value your comments on the draft as well
[1] https://groups.google.com/forum/#!msg/mechanical-sympathy/icNZJejUHfE/BfDekfBEs_sJ
[2] http://www.infoq.com/presentations/latency-pitfalls
[3] https://www.youtube.com/watch?v=9MKY4KypBzg
[4] https://github.com/HdrHistogram/HdrHistogram
[5] http://psy-lob-saw.blogspot.com/2015/02/hdrhistogram-better-latency-capture.html