Allow change minTopNThreshold per topN query #2221
@binlijin can we add a test?
@fjy, OK. I also need to update the docs.
@binlijin BTW, for your performance issues with topNs over theta sketches, I think you can look into increasing the processing buffer size to avoid doing multiple passes
and also using smaller buckets if possible
@fjy, our configuration is taken almost entirely from http://druid.io/docs/latest/configuration/production-cluster.html, and we set druid.processing.buffer.sizeBytes=1073741824.
@binlijin you may want to increase the processing buffer size to 2G, because if your theta sketches are very large, many passes may be needed
Yes, increasing the processing buffer size will help somewhat.
@binlijin any chance we can add some docs, tests, and finish this one up?
@fjy, done
👍
👍, please squash the commits.
@nishantmonu51 I think the commits are pretty distinct and don't need to be squashed
@fjy: I don't think so. Test and doc changes are NOT distinct from the implementation at all and should go in a single commit whenever possible; this makes it easier to backport or revert related changes when needed. If someone needs to revert a feature, the docs for that feature should always be reverted along with it, and having them in a single commit makes that easier. Having separate commits makes it easier to revert one and not the other. Any specific reasons you think code changes and the related test/doc changes need to go in separate commits?
Currently druid.query.topN.minTopNThreshold is configured per server, and the servers must be restarted for a change to take effect. This patch allows the setting to be changed per query, so no server restart is needed.
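A rough sketch of what a per-query override might look like from the client side, written in Python purely for illustration. It assumes the override is passed in the query context under the key minTopNThreshold (inferred from the PR title; the patch itself defines the actual key). The data source, dimension, metric, and interval values are hypothetical placeholders.

```python
import json


def build_topn_query(min_topn_threshold=None):
    """Build a Druid topN query as a plain dict.

    The data source, dimension, metric, and interval are hypothetical.
    The context key "minTopNThreshold" is an assumption based on the PR title.
    """
    query = {
        "queryType": "topN",
        "dataSource": "example_datasource",  # hypothetical
        "dimension": "example_dimension",    # hypothetical
        "metric": "count",
        "threshold": 10,
        "granularity": "all",
        "aggregations": [{"type": "count", "name": "count"}],
        "intervals": ["2016-01-01/2016-01-02"],  # hypothetical
    }
    # Only attach a context when an override is requested, so the
    # server-wide druid.query.topN.minTopNThreshold applies otherwise.
    if min_topn_threshold is not None:
        query["context"] = {"minTopNThreshold": min_topn_threshold}
    return query


if __name__ == "__main__":
    # Serialize the query body that would be sent to the broker.
    print(json.dumps(build_topn_query(5000), indent=2))
```

Leaving the context out keeps the server-level setting in effect, so only the queries that need a different threshold have to carry the override.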