Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow change minTopNThreshold per topN query #2221

Merged
merged 3 commits into from
Jan 13, 2016
Merged

Allow change minTopNThreshold per topN query #2221

merged 3 commits into from
Jan 13, 2016

Conversation

binlijin
Copy link
Contributor

@binlijin binlijin commented Jan 7, 2016

Current druid.query.topN.minTopNThreshold is configured per server, and need to restart servers to take effect when changed, this patch is to make this configuration can change per query and do not need to restart servers to take effect.

@fjy
Copy link
Contributor

fjy commented Jan 7, 2016

@binlijin can we add a test?

@binlijin
Copy link
Contributor Author

binlijin commented Jan 7, 2016

@fjy, ok, also need to update the doc.

@fjy
Copy link
Contributor

fjy commented Jan 7, 2016

@binlijin BTW, for your performance issues with topNs over thetasketches, I think you can look into increasing the processing buffersize to avoid doing multiple passes

@fjy
Copy link
Contributor

fjy commented Jan 7, 2016

and also smaller buckets if possible

@binlijin
Copy link
Contributor Author

binlijin commented Jan 7, 2016

@fjy , we use configuration almost get from http://druid.io/docs/latest/configuration/production-cluster.html, and set druid.processing.buffer.sizeBytes=1073741824.
Smaller buckets is a possible way, also we find a minor optimization to topN, will later post the PR.

@fjy
Copy link
Contributor

fjy commented Jan 7, 2016

@binlijin you may want to increasing the processing size to 2G, because if your theta sketches are very large, perhaps many passes are being done

@binlijin
Copy link
Contributor Author

binlijin commented Jan 7, 2016

Yes, increasing the processing size will do some help.

@fjy
Copy link
Contributor

fjy commented Jan 11, 2016

@binlijin any chance we can add some docs, tests, and finish this one up?

@binlijin
Copy link
Contributor Author

@fjy,done

@fjy
Copy link
Contributor

fjy commented Jan 12, 2016

👍

@nishantmonu51
Copy link
Member

👍, please squash the commits.

@fjy
Copy link
Contributor

fjy commented Jan 13, 2016

@nishantmonu51 I think the commits are pretty distinct and don't need to be squashed

fjy added a commit that referenced this pull request Jan 13, 2016
Allow change minTopNThreshold per topN query
@fjy fjy merged commit d7ad93d into apache:master Jan 13, 2016
@nishantmonu51
Copy link
Member

@fjy: I don't think so, test and doc changes are NOT distinct from the implementation at all and should go in a single commit whenever possible, this makes it easier to backport/revert related changes whenever needed. In case someone needs to revert a feature, docs for that feature should always be reverted along with them and having them in a single commit makes it easier. having separate commits makes it easier to revert one and not the other.

any specific reasons you think code changes and related tests/doc changes need to go as separate commits ?

@binlijin binlijin deleted the topN_minTopNThreshold branch January 21, 2016 01:13
@fjy fjy added this to the 0.9.0 milestone Feb 4, 2016
@fjy fjy mentioned this pull request Feb 5, 2016
@fjy fjy added the Feature label Feb 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants