Handle NPE due to a weird configuration #48007

Closed
@@ -727,6 +727,10 @@ private PriorityQueue<ScoreTerm> createQueue(Map<String, Int> words, String... f
int numDocs = ir.numDocs();
final int limit = Math.min(maxQueryTerms, words.size());
FreqQ queue = new FreqQ(limit); // will order words by score

if (limit == 0) {
return queue;
}
Member

Rather than checking for a particular value this late, I'd rather try improving the input validation for the max_query_terms parameter, both in the MoreLikeThisQueryBuilder setter and also maybe in MoreLikeThisQuery. I think we should reject negative or zero values here, since at least to my understanding it indicates a wrong usage of the API. Or are there cases you know where this parameter setting would make sense?

Contributor Author

I agree that setting max_query_terms = 0 makes no sense; maybe the API should validate this parameter instead.

Member

Are you interested in changing this PR to add validation to the MoreLikeThisQueryBuilder and MoreLikeThisQuery setters for the max query terms? I think we should reject all values <= 0 there already with an IllegalArgumentException and a fitting message. These checks should also be covered in the corresponding unit tests. If you don't want to do this, just let me know and I'll open an issue so we can fix it later.
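
For illustration, the setter validation suggested here could look roughly like the sketch below. It is not the actual MoreLikeThisQueryBuilder code: the class `SketchQueryBuilder`, the default of 25, and the error message wording are all assumptions made for this example. It only shows the idea of rejecting values <= 0 at the setter, so an invalid value never reaches createQueue.

```java
public class MaxQueryTermsValidationSketch {

    /** Hypothetical stand-in for a query builder; not the real MoreLikeThisQueryBuilder. */
    static final class SketchQueryBuilder {
        // Assumed default, used only for this illustration.
        static final int DEFAULT_MAX_QUERY_TERMS = 25;

        private int maxQueryTerms = DEFAULT_MAX_QUERY_TERMS;

        /** Sets the maximum number of query terms, rejecting zero and negative values. */
        SketchQueryBuilder maxQueryTerms(int maxQueryTerms) {
            if (maxQueryTerms <= 0) {
                // Fail fast with a descriptive message instead of failing later during scoring.
                throw new IllegalArgumentException(
                    "requires 'max_query_terms' to be greater than 0, got [" + maxQueryTerms + "]");
            }
            this.maxQueryTerms = maxQueryTerms;
            return this;
        }

        int maxQueryTerms() {
            return maxQueryTerms;
        }
    }

    public static void main(String[] args) {
        // A valid value is accepted and stored.
        SketchQueryBuilder builder = new SketchQueryBuilder().maxQueryTerms(10);
        System.out.println("accepted: " + builder.maxQueryTerms());

        // Zero and negative values are rejected before they can be used.
        for (int bad : new int[] {0, -1}) {
            try {
                builder.maxQueryTerms(bad);
                throw new AssertionError("expected rejection of " + bad);
            } catch (IllegalArgumentException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

A unit test for the real setters would follow the same shape as the loop in `main`: pass 0 and a negative value, and assert that an IllegalArgumentException is thrown and the previously set value is unchanged.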

Contributor Author

Sure! Thank you.


for (String word : words.keySet()) { // for every word
int tf = words.get(word).x; // term freq in the source doc