SOLR-17477: Support custom aggregates in plugin code #2742
I like it. Thanks @timatbw. I'm able to merge if there's an approval from a second 👀.
solr/core/src/java/org/apache/solr/search/facet/DocValuesAcc.java
solr/core/src/java/org/apache/solr/search/facet/FacetModule.java
REGISTERED_TYPES.put("func", (p, k, a) -> p.parseStat(k, a));
}

public static void registerParseHandler(String type, ParseHandler parseHandler) {
Thanks for the https://issues.apache.org/jira/browse/SOLR-17477 comment w.r.t. how this is called!
Not sure where to discuss it, but registering anything in a static initializer is asking for trouble, imho. I'd rather hook up a component via solrconfig.xml (or a plugin??) which just calls registerParseHandler() for custom handlers.
Anyway, it's a matter of taste and shouldn't block this PR, unless someone proposes making FP.registerParseHandler protected. (I tried, it was such a pity, don't even show it to anyone.)
Yeah, it's perhaps not the best way of hooking things up, especially as the whole process is already kicked off from solrconfig.xml, where you declare a valueSourceParser. I was relying on the point at which that custom parser object is instantiated to run a static initialiser that also registers it for JSON parsing. But maybe it's safer to invert that and have the Solr code register the custom parser if it implements JSON parsing. I'll have another look and think about the alternatives.
My main reason for suggesting that custom code use a static block to register is the case where you have many hundreds of SolrCores in your CoreContainer: each one would be instantiated and call the register method, when really you might only need it done once. But thinking about it now, if you run a mixed workload with different solrconfig files that aren't homogeneous, you probably don't want static registration (each SolrCore is distinct).
Although the 'standard set' of registered types in the static block above probably is appropriate as those are built-in and common to all SolrCores.
REGISTERED_TYPES is shared across cores. This means:
- one core might inject a custom parser, and it then appears across all cores. If it hooks up some large state, that state remains on the heap after the core is unloaded (it might be called a leak)
- registerPH might be called with one of the standard names, bringing some surprises to other cores.
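To make the two concerns above concrete, here is a minimal, self-contained sketch. It does not use Solr's real classes: `RegistryScopeSketch`, `SharedRegistry`, `PerCoreRegistry` and the two-argument `ParseHandler` are hypothetical stand-ins chosen for illustration, and the duplicate-name check is one possible policy, not something this PR implements.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RegistryScopeSketch {

  /** Hypothetical, simplified handler; the interface in the PR also takes a parent parser. */
  interface ParseHandler {
    Object parse(String key, Object args);
  }

  /** JVM-wide registry: a single static map that every core sees. */
  static final class SharedRegistry {
    static final Map<String, ParseHandler> REGISTERED_TYPES = new ConcurrentHashMap<>();

    static void register(String type, ParseHandler handler) {
      // put() silently replaces an existing entry, including a built-in one
      REGISTERED_TYPES.put(type, handler);
    }
  }

  /** Per-core registry: built-ins are copied in, custom entries stay local to the instance. */
  static final class PerCoreRegistry {
    private final Map<String, ParseHandler> types = new ConcurrentHashMap<>();

    PerCoreRegistry(Map<String, ParseHandler> builtIns) {
      types.putAll(builtIns);
    }

    void register(String type, ParseHandler handler) {
      // Assumed policy for this sketch: refuse to shadow an existing name
      if (types.putIfAbsent(type, handler) != null) {
        throw new IllegalArgumentException("Type already registered: " + type);
      }
    }

    ParseHandler lookup(String type) {
      return types.get(type);
    }
  }

  public static void main(String[] argv) {
    Map<String, ParseHandler> builtIns = new HashMap<>();
    builtIns.put("func", (k, a) -> "builtin:" + k);

    PerCoreRegistry coreA = new PerCoreRegistry(builtIns);
    PerCoreRegistry coreB = new PerCoreRegistry(builtIns);
    coreA.register("myagg", (k, a) -> "custom:" + k);

    // The custom type is visible only in the core that registered it
    System.out.println("coreA has myagg: " + (coreA.lookup("myagg") != null));
    System.out.println("coreB has myagg: " + (coreB.lookup("myagg") != null));
  }
}
```

With the per-core variant, dropping a core's registry instance also releases whatever state its handlers hold, which is the leak concern raised above.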
Yes all good points, I'm going to change this to remove the statics and make it per-core which probably fits better with how it's hooked in too
please proceed with access modifiers
@@ -138,21 +139,30 @@ public Object parseFacetOrStat(String key, Object o) throws SyntaxError {
return parseFacetOrStat(key, type, args);
}

public interface ParseHandler {
Object doParse(FacetParser<?> parent, String key, Object args) throws SyntaxError;
btw, is there another example of a naming convention like doSometh? Why not just parse?
No particular strong feelings on this. I tend to use doSomething when it's an inner helper method called from an outer/wrapper method that is the something(..), but in this case it's not really called directly. In practice I think people would use a lambda anyway and never actually write this method name. But happy to change it to parse!
@@ -138,21 +139,30 @@ public Object parseFacetOrStat(String key, Object o) throws SyntaxError {
return parseFacetOrStat(key, type, args);
}

public interface ParseHandler {
Object doParse(FacetParser<?> parent, String key, Object args) throws SyntaxError;
Here the return type is a little bit tricky: FacetRequest|AggValueSource. I suppose it deserves to be documented via Javadoc, unless we can declare it explicitly.
This is already how it is: the calling code allows the parser to return either a FacetRequest or an AggValueSource, hence the Object type was needed here too. It's not ideal, and perhaps in practice most people will want to produce an AggValueSource only. Creating new subclasses of FacetRequest outside the package is harder, and I've not yet looked at how feasible it is to do that from a plugin (it's not the focus of this issue). But it would be interesting if eventually you could define custom types of faceting too.
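As a rough illustration of why the Object return type needs documenting, the sketch below shows a caller that has to branch on the runtime type of whatever the handler returns. The nested `FacetRequest` and `AggValueSource` classes here are empty hypothetical stand-ins, not Solr's real classes, and `describe` is an invented helper; the handler is written as a lambda, as suggested earlier in this thread.

```java
public class ParseResultSketch {

  /** Hypothetical stand-in for Solr's FacetRequest. */
  static class FacetRequest { }

  /** Hypothetical stand-in for Solr's AggValueSource. */
  static class AggValueSource {
    final String name;

    AggValueSource(String name) {
      this.name = name;
    }
  }

  @FunctionalInterface
  interface ParseHandler {
    /**
     * Returns either a FacetRequest or an AggValueSource.
     * The compiler cannot enforce this, so the contract lives in this Javadoc.
     */
    Object parse(String key, Object args);
  }

  /** The caller must inspect the runtime type of the result. */
  static String describe(Object parsed) {
    if (parsed instanceof FacetRequest) {
      return "facet";
    }
    if (parsed instanceof AggValueSource) {
      return "stat:" + ((AggValueSource) parsed).name;
    }
    throw new IllegalArgumentException("Unexpected parse result: " + parsed);
  }

  public static void main(String[] argv) {
    // Typical plugin usage: a lambda that produces an aggregate
    ParseHandler handler = (k, a) -> new AggValueSource(k);
    System.out.println(describe(handler.parse("myagg", null)));
  }
}
```

Declaring the return type explicitly would need a common supertype or a small wrapper, which is a larger change than this PR aims for.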
Oh gosh, I figured out how to update the PR as a maintainer. The problem was that I did it via HTTPS auth from IDEA, but it seems it requested a token only for my account and can't push into someone else's even if
I've just realised the registration part of this work is not necessary! All that's really needed is the public modifier changes, so I'm going to simplify the PR down to just that bit. The registration isn't needed because you can already achieve the desired results using the existing mechanism
then all those key-value params are available inside your implementation of
I'm going to (force) push my local branch to this PR again, which will overwrite/replace the commits you've added on top, @mkhludnev, just so you know. I think the previous approach we were working on isn't necessary, so I'm simplifying this PR a lot.
0e2d148 to 0502e60
I made a simple experiment.
Good points. Yes, I did notice that the parent parser would not be available via the simple localParams approach, but I didn't realise that SolrParams only supports String values, not arbitrary nested objects (that's a shame). For my use cases I'm OK with just the String-typed map and have no need for the parent parser, but I guess a more general solution could support those as well. The solution I had been working on before I found
I'm not sure it would work, since ValueSourceParser is invoked by FuncQParser through a fixed interface.
Here's my idea:
Then, in another PR we may choose:
@timatbw, wdyt?
Yes, that sounds good, as I feel we might be trying to expand the scope of this change too much for one PR. I agree: making the public modifiers change is needed, then we can document how to do a typical simple use case of a custom aggregate in plugin code using only String values from JSON. That's enough for this PR. I'll work on that next, although I don't know much about the RefGuide as I've not written for it before; I'll take a look. And yes, the next step, for another PR, would be to add another method to ValueSourceParser and call it from FacetParser, so we fully support parent facets and non-String complex JSON config. I can produce a second PR with my proposed diff for that change. Further steps are your Plan B, with complete ability to override or replace the FacetModule mechanism and define new kinds of faceting totally different from the Facet Field/Range/Query etc.
Wait a sec. I read here https://solr.apache.org/guide/solr/latest/query-guide/json-facet-api.html#stat-facet-functions
Can the proposed approach with
Correct, the work in this Jira+PR is only about custom metrics, i.e. calculating a value for each bucket. It's not about creating custom faceting types for making buckets in the first place. That would be interesting too, but it is quite a different piece of work because the FacetModule is not currently set up to do that (a lot more classes would need to be open for extension).
Just had some fun with it in #2865. I don't recommend anyone look at it; it's so scary.
Made a short clue: mkhludnev@c1c1a3e
Does it mean
The PR now only changes that class to make it public, but that was probably only needed in the original changes around registering parsers, so yes, it may not need to be changed at all now.
Link to the relevant discussion: https://lists.apache.org/thread/bk15vv82xtgjslvpp5ff3nctop7qcy6j
https://issues.apache.org/jira/browse/SOLR-17477
Description
Better support for writing custom aggregates in 3rd party plugin code, outside of Solr.
Solution
Opened up visibility of certain classes that are necessary to write custom aggregates, and allow them to register by type so that they are supported just the same as field/query/range/heatmap etc
Tests
Adjusted existing test that confirms a custom aggregate can be written outside the solr facet package