Case sensitive #3
Comments
In German, a few words have a different meaning depending on whether they are written in upper- or lowercase. Example: Rasen = grass
Yeah, that's true :S. How hard would it be to adapt the files for German and build the plugin ourselves?
Just fork it and feel free to modify https://github.com/jprante/elasticsearch-analysis-baseform/tree/master/src/main/resources to your requirements ;-) N.B. for lowercasing (with some ambiguities), you could simply combine this baseform analyzer with a lowercase filter.
Actually, our problem is that users might enter the search string in all lowercase, and the plugin then cannot convert it into its base form. The second problem is that we use this plugin in combination with the decompound plugin, which returns its tokens in lowercase, and there are cases where, for some reason, it does not return the tokens in their base form. E.g. Fleischtomaten is split into fleisch and tomate, but Datteltomaten is split into dattel and tomateN, and the baseform plugin then cannot convert tomaten into its base form because it is lowercase.
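To make the lowercase problem concrete, here is a minimal Python sketch. It is not the plugin's code: the dictionary `BASEFORMS`, its two entries, and both function names are hypothetical and only illustrate a case-sensitive lookup next to one possible fallback that retries with a capitalized first letter.

```python
# Toy stand-in for a case-sensitive German baseform dictionary.
# The entries are hypothetical and only serve the example.
BASEFORMS = {
    "Tomaten": "Tomate",
    "Datteln": "Dattel",
}


def baseform_exact(token):
    """Case-sensitive lookup: unknown tokens are returned unchanged."""
    return BASEFORMS.get(token, token)


def baseform_with_fallback(token):
    """Try the token as written, then retry with a capitalized first letter."""
    if token in BASEFORMS:
        return BASEFORMS[token]
    capitalized = token[:1].upper() + token[1:]
    if capitalized in BASEFORMS:
        base = BASEFORMS[capitalized]
        # Keep the result lowercase if the input token was all lowercase.
        return base.lower() if token.islower() else base
    return token


print(baseform_exact("tomaten"))          # not found, stays "tomaten"
print(baseform_with_fallback("tomaten"))  # found via "Tomaten"
```

A fallback like this would bring back the ambiguity mentioned earlier in the thread: for pairs such as Rasen/rasen, a lowercase token could be mapped to the base form of the wrong word.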
I'm using this plugin for German text and it seems to be case sensitive. Is that the case? If so, what's the reason for that?