Disable support for simple singularization #868

kendallb · 2019-10-25T21:59:03Z

We currently use this library to help with product search on our web site: we take all the keywords we index against and add the singular and plural versions of each word to the index. Unfortunately, the singularization algorithm has the simplest rule at the start, which is that anything ending in an 's' should simply have the 's' removed. While this normally makes sense, in our case it just introduces a lot of noise, because our search indexing will always happily match on a partial word. If we have 'tires' in our search index, someone entering 'tire' or 'tires' will match either way, so adding the singular 'tire' to the index is not necessary.

Things go sideways with this simple rule because it also converts things like brand names to singular: 'Traxxas' becomes 'Traxxa', which is not a real word.
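To make the problem concrete, here is a minimal sketch of the behaviour described above. It is not the library's code: `naive_singular` and `prefix_match` are hypothetical helpers that only mimic the catch-all "strip the trailing 's'" rule and a prefix-matching search index.

```python
import re


def naive_singular(word: str) -> str:
    """Hypothetical stand-in for the catch-all rule: drop one trailing 's'."""
    return re.sub(r"s$", "", word)


def prefix_match(query: str, indexed_term: str) -> bool:
    """Hypothetical stand-in for an index that matches on partial words."""
    return indexed_term.lower().startswith(query.lower())


# The singular adds nothing when the index already matches prefixes.
print(prefix_match("tire", "tires"))   # True
print(prefix_match("tires", "tires"))  # True

# The same rule damages a brand name that merely ends in 's'.
print(naive_singular("tires"))    # tire
print(naive_singular("Traxxas"))  # Traxxa
```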

The simple solution is to just remove the first element in the rule list, but there is no way to do that using the stock library as I cannot modify the internal rule list, nor can I replace the vocabulary with my own (I can't create a Vocabulary class as it's internal).

For now I plan to simply fork the library and hack it out so I can do what I need, but what I would prefer to do is modify the library so I can adjust the way it works for my needs and get that accepted upstream so I don't need to maintain my own library.

I see there are a couple of ways to do this:

1. Add a function to be able to remove a rule from the default vocabulary
2. Add a parameter on the Singular function to tell it to ignore the simple first rule
3. Allow me to create my own vocabulary (make the class constructor not internal)

I am not sure that options 1 or 3 are really good long-term solutions, because then I am changing the default vocabulary, and if someone else on our team uses it for some other purpose in the same context, they might get unexpected results. So I am leaning towards option 2.
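As a rough illustration of the second option, the sketch below shows how a per-call flag could bypass only the catch-all rule while leaving the shared rule list untouched. Everything here is an assumption made for illustration: the rule table, the order in which rules are tried, and the names `SINGULAR_RULES`, `singularize` and `skip_simple_words` are hypothetical and are not taken from the library.

```python
import re

# Hypothetical rule table of (pattern, replacement) pairs.
# Entry 0 is the simple catch-all; later entries are more specific.
SINGULAR_RULES = [
    (r"s$", ""),
    (r"(x|ch|ss|sh)es$", r"\1"),
    (r"([^aeiouy]|qu)ies$", r"\1y"),
]


def singularize(word: str, skip_simple_words: bool = False) -> str:
    """Apply the first matching rule, trying the most specific rules first.

    When skip_simple_words is True, the catch-all rule at index 0 is ignored,
    so words that only match it are returned unchanged.
    """
    for index in range(len(SINGULAR_RULES) - 1, -1, -1):
        if skip_simple_words and index == 0:
            continue  # caller opted out of the simple rule for this call only
        pattern, replacement = SINGULAR_RULES[index]
        result, count = re.subn(pattern, replacement, word)
        if count:
            return result
    return word


print(singularize("Traxxas"))                          # Traxxa
print(singularize("Traxxas", skip_simple_words=True))  # Traxxas
print(singularize("boxes", skip_simple_words=True))    # box
```

Because the flag is passed per call, other callers that share the same default rules keep the existing behaviour, which is the concern raised about options 1 and 3.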

Thoughts?

kendallb mentioned this issue Oct 25, 2019

Added support for ignoring singularization on simple words #869

Merged

clairernovotny closed this as completed Nov 13, 2019
