Localize Pluralize/Singularize (WAS: Localizable InflectorExtensions) #197

Open
kblok opened this issue Apr 12, 2014 · 14 comments

Comments

@kblok
Contributor

kblok commented Apr 12, 2014

I'd like to write a Spanish implementation of InflectorExtensions. I don't know whether there is ongoing work on this topic (issue #132 is closely related).

I think we should have a culture-specific provider responsible for filling the rules list, and then simply (?) write regex rules for each language.
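
A minimal sketch of that idea, written in Python for brevity rather than in the library's own C#. The `PLURAL_RULES` registry, the `pluralize` function and the handful of rules are all hypothetical and deliberately incomplete; they only illustrate a per-culture list of regex rules where the first matching rule wins.

```python
import re

# Hypothetical registry: culture code -> ordered list of (pattern, replacement).
# Rules are tried in order and the first match wins, so specific rules go first.
PLURAL_RULES = {
    "en": [
        (re.compile(r"(?i)([^aeiou])y$"), r"\1ies"),   # city -> cities
        (re.compile(r"(?i)(s|x|z|ch|sh)$"), r"\1es"),  # box -> boxes
        (re.compile(r"$"), "s"),                       # fallback: car -> cars
    ],
    "es": [
        (re.compile(r"(?i)z$"), "ces"),                # luz -> luces
        (re.compile(r"(?i)([aeiou])$"), r"\1s"),       # casa -> casas
        (re.compile(r"$"), "es"),                      # fallback: papel -> papeles
    ],
}


def pluralize(word: str, culture: str = "en") -> str:
    """Apply the first matching rule registered for the given culture."""
    for pattern, replacement in PLURAL_RULES[culture]:
        if pattern.search(word):
            return pattern.sub(replacement, word, count=1)
    return word
```

Adding a language under this scheme means adding one more entry to the registry. Accent changes such as canción -> canciones are not covered by these toy rules.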

What do you think, @MehdiK?

@MehdiK
Member

MehdiK commented Apr 12, 2014

I think by inflector methods you only mean Pluralize and Singularize here, right?

This is a great idea. We can extract the localizable logic out of the class, implement a default pluralizer/singularizer/inflector class with the current logic (excluding the rules) and provide hooks for injecting the rules etc; kinda like how NumberToWordsConverter is implemented.

What does the localisation committee think? /cc @harouny, @JonasJensen, @mexx, @mnowacki, @hazzik, @thunsaker, @henriksen, @ekblom, @akamud, @ignorkulman, @Borzoo, @onovotny

@kblok
Contributor Author

kblok commented Apr 12, 2014

Yes, I'm talking about the Pluralize and Singularize feature.
This is what I have in mind:

  • InflectorExtensions should contain only the extension-method logic
  • An IInflector which could have the same API as the extension methods
  • As you say, we could have a DefaultInflector with the current behavior but with no explicit rules
  • New classes EnglishInflector and SpanishInflector, which could inherit from DefaultInflector and be responsible for filling the Plurals, Singulars and Uncountable lists

I think Spanish has rules similar to English regarding singular and plural forms (it has uncountable, singular-only, plural-only and irregular words), so the behavior could be the same.
Other languages could either inherit from DefaultInflector or just implement the IInflector interface.

I'm not sure whether IInflector, DefaultInflector, EnglishInflector, etc. should have some suffix (Provider? Engine?)
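
The layering proposed above could look roughly like this. It is a Python sketch, not the library's C# code: the class names mirror the ones in the list, but the members, the rule order (later rules are tried first, which is an assumption) and the few sample rules are illustrative only.

```python
import re
from typing import Protocol


class Inflector(Protocol):
    """Python stand-in for the proposed IInflector interface."""

    def pluralize(self, word: str) -> str: ...
    def singularize(self, word: str) -> str: ...


class DefaultInflector:
    """Applies rules but defines none; subclasses fill the three lists."""

    def __init__(self) -> None:
        self.plurals = []          # (compiled regex, replacement) pairs
        self.singulars = []        # (compiled regex, replacement) pairs
        self.uncountables = set()  # lower-cased words that never change

    def add_plural(self, pattern: str, replacement: str) -> None:
        self.plurals.append((re.compile(pattern, re.IGNORECASE), replacement))

    def add_singular(self, pattern: str, replacement: str) -> None:
        self.singulars.append((re.compile(pattern, re.IGNORECASE), replacement))

    def _apply(self, rules, word: str) -> str:
        if word.lower() in self.uncountables:
            return word
        # Assumption: rules added later are more specific, so try them first.
        for pattern, replacement in reversed(rules):
            if pattern.search(word):
                return pattern.sub(replacement, word, count=1)
        return word

    def pluralize(self, word: str) -> str:
        return self._apply(self.plurals, word)

    def singularize(self, word: str) -> str:
        return self._apply(self.singulars, word)


class EnglishInflector(DefaultInflector):
    def __init__(self) -> None:
        super().__init__()
        self.add_plural(r"$", "s")                        # car -> cars
        self.add_plural(r"(s|x|z|ch|sh)$", r"\1es")       # box -> boxes
        self.add_singular(r"s$", "")                      # cars -> car
        self.add_singular(r"(s|x|z|ch|sh)es$", r"\1")     # boxes -> box
        self.uncountables.update({"equipment", "information"})


class SpanishInflector(DefaultInflector):
    def __init__(self) -> None:
        super().__init__()
        self.add_plural(r"$", "es")                       # papel -> papeles
        self.add_plural(r"([aeiou])$", r"\1s")            # casa -> casas
        self.add_plural(r"z$", "ces")                     # luz -> luces
        self.add_singular(r"([aeiou])s$", r"\1")          # casas -> casa
        self.add_singular(r"([^aeiou])es$", r"\1")        # papeles -> papel
        self.add_singular(r"ces$", "z")                   # luces -> luz
        self.uncountables.update({"crisis"})              # same form in both numbers
```

A language whose rules do not fit this regex model would skip `DefaultInflector` and implement the two `Inflector` methods directly.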

@MehdiK
Member

MehdiK commented Apr 12, 2014

Before going too far, I'd like to confirm that it is actually possible to implement this logic in other languages too, either through changing the rules or implementation from scratch. Depending on the complexities of other languages we may have to choose a different design or think harder about this. Sometimes language rules get way too complex (#64)! I have considered creating a new Humanizer.Dictionary package that deals with this and other language specific word manipulations, and I still think that's a viable solution.

FWIW the English implementation is relatively buggy too. See #142 for more details.

@akamud
Contributor

akamud commented Apr 15, 2014

After looking at the InflectorExtensions implementation I can say this approach would work with Portuguese. The "normal" rules aren't too complex.
The problem with plurals in Portuguese is that, although they may look simple, the exceptions depend on the word's etymology or stress, which makes it impossible to predict the correct plural form from spelling alone.

For example, there's a rule that says that words ending in "ão" take "ões" in their plural form:

coração -> corações
cordão -> cordões

But there are some words that don't follow that rule:

órgão -> órgãos
alemão -> alemães
cão -> cães

In some cases this rule changes because the stressed syllable is not the last one. But some words won't even follow this rule (and as far as I know, there is no rule for these kinds of words):

mão -> mãos
artesão -> artesãos

To ensure more accurate pluralization we will indeed need a dictionary. Probably something similar happens in English and Spanish.
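
One way to combine the two is to consult a dictionary of exceptions first and fall back to the regex rule. The sketch below is in Python and uses only the words listed above; `IRREGULAR_PLURALS` and `pluralize_pt` are hypothetical names, and the final fallback of appending "s" is a placeholder, not a full Portuguese rule set.

```python
import re
import unicodedata

# Irregular plurals taken from the examples above; a real list would be longer.
IRREGULAR_PLURALS = {
    "órgão": "órgãos",
    "alemão": "alemães",
    "cão": "cães",
    "mão": "mãos",
    "artesão": "artesãos",
}

AO_ENDING = re.compile(r"ão$")


def pluralize_pt(word: str) -> str:
    # Normalize so that "ã" typed as "a" + combining tilde still matches.
    word = unicodedata.normalize("NFC", word)
    irregular = IRREGULAR_PLURALS.get(word.lower())
    if irregular is not None:
        return irregular                      # the dictionary wins over rules
    if AO_ENDING.search(word):
        return AO_ENDING.sub("ões", word)     # regular case: coração -> corações
    return word + "s"                         # placeholder fallback
```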

@thunsaker
Contributor

Spanish rules are similar; I tried to explain some of these with regard to ordinals in #212.

@kblok
Contributor Author

kblok commented Apr 15, 2014

My concern with dictionaries is the impact they could have on the "weight" of the library (I think that could be solved with resources) and on performance (I am also worried about performance given how many regexes the lib is evaluating right now).

Another issue with a dictionary is maintenance: where would we get a list of singulars and plurals? I don't know whether one is easy to find, at least for Spanish.
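
On the regex cost specifically, two generic mitigations are to compile every rule once up front and to memoize results per word. The Python sketch below only illustrates that idea; it says nothing about how the library currently behaves, and `_RULES`, `pluralize` and the cache size are assumptions.

```python
import re
from functools import lru_cache

# Compile each rule once at import time instead of on every call.
_RULES = [
    (re.compile(pattern, re.IGNORECASE), replacement)
    for pattern, replacement in [
        (r"(s|x|z|ch|sh)$", r"\1es"),  # box -> boxes
        (r"$", "s"),                   # fallback: car -> cars
    ]
]


@lru_cache(maxsize=4096)  # repeated words skip the rule scan entirely
def pluralize(word: str) -> str:
    for pattern, replacement in _RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word, count=1)
    return word
```

The trade-off is memory for speed: the cache is bounded, so the extra footprint stays predictable.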

@kblok
Contributor Author

kblok commented Apr 15, 2014

BTW @thunsaker, I have this link with rules for plurals (Spanish):
http://es.m.wikibooks.org/wiki/Espa%C3%B1ol/Morfolog%C3%ADa/Sustantivo

@mexx
Collaborator

mexx commented Apr 16, 2014

Russian has an extra grammatical number. In the current implementation it is named Paucal, though it is actually a kind of dual. For now I have no elegant solution to support this distinction in the Inflector scenario.

In German it would be possible to go with the injection of the rules, as German, like English, has only two grammatical numbers.

@hazzik
Member

hazzik commented Apr 16, 2014

@mexx, paucal is usually not a number, but a genitive case in Russian.

@hazzik
Member

hazzik commented Apr 16, 2014

I think we need to properly implement GrammaticalNumberDetector for all languages and widely use it.
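
A rough picture of what a per-language detector could look like, sketched in Python. The enum, the function names and the `DETECTORS` registry are hypothetical. The Russian branch follows the commonly documented plural categories (1, 21, 31... / 2-4, 22-24... / everything else), with the middle category labelled paucal as in the discussion above.

```python
from enum import Enum


class GrammaticalNumber(Enum):
    SINGULAR = "singular"
    PAUCAL = "paucal"
    PLURAL = "plural"


def detect_english(number: int) -> GrammaticalNumber:
    # English has two numbers: exactly one item, or anything else.
    if abs(number) == 1:
        return GrammaticalNumber.SINGULAR
    return GrammaticalNumber.PLURAL


def detect_russian(number: int) -> GrammaticalNumber:
    n = abs(number)
    last, last_two = n % 10, n % 100
    if last == 1 and last_two != 11:
        return GrammaticalNumber.SINGULAR      # 1, 21, 101 (but not 11)
    if 2 <= last <= 4 and not 12 <= last_two <= 14:
        return GrammaticalNumber.PAUCAL        # 2-4, 22-24 (but not 12-14)
    return GrammaticalNumber.PLURAL            # 0, 5-20, 25-30, ...


# Hypothetical registry keyed by culture code.
DETECTORS = {"en": detect_english, "ru": detect_russian}


def detect(number: int, culture: str) -> GrammaticalNumber:
    return DETECTORS[culture](number)
```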

MehdiK changed the title from "Localizable InflectorExtensions" to "Localize Pluralize/Singularize (WAS: Localizable InflectorExtensions)" on May 27, 2014
@hazzik
Member

hazzik commented Jun 28, 2014

I'm thinking about an interface such as IQuantifiable { ToQuantity(int number); }, or IWord, that implements the language-specific quantification logic. What do you think? A similar concept is already used in DutchNumberToWordsConverter.

@Borzoo
Contributor

Borzoo commented Aug 8, 2014

@hazzik, this idea was implemented in #285, but we need a better design for converting singulars to plurals and duals and vice versa. I'm trying to come up with an elegant solution that supports singulars, duals, paucals (if needed) and plurals.

@hazzik
Member

hazzik commented Aug 11, 2014

@Borzoo, the thing implemented in #285 is something different. There is IQuantifier, which can quantify any word, but I propose that the word itself can have different representations.
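
To make the distinction concrete, here is a Python sketch of a word that carries its own representations and picks one per quantity. `Word`, `to_quantity` and `russian_number` are hypothetical names loosely modelled on the IQuantifiable/IWord idea; the sample word is Russian "день" (day).

```python
from dataclasses import dataclass
from typing import Callable, Dict


def russian_number(number: int) -> str:
    """Map a count to the name of the form it requires (simplified)."""
    n = abs(number)
    last, last_two = n % 10, n % 100
    if last == 1 and last_two != 11:
        return "singular"
    if 2 <= last <= 4 and not 12 <= last_two <= 14:
        return "paucal"
    return "plural"


@dataclass
class Word:
    forms: Dict[str, str]              # grammatical number name -> written form
    number_of: Callable[[int], str]    # language-specific detector

    def to_quantity(self, number: int) -> str:
        return f"{number} {self.forms[self.number_of(number)]}"


day = Word(
    forms={"singular": "день", "paucal": "дня", "plural": "дней"},
    number_of=russian_number,
)
```

Here the forms are data attached to the word rather than the output of suffix rules, so a language with a dual or paucal only needs more entries in `forms` and a matching detector.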

This was referenced Nov 3, 2020
@5cover

5cover commented May 11, 2023

Has any progress been made on this issue?
