Quality facet - "quantity-contains-e" has many false positives #2037
Comments
I agree that the regexp is broken, but the proposed change also deletes true positives. What we really want to identify is cases like this: 500 ge
In the first place, why do we want to identify quantities containing "e"? @stephanegigandet
See: https://world.openfoodfacts.org/quality/quantity-contains-e -> https://world.openfoodfacts.org/data-quality-info/en:quantity-contains-e
For the aforementioned example, see also the API: https://world.openfoodfacts.org/api/v2/search?code=3464660002434
The aforementioned line of code can now be found in /lib/ProductOpener/SiteQuality_off.pm
Finally, to find more examples:
That's a good question. :) (Sorry @VaiTon, I missed it 4 years ago.) The symbol is present on most products, so I didn't see the point in including it in the quantity field.
Should we close this issue @CharlesNepote?
I think we can close.
What
See: https://world.openfoodfacts.org/quality/quantity-contains-e
Example: "1 litre" in https://world.openfoodfacts.org/product/3464660002434/jus-d-aloe-vera-pur-aloe
The regexp matches too many things:
openfoodfacts-server/lib/ProductOpener/SiteQuality_off.pm
Line 747 in aa25a7e
I replaced:
/(?:.*e$)|(?:[0-9]+\s*[kmc]?[gl]?\s*e)/i
with:
/(?:^e\s)|(?:\se[^a-z])|(?:\se$)/i
This removed more than 1500 false positives (see attached file).
4.txt
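To make the difference between the two patterns concrete, here is a minimal sketch that ports both regexps to Python and runs them on a few quantity strings. The original code is Perl, so this is only an approximation of its behavior. The helper name `flags` and the sample strings other than "1 litre" and "500 ge" are made up for illustration.

```python
import re

# Python ports of the two Perl regexps quoted above (the /i flag becomes re.IGNORECASE).
OLD = re.compile(r"(?:.*e$)|(?:[0-9]+\s*[kmc]?[gl]?\s*e)", re.IGNORECASE)
NEW = re.compile(r"(?:^e\s)|(?:\se[^a-z])|(?:\se$)", re.IGNORECASE)

# "1 litre" and "500 ge" come from this issue; the other samples are hypothetical.
SAMPLES = ["1 litre", "12 eggs", "500 g e", "500 ge", "250 ml"]


def flags(quantity):
    """Return (old_hit, new_hit): whether each pattern flags the quantity."""
    return bool(OLD.search(quantity)), bool(NEW.search(quantity))


if __name__ == "__main__":
    for quantity in SAMPLES:
        old_hit, new_hit = flags(quantity)
        print(f"{quantity!r:12} old={old_hit!s:5} new={new_hit}")
```

In this sketch the old pattern flags "1 litre" and "12 eggs" simply because they contain or end with the letter "e", while the new pattern only flags an "e" that stands alone. The new pattern also stops flagging "500 ge", which is the kind of true positive mentioned in the comments.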
Part of