-
-
Notifications
You must be signed in to change notification settings - Fork 400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: parse origins of ingredients field #6995
Conversation
|
||
=head4 percent regulart expression $percent_regexp | ||
|
||
Used to find % values, language specific. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used to find % values, language specific. | |
Used to find % values, language specific. It MUST capture the percent digits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not too much I would like a refactor to avoid duplicating.
if ($product_lc eq "en") { | ||
|
||
# Origin of the milk: United Kingdom. | ||
if ($text =~ /\s*(?:origin of (?:the )?)([^,.;:]+)(?::| )+([^,.;]+?)\s*(?:;|\.| - |$)/i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking how to avoid repeating that stuff twice in the code. The risk is that we correct it on some place but not the other.
Why don't we put this kind of simple condition in a function. It could return undef if no match, else a result as a hash map.
If all conditions are treated the same you can continue with next function while your result in undefined, and unpack the result at the end. I think it's easier to read, we could even just list the functions and loop (adding percent expression when needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The very least we can do is add those regular expressions as constants or as functions that return them (for the one which needs percent expression).
Also @stephanegigandet what about using more verbose regexp in our code: https://perldoc.perl.org/perlre#/x-and-/xx This is very handy to comment what you are doing in an expression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge and refactor later on as proposed by @stephanegigandet
Yes, we should do that, definitely. |
Kudos, SonarCloud Quality Gate passed! |
Fixes partially #4461 (we could add support for more ways to describe ingredients, like "Strawberries from Spain").
Needed in particular for one producer who sent us data in the origins of ingredients field, in order to have it used for the Eco-Score computations.