feat: activate Japanese ingredients processing #8621

Merged 5 commits on Jun 29, 2023
11 changes: 9 additions & 2 deletions lib/ProductOpener/Ingredients.pm
@@ -1804,8 +1804,15 @@ sub parse_ingredients_text ($product_ref) {
)

# match before or after the ingredient, does not require a space
- or ( (($product_lc eq 'de') or ($product_lc eq 'nl') or ($product_lc eq 'hu'))
- and ($new_ingredient =~ /(^($regexp)|($regexp)$)/i))
+ or (
+     (
+            ($product_lc eq 'de')
+         or ($product_lc eq 'hu')
+         or ($product_lc eq 'ja')
+         or ($product_lc eq 'nl')
+     )
+     and ($new_ingredient =~ /(^($regexp)|($regexp)$)/i)
+ )

# match after the ingredient, does not require a space
# match before the ingredient, require a space
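
The hunk above adds `ja` to the languages where a processing term may sit directly against the ingredient name, with no space in between. The following is a minimal sketch of that idea, not the ProductOpener implementation: `split_processing` and the `%processing_terms` table are hypothetical names introduced here, and the real parser also consults the ingredients taxonomy before deciding whether to split.

```perl
use strict;
use warnings;
use utf8;

binmode STDOUT, ":encoding(UTF-8)";

# Illustrative term table (assumption): the real terms live in
# taxonomies/ingredients_processing.txt.
my %processing_terms = (
    "パウダー" => "en:powder",
    "粉末"     => "en:powder",
    "末"       => "en:powder",
    "ロースト" => "en:roasted",
    "フライド" => "en:fried",
);

# Hypothetical helper (not a ProductOpener function): strip one processing
# term from the start or the end of an ingredient, with no space required.
sub split_processing {
    my ($ingredient, $terms_ref) = @_;

    # Longest terms first, so that 粉末 is preferred over 末.
    my $regexp = join '|',
        map { quotemeta($_) }
        sort { length($b) <=> length($a) } keys %$terms_ref;

    # Term before the ingredient, e.g. 粉末醤油
    if ($ingredient =~ /^($regexp)(.+)$/) {
        return ($2, $terms_ref->{$1});
    }
    # Term after the ingredient, e.g. 昆布粉末
    if ($ingredient =~ /^(.+?)($regexp)$/) {
        return ($1, $terms_ref->{$2});
    }
    # No processing term found at either end
    return ($ingredient, undef);
}

for my $text ("昆布粉末", "粉末醤油", "かつお節粉末", "りんごパルプ") {
    my ($rest, $processing) = split_processing($text, \%processing_terms);
    printf "%s => %s (%s)\n", $text, $rest, $processing // "no processing";
}
```

Unlike the hunk, the sketch omits the `/i` flag, since the Japanese terms used here have no letter case and the captured text is used directly as a hash key.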
2 changes: 1 addition & 1 deletion taxonomies/ingredients.txt
@@ -46291,7 +46291,7 @@ id:Bawang putih
io:Alio
is:Hvítlaukur
it:aglio
- ja:ニンニク, にんにく
+ ja:ニンニク, にんにく, ガーリック
jv:Bawang
ka:ნიორი
kk:Сарымсақ
28 changes: 23 additions & 5 deletions taxonomies/ingredients_processing.txt
@@ -181,6 +181,7 @@ pt:cortado, cortada, cortados, cortadas
en:thick cut
es:cortada gruesa,cortado grueso, cortados gruesos,cortadas gruesas
hu:vastagon vágott, vastag vágás
ja:厚切り
pt:corte crosso, cortado grosso, cortes grossos, cortados grossos

# <en:cut
@@ -227,6 +228,7 @@ es:lamina,laminada,laminado, laminadas,laminados,rodajas
fr:tranché, tranchée, tranchés, tranchées, en lamelles, en tranches, en lanières, en rondelles, tranches de, lamelles de, lanières de, rondelle de, rondelles de, tranches d', lamelles d', lanières d', rondelles d'
hu:szeletelt, szelt, szeletelve
it:a fette, in fette, affettato, affettata, affettati, affettate
ja:スライス, 薄切り
nl:schijfjesamande émondée
pt:laminado, laminada, laminados, laminadas, em lâminas, em lâmina, em rodela, em rodelas, rodela de, rodelas de

@@ -241,6 +243,7 @@ fr:hâché, hâchée, hâchés, hâchées, haché, hachée, hachés, hachées
hr:usitnjeni, usitnjena, sjeckani
hu:apróra vágott, aprított, aprítva, apróra vágva
is:hakkaðar
ja:刻み
nb:hakkede
nl:gehakt, gehakte
nn:hakkede
@@ -303,6 +306,7 @@ fi:raastettu, raastetut
fr:râpé, râpée, râpés, râpées
hu:reszelt, reszelve
it:grattugiato, grattugiata, grattugiati, grattugiate
ja:おろし
nl:geraspt, geraspte
pl:tarty, tarta, tarte
pt:ralado, ralada, ralados, raladas
@@ -382,6 +386,7 @@ fr:en purée, purée de, purée d'
hr:kaša, kaša od
hu:püré, pürésített
it:purea di
ja:ピュレ, ピューレ, ピューレー
mk:каша од
nb:puré
nl:puree
@@ -414,6 +419,7 @@ fr:entier, entière, entiers, entières
hr:cijele
hu:egész, egészben
it:intero, intera, interi, intere
ja:全
pl:całe, w całości
pt:inteiro, inteira, inteiros, inteiras
sv:hel, hela
@@ -464,6 +470,7 @@ fi:hiutaleet, hiutale
fr:flocons de, en flocons
hr:pahuljice od
it:fiocchi di, fiocchi d'
ja:フレーク
nl:vlokken
pl:płatki, grys, wiórki
pt:floco, flocos, em floco, em flocos, floco de, flocos de
@@ -476,7 +483,7 @@ es:copos de
fi:lastu
fr:chips de
it:fettine di
- ja:チップス
+ ja:チップス, チップ
nl:chips
pl:chipsy

@@ -518,7 +525,7 @@ id:Bubuk
io:Polvo
is:duft
it:in polvere, en polvere, polvere di, polvere d'
- ja:パウダー
+ ja:パウダー, 粉末, 末
jv:Bubuk
lt:milteliai
lv:pulveris
@@ -568,7 +575,7 @@ he:קמח אורז
hu:liszt
is:mjöl
it:farina di, farina d'
- ja:粉
+ ja:粉, フラワー
lt:miltai
lv:milti
nb:mel
@@ -713,6 +720,7 @@ gl:Polpa
he:בקבוקון מיץ
hu:bél, velő
it:polpa, polpa di, polpa d'
ja:パルプ
nl:pulp, vruchtvlees
#nl:false:met pulp
pl:pulpa, pulpa z
@@ -751,6 +759,7 @@ es:desengrasado, desengrasada,desgrasado,desgrasada
fr:dégraissé, dégraissée, dégraissés, dégraissées
hu:zsírtalanított
it:sgrassato, sgrassata
ja:脱脂
nl:ontvet, ontvette
pl:odtłuszczone, o obniżonej zawartości tłuszczu
pt:desengordurado, desengordurada
@@ -786,6 +795,7 @@ fr:jus de, jus d'
hr:sok, sok od, voćni sok od, od soka
hu:leve, lé
it:succo, succo di, succo d', succhi, succhi di, succhi d'
ja:ジュース, 果汁
nb:juice
nl:sap
nn:juice
@@ -859,6 +869,7 @@ fr:concentré, concentrée, concentrés, concentrées, concentré de
hr:koncentriran, koncentrirani, koncentrani, koncentrat, koncentrat za
hu:sűrítmény, sűrített, koncentrátum
it:concentrato, concentrato di, concentrato d', concentrati, concentrata, concentrate
ja:濃縮
ko:농축물
mk:концентриран, концентрирана
nl:concentraten, concentraat, geconcentreerd, geconcentreerde
@@ -896,6 +907,7 @@ de:200fach konzentriertes
#fr:pâte de, pâte d'
#hr:pasta od
#it:pasta di, pasta d'
#ja:ペースト
#pl:pasta z
#sv:pasta

@@ -915,6 +927,7 @@ fr:à base de concentré, à partir de concentré
hr:koncentrirane, na koncentrat, od koncentriranog soka škrob, od koncentriranog soka, od koncentrirane kaše
hu:sűrítményből, koncentrátumból, sűrítményekből, koncentrátumokból
it:da concentrato
ja:濃縮還元
lt:iš koncentrąto
nl:uit concentraat, van concentraat, op basis van concentraat
pl:z koncentratu
@@ -939,7 +952,7 @@ he:מיובשים, מיובש
hr:sušeni, sušena, sušeno, sušene, suha, suhe, suho, suhi
hu:szárított
it:secco, secca, secchi, secche, essiccato, essiccati, essiccata, essiccate
- ja:ドライ
+ ja:ドライ, 乾燥
lt:džiovinti
lv:žāvēti
nb:tørkede
@@ -1095,7 +1108,7 @@ fr:extrait de, extrait d'
hr:ekstrakt
hu:kivonat, kivonatból
it:estratto di, estratti di, estratto d', estratti d'
- ja:エキス, 抽出
+ ja:エキス, 抽出物
#it:falsePositive:antiossidante estratto ricco di tocoferolo
nb:ekstrakt, ekstrakter
nl:extract
@@ -1300,6 +1313,7 @@ hr:kuhani
hu:főtt, főzött
id:rebus
it:cotto, cotta, cotti, cotte, cotto al naturale, cotta al naturale, cotti al naturale, cotte al naturale, cucinato, cucinata, cucinati, cucinate
ja:ゆで, 茹で
ms:rebus
nl:gekookt, gekookte
nn:kogte
@@ -1364,6 +1378,7 @@ fr:frit, frite, frits, frites
hr:pržene
hu:sült
it:fritto, fritta, fritti, fritte
ja:フライド
pl:smażony, smażona, smażone
pt:frito, frita, fritos, fritas
sv:friterad, stekt
@@ -1380,6 +1395,7 @@ et:röstitud
fr:rôti, rôtie, rôtis, rôties, torréfié, torréfiée, torréfiés, torréfiées
hu:pörkölt
it:arrostito, arrostita, arrostiti, arrostite
ja:ロースト
nl:gebraden
pt:assado, assada, assados, assadas
sk:pražené
@@ -1558,6 +1574,7 @@ fr:sucré, sucrée, sucrés, sucrées, sucree, sucrees
#better not add fr:sucre and fr:sucres
hu:cukrozott
it:zuccherato, zuccherati, zuccherata, zuccherate
ja:加糖
nl:gesuikerd, gesuikerde
pt:açucarado, açucarada, açucarados, açucaradas
sv:sockrad
@@ -1752,6 +1769,7 @@ fr:aromatisé, aromatisée, aromatisées
# What to do with aromatisé au x and aromatisé de
hu:ízesített
it:aromatizzato, aromatizzata, aromatizzati, aromatizzate
ja:調味
nl:gearomatiseerd, gearomatiseerde
pl:aromatyzowany, aromatyzowana, aromatyzowane, przyprawiony, przyprawiona, przyprawione
pt:aromatizado, aromatizada, aromatizados, aromatizadas
95 changes: 95 additions & 0 deletions tests/unit/ingredients_processing.t
@@ -7,6 +7,11 @@ use utf8;

use Test::More;

my $builder = Test::More->builder;
binmode $builder->output, ":encoding(utf8)";
binmode $builder->failure_output, ":encoding(utf8)";
binmode $builder->todo_output, ":encoding(utf8)";

#use Log::Any::Adapter 'TAP';
use Log::Any::Adapter 'TAP', filter => 'trace';
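
The `binmode` calls added in this hunk put a UTF-8 encoding layer on the three handles Test::More writes to. Without such a layer, test names or diagnostics containing Japanese text typically trigger "Wide character in print" warnings. A small standalone sketch of the same setup, with a made-up assertion purely for illustration:

```perl
use strict;
use warnings;
use utf8;

use Test::More;

# Send every Test::More output stream through a UTF-8 encoding layer.
my $builder = Test::More->builder;
binmode $builder->output,         ":encoding(utf8)";
binmode $builder->failure_output, ":encoding(utf8)";
binmode $builder->todo_output,    ":encoding(utf8)";

# Illustrative assertion (not from the PR): the test name itself contains
# non-ASCII characters, which is what the layers above are for.
is("昆布" . "粉末", "昆布粉末", "昆布 + 粉末 gives 昆布粉末");

done_testing();
```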

@@ -1647,6 +1652,96 @@ my @tests = (
]
],

##################################################################
#
# JAPANESE ( JA )
#
##################################################################

[
{
lc => "ja",
ingredients_text =>
# sliced
"スライスアーモンド, "
#powder
. "酵母エキスパウダー, クリーミングパウダー, "
#powder
. "昆布粉末, 粉末醤油, 粉末酒, かつお節粉末, マカ粉末, 粉末しょうゆ, 発酵黒にんにく末, "
# roasted
. "ローストバターパウダー, ロースト-麦芽,"
# fried garlic powder
. "フライドガーリックパウダー, "
# pulp
. "りんごパルプ, "
},
[
{
'id' => 'en:flaked-almonds',
'text' => "\x{30b9}\x{30e9}\x{30a4}\x{30b9}\x{30a2}\x{30fc}\x{30e2}\x{30f3}\x{30c9}"
},
{
'id' => 'en:yeast-extract-powder',
'text' => "\x{9175}\x{6bcd}\x{30a8}\x{30ad}\x{30b9}\x{30d1}\x{30a6}\x{30c0}\x{30fc}"
},
{
'id' => "ja:\x{30af}\x{30ea}\x{30fc}\x{30df}\x{30f3}\x{30b0}\x{30d1}\x{30a6}\x{30c0}\x{30fc}",
'text' => "\x{30af}\x{30ea}\x{30fc}\x{30df}\x{30f3}\x{30b0}\x{30d1}\x{30a6}\x{30c0}\x{30fc}"
},
{
'id' => 'en:kombu',
'processing' => 'en:powder',
'text' => "\x{6606}\x{5e03}"
},
{
'id' => 'en:soy-sauce',
'processing' => 'en:powder',
'text' => "\x{91a4}\x{6cb9}"
},
{
'id' => "ja:\x{7c89}\x{672b}\x{9152}",
'text' => "\x{7c89}\x{672b}\x{9152}"
},
{
'id' => 'en:bonito-flakes',
'processing' => 'en:powder',
'text' => "\x{304b}\x{3064}\x{304a}\x{7bc0}"
},
{
'id' => "ja:\x{30de}\x{30ab}\x{7c89}\x{672b}",
'text' => "\x{30de}\x{30ab}\x{7c89}\x{672b}"
},
{
'id' => 'en:soy-sauce',
'processing' => 'en:powder',
'text' => "\x{3057}\x{3087}\x{3046}\x{3086}"
},
{
'id' => "ja:\x{767a}\x{9175}\x{9ed2}\x{306b}\x{3093}\x{306b}\x{304f}\x{672b}",
'text' => "\x{767a}\x{9175}\x{9ed2}\x{306b}\x{3093}\x{306b}\x{304f}\x{672b}"
},
{
'id' => 'en:butter',
'processing' => 'en:powder, en:roasted',
'text' => "\x{30d0}\x{30bf}\x{30fc}"
},
{
'id' => 'en:malt',
'processing' => 'en:roasted',
'text' => "\x{9ea6}\x{82bd}"
},
{
'id' => 'en:garlic',
'processing' => 'en:powder, en:fried',
'text' => "\x{30ac}\x{30fc}\x{30ea}\x{30c3}\x{30af}"
},
{
'id' => 'en:apple-pulp',
'text' => "\x{308a}\x{3093}\x{3054}\x{30d1}\x{30eb}\x{30d7}"
}
]

],
);

foreach my $test_ref (@tests) {
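
The expected results in the Japanese test case are written with `\x{...}` code point escapes rather than literal characters. As a reading aid, the sketch below prints a few of those escaped strings next to their ids. The ids and escapes are copied from the test above; the `%expected_text` hash is only an illustration and not part of the test file.

```perl
use strict;
use warnings;
use utf8;

binmode STDOUT, ":encoding(UTF-8)";

# A few 'text' values from the expected results, keyed by ingredient id.
my %expected_text = (
    'en:kombu'  => "\x{6606}\x{5e03}",
    'en:malt'   => "\x{9ea6}\x{82bd}",
    'en:garlic' => "\x{30ac}\x{30fc}\x{30ea}\x{30c3}\x{30af}",
);

# Print each id with its text as readable Japanese.
for my $id (sort keys %expected_text) {
    print "$id => $expected_text{$id}\n";
}
```

Under `use utf8`, an escaped string and the corresponding literal compare equal, so either spelling could be used in the expectations.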