[wasm][icu] Canonical locales are not in filters but users would like to use them #79816

Closed
ilonatommy opened this issue Dec 19, 2022 · 3 comments

@ilonatommy
Member

ilonatommy commented Dec 19, 2022

In ICU we have a list of commonly used locales:
https://github.com/dotnet/icu/blob/dotnet/main/icu/icu4c/source/common/locid.cpp#L517

Not all of them are listed in the filters for the "full" ICU data file, and some users get confused because they would like to use them. Three locales were requested in the discussions (#53239 and #43398):

  • "cy-GB"
  • "is-IS"
  • "bs-BA".

I have no broader overview of how frequently they are used; each of them was mentioned once in those discussions.

Size effect:
If we make the bold assumption that the ICU bundle was affected only by adding these 3 locales, then the size increase is <3%.
Previous size: 1490 kB,
Current size (with the 3 locales added): 1533 kB.
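
For reference, the arithmetic behind the <3% figure can be sketched as below. This is only a minimal illustration using the two sizes quoted above; the variable and function names are made up for the example and do not come from any build script.

```python
# Sizes quoted in the issue, in kB.
previous_kb = 1490
current_kb = 1533


def percent_increase(before: float, after: float) -> float:
    """Relative growth of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


delta_kb = current_kb - previous_kb
growth = percent_increase(previous_kb, current_kb)

# Under the assumption that only the 3 added locales changed the bundle.
print(f"+{delta_kb} kB, {growth:.2f}% increase")  # +43 kB, 2.89% increase
```

The result stays under the 3% bound only as long as the assumption holds that nothing else changed in the bundle between the two measurements.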

I would like to underline that these are not the only canonical locales we are missing. They are only the ones that got reported, so they are apparently in use. If we wanted to patch all the missing ones, there would be 36 of them:

"af_ZA", 
"ar_001", 
"as_IN", 
"az_AZ",  
"be_BY", 
"bs_BA", 
"cy_GB", 
 "eu_ES", 
"ga_IE", 
"gu_IN", 
"hy_AM", 
"is_IS",  
"jv_ID", 
"ka_GE", 
"kk_KZ", 
"km_KH", 
"ky_KG", 
"lo_LA", 
"mk_MK", 
"mn_MN", 
"my_MM", 
"ne_NP", 
"or_IN", 
"pa_IN", 
"ps_AF", 
"sd_IN",
"si_LK",
"so_SO", 
"sq_AL", 
"tk_TM",
"ur_PK", 
"uz_UZ", 
"yue_Hant", 
"yue_Hant_HK", 
"yue_HK",
"zu_ZA"

Any help in deciding whether to add the 3 requested locales is welcome. cc @lewing @directhex
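
As a quick sanity check on the list above, the sketch below counts the missing ICU IDs and confirms that the three requested locales are among them. Note that the list uses ICU-style underscores (`cy_GB`) while the requests use hyphenated tags (`cy-GB`). The helper `icu_id_to_tag` is hypothetical, and the plain underscore-to-hyphen replacement is an assumption that holds only for simple language/script/region IDs like these; ICU IDs with keywords or variants would need a real conversion such as ICU's `uloc_toLanguageTag`.

```python
# The 36 canonical locales reported as missing from the filters (ICU-style IDs).
MISSING_ICU_IDS = [
    "af_ZA", "ar_001", "as_IN", "az_AZ", "be_BY", "bs_BA", "cy_GB", "eu_ES",
    "ga_IE", "gu_IN", "hy_AM", "is_IS", "jv_ID", "ka_GE", "kk_KZ", "km_KH",
    "ky_KG", "lo_LA", "mk_MK", "mn_MN", "my_MM", "ne_NP", "or_IN", "pa_IN",
    "ps_AF", "sd_IN", "si_LK", "so_SO", "sq_AL", "tk_TM", "ur_PK", "uz_UZ",
    "yue_Hant", "yue_Hant_HK", "yue_HK", "zu_ZA",
]

# The locales users asked for, written as hyphenated tags.
REQUESTED_TAGS = ["cy-GB", "is-IS", "bs-BA"]


def icu_id_to_tag(icu_id: str) -> str:
    """Hypothetical helper: naive conversion, valid only for simple IDs."""
    return icu_id.replace("_", "-")


missing_tags = {icu_id_to_tag(icu_id) for icu_id in MISSING_ICU_IDS}

# Requested tags that are NOT in the missing list (expected to be empty).
not_covered = [tag for tag in REQUESTED_TAGS if tag not in missing_tags]

print(len(MISSING_ICU_IDS), "missing locales")
print("requested but not in the list:", not_covered)
```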

@ilonatommy ilonatommy added this to the 8.0.0 milestone Dec 19, 2022
@ilonatommy ilonatommy self-assigned this Dec 19, 2022
@ghost

ghost commented Dec 19, 2022

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.


Author: ilonatommy
Assignees: ilonatommy
Labels:

arch-wasm, area-System.Globalization

Milestone: 8.0.0

@ghost

ghost commented Dec 19, 2022

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.


@ilonatommy
Member Author

Closing for a reason similar to dotnet/icu#298. We are thinking about custom ICU files, which would solve this issue; see #79987.

@ghost ghost locked as resolved and limited conversation to collaborators Jan 26, 2023