Repo might be unnecessarily large #242
Comments
Yes, that's a fair observation. @Schultzer has done a ton of great work to build an improved test harness, which I will integrate for the next release cycle in November (CLDR 46 will be out this coming week, and while I've done basic integration testing, I need to fix some regressions and redesign issues in …). I'm very open to ideas on how to restructure the repo, as long as any running version of the code can download any individual locale appropriate for that release. I don't think a single release tarball accomplishes that goal. But I will be the first to admit that my GitHub Actions-fu is very poor. It's an area where contributions will always be welcome.
I'm also a bit curious about your use case for the full repo. It's not something that comes up often, simply because it's only an issue for maintainers, and there are only three of us working on any development, and more than 90% of that is just me. You are of course welcome to have at it, I'm just curious.
I see this is explained in the readme,
Please feel free to close the issue! I think I understand better now: the GitHub repo includes locale output targets which can be downloaded by binary-only installations of elixir-cldr. The project source and outputs could be kept in separate repos, but there are trade-offs either way, so the status quo is a perfectly good arrangement. DEVELOPMENT.md mentions that CLDR itself uses git-lfs. Since the LFS tooling is needed anyway, I wonder if it would be helpful for this repo as well, e.g. using it to clean up the upstream priv/ data so that pulling its history becomes optional?
+1 I'm impressed by the huge effort that goes into maintenance of this very complete library, and I should be clear that I have no pressing use case at the moment other than curiosity. My comments here are more in the spirit of sharing newbie observations ("explain like I'm five" :-) ), not that I have some external project blocked on ex-cldr at all. But I would be happy to share how I arrived here, making a bit of a nuisance! I only need the full repo because I'd like to get access to all languages. For my day job I did a small investigation into how CLDR data might be used to improve an application which supports many hundreds of languages. The immediate use case would be to have an Elixir library which contains the correctly parsed core alphabets from known locales, now that I've discovered that the LDML format is nontrivial (sorry if this is an understatement ;-) ). Other use cases which I don't have to support today, but which come to mind for my domain, are "let the user freely pick their interface locale from the full database", and producing new transformations of the data which cut across locales, e.g. "dump the currency code names for all languages".
All good use cases! If you want to leverage the elixir-formatted complete CLDR data set then you can do something like this:
iex> config = %Cldr.Config{locales: :all}
%Cldr.Config{
  default_locale: "en-001",
  locales: :all,
  add_fallback_locales: false,
  backend: nil,
  gettext: nil,
  data_dir: "cldr",
  providers: nil,
  precompile_number_formats: [],
  precompile_transliterations: [],
  precompile_date_time_formats: [],
  precompile_interval_formats: [],
  default_currency_format: nil,
  otp_app: nil,
  generate_docs: true,
  suppress_warnings: false,
  message_formats: %{},
  force_locale_download: false,
  https_proxy: nil
}
iex> locales = Cldr.Locale.Loader.known_locale_names(config)
[:aa, :"aa-DJ", :"aa-ER", :ab, :af, :"af-NA", :agq, :ak, :am, :an, :ann, :apc,
:ar, :"ar-AE", :"ar-BH", :"ar-DJ", :"ar-DZ", :"ar-EG", :"ar-EH", :"ar-ER",
:"ar-IL", :"ar-IQ", :"ar-JO", :"ar-KM", :"ar-KW", :"ar-LB", :"ar-LY", :"ar-MA",
:"ar-MR", :"ar-OM", :"ar-PS", :"ar-QA", :"ar-SA", :"ar-SD", :"ar-SO", :"ar-SS",
:"ar-SY", :"ar-TD", :"ar-TN", :"ar-YE", :arn, :as, :asa, :ast, :az, :"az-Arab",
:"az-Arab-IQ", :"az-Arab-TR", :"az-Cyrl", :"az-Latn", ...]
iex> for locale <- locales do
...>   local_data_map = Cldr.Locale.Loader.get_locale(locale, config)
...>   # do something with the data ....
...> end
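For the "cut across locales" use case mentioned earlier in the thread, the loop above could be wrapped in a small helper. This is only a sketch under stated assumptions: `CrossLocale.collect/3` is a hypothetical module, not part of ex_cldr, and the key path `[:currencies, :USD, :name]` and the stub data are made up for illustration. The real shape of the map returned by `Cldr.Locale.Loader.get_locale/2` should be checked before relying on any particular path.

```elixir
defmodule CrossLocale do
  # Hypothetical helper, not part of ex_cldr.
  # `loader` is any one-argument function that returns a locale's data map,
  # for example: &Cldr.Locale.Loader.get_locale(&1, config)
  # `path` is a list of keys walked with get_in/2.
  def collect(locales, loader, path) do
    locales
    |> Enum.map(fn locale -> {locale, get_in(loader.(locale), path)} end)
    # Drop locales that have no value at the requested path.
    |> Enum.reject(fn {_locale, value} -> is_nil(value) end)
    |> Map.new()
  end
end

# Stub data standing in for real locale maps (assumed shape, illustration only).
stub = %{
  en: %{currencies: %{USD: %{name: "US Dollar"}}},
  fr: %{currencies: %{USD: %{name: "dollar des États-Unis"}}},
  aa: %{}
}

names =
  CrossLocale.collect([:en, :fr, :aa], &Map.fetch!(stub, &1), [:currencies, :USD, :name])

IO.inspect(names)
```

Passing the loader as a function keeps the helper independent of where the data comes from, so the same code can be tried against stub maps first and pointed at the real loader afterwards.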
What I would probably do in your case is depend upon only …
Always great to have fresh eyes on the project - it's a huge topic before one even gets to implementation of code. And there is always room for much improvement. So please do keep the comments/suggestions coming.
There are excellent reasons to keep a full git history, but as a newcomer to the project I'm surprised to encounter a 1.1GB repository. It seems that much of this consists of binary, compiled changes to the CLDR data itself, under priv/. Old versions can still be supported by preserving a release tarball of elixir-cldr for each historical CLDR release, for example. But is there a reason to keep this extreme history in the repo itself, still?