This repository has been archived by the owner on Mar 29, 2024. It is now read-only.

unic_datetime #13

Closed
zbraniecki opened this issue Jan 10, 2020 · 11 comments

Comments

@zbraniecki
Member

zbraniecki commented Jan 10, 2020

I started experimenting with DateTime formatter. The codebase is very messy, I apologize for that, but I believe I can share the initial performance results.

For the minimum POC I focused on a single locale (pl) and 10 different combinations of dateStyle/timeStyle. I omitted two which require timezone names.

Since DateTime patterns take much more space than pluralrules or locale RTL/likelySubtags data which I worked with before, I experimented with three different models of loading data, as per #5:

  • Patterns are parsed and inlined into .rs code
  • Patterns are fetched and parsed from JSON CLDR
  • Patterns are fetched from an already parsed binary resource file

For the binary scenario I used the bincode crate: I fetched the JSON CLDR data, parsed the patterns, and serialized the resulting structure to a .res file, which I then loaded into memory at runtime.
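To illustrate the "parse once, serialize, load at runtime" idea, here is a minimal sketch using only the standard library. It is not the bincode format and not the unic-datetime data layout: the `Patterns` struct, the `encode`/`decode` helpers, the length-prefixed encoding, and the sample pattern strings are all assumptions made up for this example.

```rust
/// Hypothetical pattern table; the real crate's structure is likely richer.
#[derive(Debug, PartialEq)]
struct Patterns {
    entries: Vec<String>,
}

/// Encode as a little-endian u32 entry count, then for each entry
/// a u32 byte length followed by the UTF-8 bytes.
fn encode(p: &Patterns) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(p.entries.len() as u32).to_le_bytes());
    for e in &p.entries {
        out.extend_from_slice(&(e.len() as u32).to_le_bytes());
        out.extend_from_slice(e.as_bytes());
    }
    out
}

/// Read one little-endian u32 at `pos`, advancing `pos`; None if the buffer is too short.
fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let b = buf.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decode a buffer produced by `encode`; None on truncated or non-UTF-8 input.
fn decode(buf: &[u8]) -> Option<Patterns> {
    let mut pos = 0;
    let count = read_u32(buf, &mut pos)? as usize;
    let mut entries = Vec::new();
    for _ in 0..count {
        let len = read_u32(buf, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        let s = std::str::from_utf8(buf.get(pos..end)?).ok()?;
        entries.push(s.to_string());
        pos = end;
    }
    Some(Patterns { entries })
}

fn main() {
    // Placeholder patterns, not taken from the pl locale data.
    let patterns = Patterns {
        entries: vec!["EEEE, d MMMM y".to_string(), "HH:mm:ss".to_string()],
    };
    let bytes = encode(&patterns);
    let loaded = decode(&bytes).expect("well-formed buffer");
    assert_eq!(loaded, patterns);
    println!("{} bytes for {} patterns", bytes.len(), loaded.entries.len());
}
```

In a real setup the encoded bytes would be written to a file at build time and read back at startup, so the runtime cost is a single read plus a linear decode instead of JSON parsing.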

I got the following results:

  • ICU4C_65 - 849 us
  • UNIC JSON CLDR - 150 us
  • UNIC bin resource - 70 us
  • UNIC inlined - 24 us
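For anyone trying to reproduce numbers like these, the following is a rough sketch of how a microsecond timing can be collected with `std::time::Instant`. It is an assumption about the general approach, not the harness from the intl-measurements repo; `format_date` and `time_us` are hypothetical stand-ins and do not use the unic-datetime API.

```rust
use std::time::Instant;

/// Stand-in for the formatting work being measured (hypothetical).
fn format_date(day: u32, month: u32, year: u32) -> String {
    format!("{:02}.{:02}.{}", day, month, year)
}

/// Run `work` `iterations` times and return the total elapsed time in microseconds.
fn time_us<F: FnMut()>(iterations: u32, mut work: F) -> u128 {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed().as_micros()
}

fn main() {
    let mut total_len = 0usize;
    let us = time_us(10_000, || {
        // Accumulate the output length so the call has an observable effect.
        total_len += format_date(10, 1, 2020).len();
    });
    println!("us: {} (formatted {} bytes)", us, total_len);
}
```

Timings from a loop like this vary between machines and runs, so comparisons are only meaningful when all variants are measured on the same machine with `--release`.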

I'd appreciate it if someone could try to replicate my measurements and verify the results.

If the results hold, I believe this to be one more piece of evidence that investment in Rust-based crates may lead to significant performance gains.

I haven't evaluated memory use, but the ca-gregorian.json file for pl is 18201 bytes and the pl.res generated from it is 2187 bytes. I'd expect the latter to take up the same amount of memory.

@filmil

filmil commented Jan 10, 2020

Hi @zbraniecki, thanks for sharing. I suppose it's https://github.com/zbraniecki/unic-datetime?

@zbraniecki
Member Author

zbraniecki commented Jan 10, 2020

Yes! Sorry for not linking. In my measurements repo I provided steps to reproduce - https://github.com/zbraniecki/intl-measurements/#how-to-run

It uses a test crate which fetches unic-datetime - https://github.com/zbraniecki/intl-measurements/tree/master/unic/datetime

Please note that the source code is a mess. I've been migrating between different layout models for the data and I plan to clean it all up.
The latest revision supports only the JSON/bin resource models, so for the inlined results I used revision f2169e6ad215eee0438304e83e00bd36df43408c and cargo run --example test --release.

For JSON/binary, master works: cargo run --example binary --features binary --release (bincode) and cargo run --example dynamic --features binary --release (JSON) should produce the measurements.

Sorry for the mess, I'm in a heavy experimentation phase, but wanted to share initial results :)

@filmil

filmil commented Jan 10, 2020

Got this. Seems easy to fix, just unsure if you meant it to not be there. No worries about sharing the early stage of your repo, it's expected. We'll soldier through it.


error[E0599]: no function or associated item named `new_from_static` found for type `DateTimeFormat<_>` in the current scope
   --> src/lib.rs:173:35
    |
84  | pub struct DateTimeFormat<R> {
    | ---------------------------- function or associated item `new_from_static` not found for this
...
173 |         let dtf = DateTimeFormat::new_from_static("pl", Some(DateStyle::LONG), None);
    |                                   ^^^^^^^^^^^^^^^ function or associated item not found in `DateTimeFormat<_>`

error[E0599]: no function or associated item named `new_from_static` found for type `DateTimeFormat<_>` in the current scope
   --> src/lib.rs:176:35
    |
84  | pub struct DateTimeFormat<R> {
    | ---------------------------- function or associated item `new_from_static` not found for this
...
176 |         let dtf = DateTimeFormat::new_from_static("pl", Some(DateStyle::SHORT), None);
    |                                   ^^^^^^^^^^^^^^^ function or associated item not found in `DateTimeFormat<_>`

error[E0599]: no function or associated item named `new_from_static` found for type `DateTimeFormat<_>` in the current scope
   --> src/lib.rs:180:29
    |
84  | pub struct DateTimeFormat<R> {
    | ---------------------------- function or associated item `new_from_static` not found for this
...
180 |             DateTimeFormat::new_from_static("pl", Some(DateStyle::MEDIUM), Some(TimeStyle::MEDIUM));
    |                             ^^^^^^^^^^^^^^^ function or associated item not found in `DateTimeFormat<_>`

error[E0599]: no function or associated item named `new_from_static` found for type `unic_datetime::DateTimeFormat<_>` in the current scope
  --> benches/dates.rs:40:43
   |
40 |                 let dtf = DateTimeFormat::new_from_static(value.0, value.1, value.2);
   |                                           ^^^^^^^^^^^^^^^ function or associated item not found in `unic_datetime::DateTimeFormat<_>`

error: aborting due to previous error

@zbraniecki
Member Author

zbraniecki commented Jan 10, 2020

Hmm, tested again:

cd ~/projects/unic-datetime
git checkout f2169e6
cargo run --example test --features binary --release  # us: 24
git checkout master
mkdir data
cd data
git clone https://github.com/unicode-cldr/cldr-dates-modern
cd ../
cargo run --example dynamic --features binary --release  # us: 150
cargo run --example binary --features binary --release  # us: 73

Can you pull and retest pls?

@zbraniecki
Member Author

Also, this seems to work:

cd ~/projects/intl-measurements/unic/datetime
cargo update
cargo run --release  # us: 142, if json = true
cargo run --release  # us: 72, if json = false

@filmil

filmil commented Jan 10, 2020

@zbraniecki trying out the following:

cd ~/projects/unic-datetime
git co f2169e6
cargo run --example test --features binary --release // us: 24
git co master
cargo run --example dynamic --features binary --release // us: 150
cargo run --example binary --features binary --release // us: 73

I got this at revision cd08dff908bef647423b73e7c673a592f514d5cf:

╰─>$ env RUST_BACKTRACE=full cargo run --example dynamic --features binary --release // us: 150
(...snipped compilation output...)
    Finished release [optimized] target(s) in 0.04s
     Running `target/release/examples/dynamic // 'us:' 150`
thread 'main' panicked at 'Something went wrong reading the file: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/libcore/result.rs:1189:5
stack backtrace:
   0:     0x55871d822d84 - backtrace::backtrace::libunwind::trace::hb508478e95b202cd
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1:     0x55871d822d84 - backtrace::backtrace::trace_unsynchronized::h1009cac69ed78240
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2:     0x55871d822d84 - std::sys_common::backtrace::_print_fmt::h3186a1dc2744d8b2
                               at src/libstd/sys_common/backtrace.rs:77
   3:     0x55871d822d84 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h67340410f877e6f6
                               at src/libstd/sys_common/backtrace.rs:59
   4:     0x55871d83e47c - core::fmt::write::h1e0327a5a7e673ea
                               at src/libcore/fmt/mod.rs:1057
   5:     0x55871d821277 - std::io::Write::write_fmt::h7c4eb67ef88873d7
                               at src/libstd/io/mod.rs:1426
   6:     0x55871d824cee - std::sys_common::backtrace::_print::h9039c88779a1a378
                               at src/libstd/sys_common/backtrace.rs:62
   7:     0x55871d824cee - std::sys_common::backtrace::print::h2d1083e3add8ef0f
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x55871d824cee - std::panicking::default_hook::{{closure}}::h1682e6a010f77aa1
                               at src/libstd/panicking.rs:195
   9:     0x55871d8249e1 - std::panicking::default_hook::h8b15d779994485fc
                               at src/libstd/panicking.rs:215
  10:     0x55871d82531b - std::panicking::rust_panic_with_hook::hb48f567395094123
                               at src/libstd/panicking.rs:472
  11:     0x55871d824ece - rust_begin_unwind
                               at src/libstd/panicking.rs:376
  12:     0x55871d83cd8e - core::panicking::panic_fmt::he17d215317e06412
                               at src/libcore/panicking.rs:84
  13:     0x55871d83ce17 - core::result::unwrap_failed::h65afb4e92ee29273
                               at src/libcore/result.rs:1189
  14:     0x55871d8010cb - unic_datetime::data::load3::get_calendar_data::h58c64a7facb16336
  15:     0x55871d7fcf99 - dynamic::main::he70cd5503cb82609
  16:     0x55871d7fb463 - std::rt::lang_start::{{closure}}::hbe3af6975f980918
  17:     0x55871d824db3 - std::rt::lang_start_internal::{{closure}}::h1ddf7a461a97f796
                               at src/libstd/rt.rs:52
  18:     0x55871d824db3 - std::panicking::try::do_call::hf936d9156f4acc72
                               at src/libstd/panicking.rs:296
  19:     0x55871d826eba - __rust_maybe_catch_panic
                               at src/libpanic_unwind/lib.rs:79
  20:     0x55871d8257b0 - std::panicking::try::h2994ccbbb78e8bc2
                               at src/libstd/panicking.rs:272
  21:     0x55871d8257b0 - std::panic::catch_unwind::hec078393bb02d028
                               at src/libstd/panic.rs:394
  22:     0x55871d8257b0 - std::rt::lang_start_internal::h4909d0575d54876e
                               at src/libstd/rt.rs:51
  23:     0x55871d7fd462 - main
  24:     0x7f5f648f752b - __libc_start_main
  25:     0x55871d7f917a - _start
  26:                0x0 - <unknown>

@filmil

filmil commented Jan 10, 2020

This is getting a bit into the weeds. How about we continue at https://github.com/zbraniecki/unic-datetime/issues?

@zbraniecki
Member Author

zbraniecki commented Jan 10, 2020

Ah! I didn't vendor cldr-dates-modern into the original crate, sorry :(
The example expects a ./data/cldr-dates-modern directory to exist; see https://github.com/zbraniecki/unic-datetime/blob/master/examples/dynamic.rs#L35

If you want to test in the crate itself rather than in intl-measurements, you'll need to run:

mkdir data
cd data
git clone https://github.com/unicode-cldr/cldr-dates-modern
cd ../
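The panic in the backtrace above comes from unwrapping a failed file read. As a hedged sketch of an alternative, the data loader could report which path was missing instead of panicking. `read_cldr_file` is a hypothetical helper, not a function in unic-datetime, and the relative path `main/pl/ca-gregorian.json` is an assumption about the cldr-dates-modern layout.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: read a CLDR JSON file under `data_root`,
/// attaching the full path and a hint to any I/O error.
fn read_cldr_file(data_root: &Path, relative: &str) -> Result<String, String> {
    let path: PathBuf = data_root.join(relative);
    fs::read_to_string(&path).map_err(|e: io::Error| {
        format!(
            "cannot read {}: {} (did you clone cldr-dates-modern into ./data?)",
            path.display(),
            e
        )
    })
}

fn main() {
    // Both path components below are assumptions made for illustration.
    let root = Path::new("./data/cldr-dates-modern");
    match read_cldr_file(root, "main/pl/ca-gregorian.json") {
        Ok(json) => println!("loaded {} bytes", json.len()),
        Err(msg) => eprintln!("{}", msg),
    }
}
```

With a message like this, a missing data checkout shows up as a one-line hint rather than a full backtrace.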

@zbraniecki
Member Author

The intl-measurements setup is more self-contained.

@zbraniecki
Member Author

For the record, initial results from @filmil:

  1. ICU: ~3400 us
  2. UNIC with JSON: ~255 us (-92.5%)
  3. UNIC with bin res: ~150 us (-95.59%)
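The percentages in this list are relative changes against the ICU baseline. A small sketch of that arithmetic, with `percent_change` as a hypothetical helper name:

```rust
/// Relative change of `measured_us` against `baseline_us`, in percent.
/// A negative value means the measured run was faster than the baseline.
fn percent_change(baseline_us: f64, measured_us: f64) -> f64 {
    (measured_us - baseline_us) / baseline_us * 100.0
}

fn main() {
    let icu = 3400.0;
    let runs = [("UNIC with JSON", 255.0), ("UNIC with bin res", 150.0)];
    for (label, us) in runs.iter() {
        println!("{}: {:.2}%", label, percent_change(icu, *us));
    }
}
```

This prints -92.50% and -95.59%, matching the figures above.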

@sffc
Member

sffc commented Jun 29, 2020

Closing this issue in favor of unicode-org/icu4x#163

sffc closed this as completed Jun 29, 2020