Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decide whether wasm binsize benchmark should include data #4067

Open
sffc opened this issue Sep 21, 2023 · 9 comments
Open

Decide whether wasm binsize benchmark should include data #4067

sffc opened this issue Sep 21, 2023 · 9 comments
Assignees
Labels
C-test-infra Component: Integration test infrastructure S-small Size: One afternoon (small bug fix or enhancement)

Comments

@sffc
Copy link
Member

sffc commented Sep 21, 2023

The binsize benchmark shows a regression in the datetime code size between these two dates. It was offline during that time, so it's not entirely clear what the culprit commits were. I imagine that a lot of this is due to calendrical calculations, but we should know for sure so that we can better understand and plan out mitigations.

First step would be to bisect the time between these commits (cfe1f71 and ddb1a7b) and find the commits that most contributed to the spike.

Also, these tests use TypedZonedDateTimeFormatter::<Gregorian>, so in theory they shouldn't be impacted by the calendarical calculation code.

@sffc sffc added good first issue Good for newcomers C-test-infra Component: Integration test infrastructure S-small Size: One afternoon (small bug fix or enhancement) labels Sep 21, 2023
@sffc sffc mentioned this issue Sep 21, 2023
18 tasks
@sffc
Copy link
Member Author

sffc commented Oct 5, 2023

Note: the regressions are in work_log+opt.wasm and tui+opt.wasm

@sffc
Copy link
Member Author

sffc commented Oct 5, 2023

First step is to bisect.

Once bisect is finished, we should add this back to the triage list.

@sffc
Copy link
Member Author

sffc commented Oct 10, 2023

Command line: rm -r wasmpkg; cargo make wasm-opt; ll wasmpkg/wasm-opt/work_log+opt.wasm; ll wasmpkg/wasm-opt/tui+opt.wasm

Some commits failed to build. Some commits around 2023-02 failed with:

error: failed to run custom build command for `sys-info v0.9.1`

Caused by:
  process didn't exit successfully: `/.../icu4x/target/release-opt-size/build/sys-info-62b5cf4053fe346f/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'unsupported system: unknown', /.../.cargo/registry/src/github.com-1ecc6299db9ec823/sys-info-0.9.1/build.rs:38:14

Other commits from 2023-04 to 2023-08, starting with ca28250 and fixed in c8c9126, failed with:

   Compiling criterion v0.4.0
error: Rayon cannot be used when targeting wasi32. Try disabling default features.
  --> /.../.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.4.0/src/lib.rs:31:1
   |
31 | compile_error!("Rayon cannot be used when targeting wasi32. Try disabling default features.");
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Bisection results from the commits that built out of the box in reverse chronological order:

3810109

-rw-r--r-- 1 sffc primarygroup 378730 Oct  9 17:24 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3183827 Oct  9 17:24 wasmpkg/wasm-opt/tui+opt.wasm

13b6b4e

-rw-r--r-- 1 sffc primarygroup 358905 Oct  9 17:40 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3151534 Oct  9 17:40 wasmpkg/wasm-opt/tui+opt.wasm

ea1ba9f

-rw-r--r-- 1 sffc primarygroup 356957 Oct  9 17:57 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3154623 Oct  9 17:57 wasmpkg/wasm-opt/tui+opt.wasm

c8c9126

-rw-r--r-- 1 sffc primarygroup 366215 Oct  9 18:03 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3165598 Oct  9 18:03 wasmpkg/wasm-opt/tui+opt.wasm

(long period of silence due to bad lockfile)

eb79e5f

-rw-r--r-- 1 sffc primarygroup 119364 Oct  9 17:53 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 513517 Oct  9 17:53 wasmpkg/wasm-opt/tui+opt.wasm

f6bfe8b

-rw-r--r-- 1 sffc primarygroup 119418 Oct  9 17:47 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 520134 Oct  9 17:47 wasmpkg/wasm-opt/tui+opt.wasm

cfe1f71

-rw-r--r-- 1 sffc primarygroup 118988 Oct  9 17:28 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 519593 Oct  9 17:28 wasmpkg/wasm-opt/tui+opt.wasm

So the regression was during the summer, sometime between 2023-04 and 2023-08.

@sffc
Copy link
Member Author

sffc commented Oct 10, 2023

I will try using a proxy metric, the linux example bin size with release-opt-size:

cargo +nightly-2022-12-26 build --profile release-opt-size --examples --workspace --features serde --exclude icu_datagen --exclude icu_capi; ll target/release-opt-size/examples/work_log; ll target/release-opt-size/examples/tui

c8c9126

-rwxr-xr-x 2 sffc primarygroup 3521488 Oct  9 18:07 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6749624 Oct  9 18:07 target/release-opt-size/examples/tui*

4ed5dbd

-rwxr-xr-x 2 sffc primarygroup 3529376 Oct  9 18:13 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6754240 Oct  9 18:13 target/release-opt-size/examples/tui*

(in this period, there is an error while building examples; do not know how that could have landed, perhaps a feature resolution issue)

7f10b15

-rwxr-xr-x 2 sffc primarygroup 1960928 Oct  9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2409400 Oct  9 18:20 target/release-opt-size/examples/tui*

004581f

-rwxr-xr-x 2 sffc primarygroup 1949864 Oct  9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2396512 Oct  9 18:20 target/release-opt-size/examples/tui*

09ad8b7

-rwxr-xr-x 2 sffc primarygroup 1945776 Oct  9 18:18 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct  9 18:18 target/release-opt-size/examples/tui*

244a499

-rwxr-xr-x 2 sffc primarygroup 1945776 Oct  9 18:14 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct  9 18:14 target/release-opt-size/examples/tui*

d8cc2f3

-rwxr-xr-x 2 sffc primarygroup 1957376 Oct  9 18:12 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2387232 Oct  9 18:12 target/release-opt-size/examples/tui*

eb79e5f

-rwxr-xr-x 2 sffc primarygroup 1963344 Oct  9 18:10 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2398008 Oct  9 18:10 target/release-opt-size/examples/tui*

@sffc
Copy link
Member Author

sffc commented Oct 10, 2023

I did some manual work to get the intermediate broken commits to build. Doing so, I found that for work_log, there are two commits that together account for almost the entire regression:

  1. 7e2733d Baked data for icu_datetime (Baked data for icu_datetime #3591)
  2. 7b8ad79 Add recommended locale set and expand regions in datagen (Add recommended locale set options and expand regions in datagen #3586)

In other words, the examples now use compiled data, whereas before they used buffer provider data from a buffer that was not included as part of the measurement. Mystery solved.

Now, the follow-up question is, do we want to continue measuring the wasm binsize with data included, or do we want to measure it without data? If we measure it with data, we're measuring something that "actually works", whereas if we measure it without data, we can detect regressions in code size at a much more granular level.

@sffc sffc removed their assignment Oct 10, 2023
@sffc sffc added discuss Discuss at a future ICU4X-SC meeting and removed good first issue Good for newcomers labels Oct 10, 2023
@sffc sffc changed the title Investigate spike in datetime binsize between 2022-12-29 and 2023-08-21 Decide whether wasm binsize should include data Oct 10, 2023
@sffc sffc changed the title Decide whether wasm binsize should include data Decide whether wasm binsize benchmark should include data Oct 10, 2023
@sffc sffc removed this from the 1.4 Blocking ⟨P1⟩ milestone Oct 10, 2023
@robertbastian
Copy link
Member

Ah of course.

If we call these code size benchmarks we should use empty data. We could have a data size benchmark v2 that is the difference between all data and no data.

@robertbastian
Copy link
Member

robertbastian commented Oct 10, 2023

Actually for singleton keys there's no such thing as "empty data", and removing all data will also remove fallback logic (and data). I think including just und could be a good middle ground.

@robertbastian
Copy link
Member

Without the region-select option it's tricky to include exactly one data struct.

@sffc sffc self-assigned this Oct 19, 2023
@sffc
Copy link
Member Author

sffc commented Oct 19, 2023

@sffc to investigate and implement fix if necessary.

@sffc sffc added this to the 1.4 Blocking ⟨P1⟩ milestone Oct 19, 2023
@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-infra Component: Integration test infrastructure S-small Size: One afternoon (small bug fix or enhancement)
Projects
None yet
Development

No branches or pull requests

3 participants