Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stat benchmark: report cleanup time and optimise #692

Merged
merged 2 commits into from
Feb 14, 2024

Conversation

talex5
Copy link
Collaborator

@talex5 talex5 commented Feb 14, 2024

I want to make some improvements here. But let's start by benchmarking the current state of things.

Initially, I get:

+Using linux backend
+Running Path.stat...
+Going to create 168420 files and directories
+Created in 2.37 s
+Statted in 1.32 s
+Removed in 5.55 s

+Using posix backend
+Running Path.stat...
+Going to create 168420 files and directories
+Created in 8.61 s
+Statted in 4.10 s
+Removed in 20.41 s

Now, instead of creating thousands of fibers and having them fight over a semaphore, limit the number of fibers created (this also makes the traces easier to view). Also, fill the files with zero bytes instead of asking the OS for secure random data, since that's slow and isn't useful for the test.

On my machine:

+Using linux backend                   
+Running Path.stat...
+Going to create 168420 files and directories
+Created in 1.62 s
+Statted in 1.04 s
+Removed in 8.00 s

+Using posix backend                   
+Running Path.stat...
+Going to create 168420 files and directories
+Created in 2.92 s
+Statted in 3.82 s
+Removed in 22.27 s

On CI:

Screenshot 2024-02-14 at 11-53-31 OCaml Benchmarks

Interestingly, removal was a bit slower in all three cases, even though that bit wasn't changed!

@talex5 talex5 marked this pull request as ready for review February 14, 2024 12:04
- Instead of creating thousands of fibers and having them fight over a
  semaphore, limit the number of fibers created. This also makes the
  traces easier to view.

- Remove the items in parallel.

- Fill the files with zero bytes instead of asking the OS for secure
  random data, since that's slow and isn't useful for the test.
@talex5
Copy link
Collaborator Author

talex5 commented Feb 14, 2024

After making removal parallel, things are much better (on Linux):

Screenshot 2024-02-14 at 12-56-37 OCaml Benchmarks

@talex5
Copy link
Collaborator Author

talex5 commented Feb 14, 2024

On one run with eio_posix, it deadlocked. Two systhreads were waiting in st_masterlock_acquire to get the master lock, even though it wasn't busy!

(gdb) bt
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, 
    futex_word=0x56397346e014 <thread_table+116>) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x56397346e014 <thread_table+116>, expected=expected@entry=0, 
    clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at ./nptl/futex-internal.c:87
#2  0x00007f2c88741e0b in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x56397346e014 <thread_table+116>, 
    expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0)
    at ./nptl/futex-internal.c:139
#3  0x00007f2c88744468 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x56397346dfb0 <thread_table+16>, 
    cond=0x56397346dfe8 <thread_table+72>) at ./nptl/pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=cond@entry=0x56397346dfe8 <thread_table+72>, mutex=mutex@entry=0x56397346dfb0 <thread_table+16>)
    at ./nptl/pthread_cond_wait.c:618
#5  0x00005639732f72e0 in st_masterlock_acquire (m=0x56397346dfa8 <thread_table+8>)
    at /home/user/.opam/5.1.1/.opam-switch/build/ocaml-base-compiler.5.1.1/otherlibs/systhreads/st_pthreads.h:159
#6  0x00005639732f7351 in thread_lock_acquire (dom_id=<optimized out>) at st_stubs.c:126
#7  caml_thread_leave_blocking_section () at st_stubs.c:254
#8  0x0000563973321d46 in caml_leave_blocking_section () at runtime/signals.c:171
#9  0x00005639732fa6fb in caml_unix_close (fd=<optimized out>) at close_unix.c:25

(gdb) fr 5

(gdb) p m.waiters
$3 = 2
(gdb) p m.busy
$4 = 0

@talex5 talex5 changed the title Stat benchmark: report cleanup time too Stat benchmark: report cleanup time and optimise Feb 14, 2024
@talex5 talex5 merged commit f5232a0 into ocaml-multicore:main Feb 14, 2024
5 checks passed
@talex5 talex5 deleted the stat-bench branch February 14, 2024 16:09
@talex5
Copy link
Collaborator Author

talex5 commented Feb 14, 2024

Will investigate the deadlock separately; it's clearly not the fault of this PR.

talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 22, 2024
CHANGES:

New features:

- eio_posix: use directory FDs instead of realpath (@talex5 ocaml-multicore/eio#694 ocaml-multicore/eio#696, reviewed by @SGrondin).
  Using realpath was an old hack from the libuv days and subject to races. It was also slow.

- Keep pool of systhreads for blocking operations (@SGrondin @talex5 ocaml-multicore/eio#681).
  This is much faster than creating a new thread for each operation.
  It mainly benefits the eio_posix backend, as that uses lots of systhreads.

- Make `Switch.on_release` thread-safe (@talex5 ocaml-multicore/eio#684, requested by @art-w and @clecat).
  This allows resource pools to be shared between domains easily.

- Add `Eio.Path.read_link` (@talex5 ocaml-multicore/eio#686).

- Add `Eio_unix.Fd.is_open` (@talex5 ocaml-multicore/eio#690).

- Include backtrace in systhread errors (@talex5 ocaml-multicore/eio#688, reviewed by @SGrondin).
  Also, add `Eio.Exn.empty_backtrace` as a convenience.

- eio.mock: add tracing support to mock backend (@talex5 ocaml-multicore/eio#687).

- Improve tracing (@talex5 ocaml-multicore/eio#675 ocaml-multicore/eio#683 ocaml-multicore/eio#676, reviewed by @SGrondin).
  Update tracing section of README and trace more things
  (`run_in_systhread`, `close`, `submit`, `traceln`, cancellation and domain spawning).

Documentation:

- Link to verification work in docs (@talex5 ocaml-multicore/eio#682).

- Add more trace diagrams to README (@talex5 ocaml-multicore/eio#698).

- Adjust COC contacts (@polytypic ocaml-multicore/eio#685, reviewed by @Sudha247).

Bug fixes:

- eio_linux: retry `openat2` on `EAGAIN` (@talex5 ocaml-multicore/eio#693, reviewed by @SGrondin).

- eio_posix and eio_windows: check for IO periodically (@talex5 ocaml-multicore/eio#674).

- Handle EPERM when trying to initialise uring (@talex5 ocaml-multicore/eio#691).
  This can happen when using a Docker container.

Build and tests:

- Benchmark `Eio_unix.run_in_systhread` (@talex5 ocaml-multicore/eio#678, reviewed by @SGrondin).

- Enable lintcstubs for `Eio_unix.Private` too (@talex5 ocaml-multicore/eio#689).

- Stat benchmark: report cleanup time and optimise (@talex5 ocaml-multicore/eio#692).

- Make benchmarks start faster (@talex5 ocaml-multicore/eio#673).

- Update build for new eio-trace CLI (@talex5 ocaml-multicore/eio#699).

- Expect opam-repo-ci tests to fail on macos (@talex5 ocaml-multicore/eio#672).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 22, 2024
CHANGES:

New features:

- eio_posix: use directory FDs instead of realpath (@talex5 ocaml-multicore/eio#694 ocaml-multicore/eio#696, reviewed by @SGrondin).
  Using realpath was an old hack from the libuv days and subject to races. It was also slow.

- Keep pool of systhreads for blocking operations (@SGrondin @talex5 ocaml-multicore/eio#681).
  This is much faster than creating a new thread for each operation.
  It mainly benefits the eio_posix backend, as that uses lots of systhreads.

- Make `Switch.on_release` thread-safe (@talex5 ocaml-multicore/eio#684, requested by @art-w and @clecat).
  This allows resource pools to be shared between domains easily.

- Add `Eio.Path.read_link` (@talex5 ocaml-multicore/eio#686).

- Add `Eio_unix.Fd.is_open` (@talex5 ocaml-multicore/eio#690).

- Include backtrace in systhread errors (@talex5 ocaml-multicore/eio#688, reviewed by @SGrondin).
  Also, add `Eio.Exn.empty_backtrace` as a convenience.

- eio.mock: add tracing support to mock backend (@talex5 ocaml-multicore/eio#687).

- Improve tracing (@talex5 ocaml-multicore/eio#675 ocaml-multicore/eio#683 ocaml-multicore/eio#676, reviewed by @SGrondin).
  Update tracing section of README and trace more things
  (`run_in_systhread`, `close`, `submit`, `traceln`, cancellation and domain spawning).

Documentation:

- Link to verification work in docs (@talex5 ocaml-multicore/eio#682).

- Add more trace diagrams to README (@talex5 ocaml-multicore/eio#698).

- Adjust COC contacts (@polytypic ocaml-multicore/eio#685, reviewed by @Sudha247).

Bug fixes:

- eio_linux: retry `openat2` on `EAGAIN` (@talex5 ocaml-multicore/eio#693, reviewed by @SGrondin).

- eio_posix and eio_windows: check for IO periodically (@talex5 ocaml-multicore/eio#674).

- Handle EPERM when trying to initialise uring (@talex5 ocaml-multicore/eio#691).
  This can happen when using a Docker container.

Build and tests:

- Benchmark `Eio_unix.run_in_systhread` (@talex5 ocaml-multicore/eio#678, reviewed by @SGrondin).

- Enable lintcstubs for `Eio_unix.Private` too (@talex5 ocaml-multicore/eio#689).

- Stat benchmark: report cleanup time and optimise (@talex5 ocaml-multicore/eio#692).

- Make benchmarks start faster (@talex5 ocaml-multicore/eio#673).

- Update build for new eio-trace CLI (@talex5 ocaml-multicore/eio#699).

- Expect opam-repo-ci tests to fail on macos (@talex5 ocaml-multicore/eio#672).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant