add lock around all API calls #1021
Conversation
Would it make sense to provide this in two layers?
Basically, we run the generator twice and produce two separate modules. For the higher-level API we may want to hold the lock across a series of low-level API calls rather than locking and unlocking for each call.
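A sketch of that layering, for illustration only (RawAPI, raw_open, raw_read, and read_dataset are hypothetical names, not generated code): the raw layer leaves locking to the caller, and the higher-level function takes the lock once around a whole series of raw calls.

```julia
const liblock = ReentrantLock()

# Hypothetical unlocked low-level layer (stand-ins for generated ccall wrappers).
module RawAPI
    raw_open(name) = hash(name)   # placeholder for an unlocked ccall wrapper
    raw_read(handle) = 0          # placeholder for an unlocked ccall wrapper
end

# Higher-level wrapper: hold the lock across the whole sequence of raw calls
# instead of locking and unlocking inside each one.
function read_dataset(name)
    lock(liblock) do
        h = RawAPI.raw_open(name)
        RawAPI.raw_read(h)
    end
end
```

Taking the lock once per high-level call amortizes the overhead that per-call locking would otherwise pay repeatedly.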
We could even use a Preferences.jl mechanism to control whether the locks are enabled everywhere by default.
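A rough sketch of that Preferences.jl idea (the "use_lock" preference key is invented here, and this code would live inside the package module, e.g. HDF5.API):

```julia
using Preferences

# Read the hypothetical "use_lock" preference once at precompile time,
# so use_lock() is a compile-time constant and the guard can fold away.
const _USE_LOCK = @load_preference("use_lock", true)
use_lock() = _USE_LOCK
```

A user could then opt out from their own project with set_preferences!(HDF5, "use_lock" => false), which takes effect after recompilation.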
That's what I was originally thinking, but it seems like the overhead is small enough that it might not be worth it?
I've tried some more simple test cases (e.g. reading/writing single values in a loop) and I don't see any observable overhead from the lock.
How about we create a toggle to make this easier to test?
Another option would be to copy api.jl and make a rawapi.jl. Then api.jl has use_lock() = true and rawapi.jl has use_lock() = false.
jlfuncbody = Expr(
    :block, __source__, :(lock(liblock)), :($statsym = try
        $ccallexpr
    finally
        unlock(liblock)
    end)
)
Suggested change:
jlfuncbody = Expr(
    :block, __source__, :(use_lock() && lock(liblock)), :($statsym = try
        $ccallexpr
    finally
        use_lock() && unlock(liblock)
    end)
)
This adds two runtime lookups at each call.
If use_lock is constant (e.g. use_lock() = false), it gets inlined, and the generated code does not actually call use_lock at runtime.
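A minimal sketch of that constant-folding behavior, with a placeholder body standing in for the real ccall:

```julia
const liblock = ReentrantLock()
use_lock() = false   # a compile-time constant

function wrapped_call()
    use_lock() && lock(liblock)    # dead branch once use_lock() is inlined
    try
        return 42                  # placeholder for the actual ccall
    finally
        use_lock() && unlock(liblock)
    end
end

# Inspecting @code_llvm wrapped_call() should show no remaining calls to
# use_lock, lock, or unlock.
```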
Ah, I saw that's what you did in #1022. That's an option, but I would argue that using the lock should be the default.
@@ -13,6 +13,8 @@ else
    )
end

const liblock = ReentrantLock()
Suggested change:
const liblock = ReentrantLock()
const _USE_LOCK = Ref(true)
use_lock() = _USE_LOCK[]
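With that Ref-based toggle, a user could disable locking at runtime for an experiment (the HDF5.API module path is an assumption here):

```julia
HDF5.API._USE_LOCK[] = false   # skip locking for a single-threaded benchmark
# ... run the timing experiment ...
HDF5.API._USE_LOCK[] = true    # restore the thread-safe default
```

Unlike a constant use_lock(), reading a Ref does leave a small runtime load in each call, which is the lookup cost raised above.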
@simonbyrne What's your reading of JuliaLang/julia#35689? It sounds to me like there may be an issue if finalizers acquire locks at the moment, with that issue still open.
I think that is addressed by JuliaLang/julia#38487, which prevents finalizers from running on the current thread until the lock is released.
That's strange that a pull request with a lower number resolves an issue with a higher number.
@mkitti I haven't fully investigated the Preferences.jl solution, but this seems a lot simpler, and since there's no tradeoff with these calls, I don't see a strong reason to make it configurable. Are there any use cases for users to want to turn it off, other than devs wanting to experiment?
The claim that "there is no tradeoff" still needs to be clearly established. Anecdotal testing does suggest that the effect is small, but we have no way of comprehensively establishing this for all applications, so we need to give users a way to run the experiment on their specific workloads. The result "there is no tradeoff" does seem surprising: a single HDF5.jl function usually makes several C API calls, so locking and unlocking several times within one high-level call seems inefficient. Is there really no cost to this at all? If there were no cost, why not enable thread safety within the C API by default?
With lock:
Without lock:
That suggests the tests with the lock in place could take about 15% longer than without it, but I think that's at the high end. Over a number of tests it looks like roughly 5%, with a range of 2% to 16%.
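For anyone who wants to gauge the per-call cost on their own machine (this snippet is illustrative and is not where the numbers above came from), timing an uncontended lock/unlock round trip shows the extra work each wrapped call does in the single-threaded case:

```julia
using BenchmarkTools

const liblock = ReentrantLock()

# One uncontended lock/unlock round trip per wrapped API call:
@btime (lock($liblock); unlock($liblock))
```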
Should we bump to 0.17 if we merge this? It does not look directly breaking to me, but it is a significant enough change that it could be.
I think we can probably keep things at 0.16; this PR should be a strict improvement (aside from the minor speed regression).
As a potential fix for #835 and #990, this adds a simple lock around all API calls. I was initially worried this would cause a performance bottleneck, but some simple examples didn't show any real difference.
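For context, a hand-written version of the pattern the generated wrappers follow might look roughly like the sketch below; h5_do_something and the :H5do_something symbol are hypothetical names, not part of the actual API.

```julia
# Sketch of a locked ccall wrapper, assuming a package-global ReentrantLock
# named liblock. Calling it requires libhdf5 to be loadable.
const liblock = ReentrantLock()

function h5_do_something(x::Cint)
    lock(liblock)
    status = try
        ccall((:H5do_something, "libhdf5"), Cint, (Cint,), x)
    finally
        unlock(liblock)
    end
    status < 0 && error("h5_do_something failed")
    return status
end
```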