Benchmarking #57
I think benchmarking compression algorithms could be interesting. However, there are already several benchmarks on this published on the web; the only additional information we would get is whether some algorithms perform better on images, and whether fluorescence vs. brightfield images yield different performance. I have mixed feelings about benchmarking chunk size: I think it is more dependent on the processing step and dataset, and a good benchmark would be very difficult to integrate into the CI, because it would require different storage backends and applications. I'm happy with asv.
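A minimal sketch (not from the thread) of what such a codec comparison could look like, using numcodecs; the codec names, compression level, and synthetic data are all illustrative assumptions:

```python
# Compare a few Blosc codecs on a synthetic image-like array.
# Codec choices, level, and array size are illustrative, not prescriptive.
import time

import numpy as np
from numcodecs import Blosc

# Poisson noise loosely mimics photon-counting camera data.
image = np.random.default_rng(0).poisson(50, size=(16, 1024, 1024)).astype(np.uint16)

for cname in ("zstd", "lz4", "zlib"):
    codec = Blosc(cname=cname, clevel=3, shuffle=Blosc.BITSHUFFLE)
    start = time.perf_counter()
    compressed = codec.encode(image)
    elapsed = time.perf_counter() - start
    print(f"{cname}: ratio={image.nbytes / len(compressed):.2f}, encode={elapsed:.3f}s")
```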
I haven't been exposed to frameworks other than […]. Some manual benchmarking was done back in the days of […]. It would be interesting to see how the sparsity and patterns of BF, fluorescence, and mixed images affect the results; we could potentially recommend different compression schemes for different types of datasets.

By @camFoltz: [attached benchmark results for compression level 1 and compression level 9]
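A hedged sketch of how the sparsity question above could be probed, comparing a sparse fluorescence-like volume against a dense brightfield-like one at compression levels 1 and 9. The codec choice and synthetic distributions are assumptions; @camFoltz's original results were attached as images:

```python
# How sparsity might affect compression ratio at clevel 1 vs. 9.
# All data here is synthetic; zstd is an assumed codec choice.
import numpy as np
from numcodecs import Blosc

rng = np.random.default_rng(0)
shape = (16, 1024, 1024)
# Fluorescence-like: mostly dark background with sparse low counts.
fluor = rng.poisson(2, size=shape).astype(np.uint16)
# Brightfield-like: bright, noisy, dense signal.
bf = rng.normal(30000, 1000, size=shape).astype(np.uint16)

for name, img in (("fluorescence-like", fluor), ("brightfield-like", bf)):
    for clevel in (1, 9):
        codec = Blosc(cname="zstd", clevel=clevel, shuffle=Blosc.BITSHUFFLE)
        ratio = img.nbytes / len(codec.encode(img))
        print(f"{name} clevel={clevel}: ratio={ratio:.2f}")
```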
Are you thinking of benchmarks that should run during CI to catch performance gains or drops? If so, I'd suggest timing the write and read operations on a 1 GB random array so we can evaluate how different dependencies (zarr-python vs. tensorstore) affect performance with a single process or multiple processes. Separately, we do need to know I/O performance as a function of chunk size and of compression on our HPC infrastructure, specifically when using ESS or scratch space from the compute nodes; this is needed to make sound choices about how to run different pipelines. These benchmarks need not (and should not) run on the CI servers. It would also be useful to evaluate speed and compression ratios for different modalities of data; that also need not run during CI.
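A sketch of the suggested CI-style timing benchmark using zarr-python; the store path, chunk shape, and dtype are placeholder choices, and a tensorstore variant would follow the same pattern:

```python
# Time writing and reading a ~1 GiB random array with zarr-python.
# Path, chunking, and dtype are placeholders, not recommendations.
import time

import numpy as np
import zarr

data = np.random.default_rng(0).random((128, 1024, 1024))  # float64, ~1 GiB

start = time.perf_counter()
z = zarr.open(
    "write_bench.zarr",
    mode="w",
    shape=data.shape,
    chunks=(1, 1024, 1024),
    dtype=data.dtype,
)
z[:] = data
write_s = time.perf_counter() - start

start = time.perf_counter()
readback = zarr.open("write_bench.zarr", mode="r")[:]
read_s = time.perf_counter() - start

gb = data.nbytes / 1e9
print(f"write: {gb / write_s:.2f} GB/s, read: {gb / read_s:.2f} GB/s")
```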
Got it, @mattersoflight. I was thinking about benchmarks that run in CI.
We need to set up benchmarking infrastructure that we can run in different contexts, to help us make more educated decisions about performance. I'm inclined to use asv: https://github.com/airspeed-velocity/asv, but first I wanted to hear more opinions. I'd like to hear your thoughts on both the choice of benchmarking framework and which aspects of iohub you wish to see benchmarked.
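For reference, asv benchmarks are plain classes whose `time_*` methods are discovered by name and timed repeatedly, with `setup()` run beforehand. A minimal sketch of what an iohub benchmark file could look like; the file layout, class name, and timed operation are assumptions:

```python
# benchmarks/bench_io.py -- minimal asv benchmark sketch.
# asv discovers and times methods prefixed with time_; setup() runs first.
import numpy as np
import zarr


class WriteSuite:
    def setup(self):
        # Small synthetic volume; a real suite would parametrize
        # shape, chunking, and compressor.
        self.data = np.random.default_rng(0).random((8, 512, 512))

    def time_zarr_write(self):
        z = zarr.open(
            "asv_tmp.zarr",
            mode="w",
            shape=self.data.shape,
            chunks=(1, 512, 512),
            dtype=self.data.dtype,
        )
        z[:] = self.data
```

With a `benchmarks/` directory and an `asv.conf.json` in place, `asv run` tracks these timings across commits and `asv compare` flags regressions.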