[python] for an Experiment, soma.open is 3X slower than soma.Experiment.open #2726

Open
bkmartinjr opened this issue Jun 12, 2024 · 3 comments

Comments

@bkmartinjr
Member

On a TileDB Cloud URI that points to a SOMA Experiment, soma.open is substantially slower than soma.Experiment.open.

Ideally these would have similar performance, so that the generic opener could be used for convenience without a latency penalty.

Using TileDB-Py directly (e.g., tiledb.Group()) is faster still, and ideally would be the benchmark time.

Example:

In [40]: soma.__version__
Out[40]: '1.9.5'

In [41]: tiledb.__version__
Out[41]: '0.27.1'

In [52]: %time print(list(soma.open("tiledb://TileDB-Inc/8917a1ab-dd51-44b4-999c-cce5321adcdf", context=ctx)))
['ms', 'obs']
CPU times: user 89 ms, sys: 23.9 ms, total: 113 ms
Wall time: 3.4 s

In [53]: %time print(list(soma.Experiment.open("tiledb://TileDB-Inc/8917a1ab-dd51-44b4-999c-cce5321adcdf", context=ctx)))
['ms', 'obs']
CPU times: user 41.5 ms, sys: 9.12 ms, total: 50.6 ms
Wall time: 1.12 s

In [54]: %time print(list(tiledb.Group("tiledb://TileDB-Inc/8917a1ab-dd51-44b4-999c-cce5321adcdf", ctx=tiledb.cloud.Ctx())))
[Obj<GROUP "tiledb://TileDB-Inc/9b79dd7e-5f10-46fe-a229-00a40be387f1" - "ms">, Obj<ARRAY "tiledb://TileDB-Inc/ccc2103f-2950-4dbf-9fc8-568fb59e01dc" - "obs">]
CPU times: user 39.8 ms, sys: 7.98 ms, total: 47.8 ms
Wall time: 871 ms

In [59]: soma.show_package_versions()
tiledbsoma.__version__        1.9.5
TileDB-Py tiledb.version()    (0, 27, 1)
TileDB core version           2.21.1
libtiledbsoma version()       libtiledb=2.21.1
python version                3.10.12.final.0
OS version                    Linux 6.8.0-76060800daily20240311-generic
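
For anyone who wants to repeat the comparison with more than one sample per opener, here is a minimal timing sketch. It is not part of the original report: `time_wall`, `compare`, and the two `fake_*` functions are hypothetical names, and the `time.sleep` stand-ins only simulate latency so the block runs without TileDB Cloud access.

```python
import time
from statistics import median


def time_wall(fn, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls of fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def compare(openers, repeats=5):
    """Time each named callable; return {name: (seconds, ratio_to_fastest)}."""
    timings = {name: time_wall(fn, repeats) for name, fn in openers.items()}
    fastest = min(timings.values())
    return {name: (t, t / fastest) for name, t in timings.items()}


# Stand-ins that only sleep (assumption: they simulate latency, nothing more).
def fake_generic_open():
    time.sleep(0.03)


def fake_typed_open():
    time.sleep(0.01)


results = compare(
    {"generic": fake_generic_open, "typed": fake_typed_open},
    repeats=3,
)
for name, (seconds, ratio) in results.items():
    print(f"{name}: {seconds:.3f} s ({ratio:.1f}x the fastest)")
```

To time the real calls, the stand-ins could be swapped for lambdas wrapping `list(soma.open(uri, context=ctx))` and `list(soma.Experiment.open(uri, context=ctx))` as used in the session above. The median is used here because single remote requests tend to be noisy.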
@johnkerl johnkerl self-assigned this Jun 12, 2024
@johnkerl johnkerl changed the title [python] for an Experiment, soma.open is 3X slower than soma.Experiment.open [python] for an Experiment, soma.open is 3X slower than soma.Experiment.open Jun 13, 2024
@johnkerl
Member

On investigation:

  • The polymorphic opener asks for object->type() (array or group) which incurs some unnecessary expense
  • There are some potential optimizations we can do here in collaboration with the core and cloud teams
  • More info to come
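
To illustrate the first point, here is a toy model of why a polymorphic opener can cost more than a typed one. It is a sketch under stated assumptions, not tiledbsoma's implementation: `FakeRemoteStore`, `open_typed`, `open_polymorphic`, and the example URI are all hypothetical, and the model simply charges one round trip per remote call. The actual number of requests made by soma.open is not stated in this thread.

```python
class FakeRemoteStore:
    """Toy remote store: every method call counts as one network round trip."""

    def __init__(self, objects):
        self._objects = objects  # maps uri -> (kind, member names)
        self.round_trips = 0

    def object_type(self, uri):
        """Type probe, analogous in spirit to asking object->type()."""
        self.round_trips += 1
        return self._objects[uri][0]

    def open_group(self, uri):
        """Open a group and return its member names."""
        self.round_trips += 1
        kind, members = self._objects[uri]
        if kind != "group":
            raise TypeError(f"{uri} is a {kind}, not a group")
        return list(members)


def open_typed(store, uri):
    """Caller already knows the object is a group: one request."""
    return store.open_group(uri)


def open_polymorphic(store, uri):
    """Caller does not know the type: probe first, then open."""
    kind = store.object_type(uri)
    if kind == "group":
        return store.open_group(uri)
    raise NotImplementedError(f"unsupported kind in this sketch: {kind}")


URI = "tiledb://example/experiment"  # hypothetical URI, for illustration only
catalog = {URI: ("group", ["ms", "obs"])}

typed_store = FakeRemoteStore(catalog)
typed_members = open_typed(typed_store, URI)

poly_store = FakeRemoteStore(catalog)
poly_members = open_polymorphic(poly_store, URI)

print("typed:", typed_members, typed_store.round_trips, "round trip(s)")
print("polymorphic:", poly_members, poly_store.round_trips, "round trip(s)")
```

In this model both openers return the same members, but the polymorphic path pays for the extra type probe. Against a remote service, where each request carries fixed latency, that extra request would show up directly in wall time.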

@bkmartinjr
Member Author

FYI: @ypatia - latency-related (this case is not proxied, but demonstrates another access pattern that is latency-sensitive)

@johnkerl
Member

johnkerl commented Jul 1, 2024

@ypatia @bkmartinjr I have a write-up not written down yet -- fully analyzed in my head & in scratch notes -- will follow up post-retreat ...
