feat(lapis2): stream data from SILO #745
Merged
resolves #744
PR Checklist
[ ] All necessary documentation has been adapted.
Summary
The streaming still happens in the same thread as the request handling (because it does not use StreamingResponseBody). We'll have to observe whether this causes any issues in the future. I couldn't get StreamingResponseBody to work properly with the way we implemented compression. We have to close the compression streams here, but then the lazy stream can't write anymore.
I profiled LAPIS with two datasets.
TLDR
Ebola
I executed this a couple of times with the data of this branch and observed the memory consumption:
Memory consumption on main:
Memory consumption on this branch:
The open Covid dataset on the testserver
I connected my local LAPIS to the SILO running on the server and downloaded metadata:
This is the memory consumption on the current main:
Also, the download was very slow. It took 3.5 minutes to download 426 MB (roughly 2 MB/s):
This is the same for the current branch:
The download isn't fast, but still almost twice as fast:
Note: The SILO version that this has been tested with does not stream any data lazily yet. It still loads the full result into memory. Only LAPIS handles the response stream from SILO lazily.
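The ordering constraint from the Summary (the compression stream may only be closed after the lazy stream has been fully consumed) can be illustrated with a minimal sketch. It is written in plain Java using only the JDK, and it is not the LAPIS implementation. The class name, the helpers `lazyLines` and `writeCompressed`, and the line-delimited upstream body are all assumptions made for illustration. The sketch writes in the calling thread, as this PR does.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; none of these names exist in LAPIS.
final class LazyCompressedStreaming {

    // Exposes an upstream body as a lazy stream: a line is read only when it is consumed.
    // Assumption: the upstream body is line-delimited text.
    static Stream<String> lazyLines(InputStream upstreamBody) {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(upstreamBody, StandardCharsets.UTF_8));
        return reader.lines().onClose(() -> {
            try {
                reader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Consumes the lazy stream in the calling thread and writes it gzip-compressed.
    // The compression stream is closed only after forEach has drained the lazy stream;
    // closing it earlier would leave the lazy stream with nothing to write to.
    static void writeCompressed(Stream<String> lines, OutputStream responseBody)
            throws IOException {
        try (Stream<String> source = lines;
             Writer writer = new OutputStreamWriter(
                 new GZIPOutputStream(responseBody), StandardCharsets.UTF_8)) {
            source.forEach(line -> {
                try {
                    writer.write(line);
                    writer.write('\n');
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } // closing the writer finishes the gzip trailer, then the source is closed
    }

    public static void main(String[] args) throws IOException {
        InputStream upstream = new ByteArrayInputStream(
            "{\"a\":1}\n{\"a\":2}\n".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        writeCompressed(lazyLines(upstream), response);

        // Decompress again to check that the round trip preserved the lines.
        try (GZIPInputStream unzipped =
                 new GZIPInputStream(new ByteArrayInputStream(response.toByteArray()))) {
            System.out.print(new String(unzipped.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Because only one line is held at a time, memory use in this sketch does not grow with the size of the upstream body. Handing the lazy stream to a callback that runs later would require keeping the compression stream open until that callback has finished.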