Currently, both the pipeline and the webapp access reads at the record level. This is good for fine-grained access, but not ideal. We really should move to a bucketed/compressed/encrypted model, with (say) packets of 5k reads per bucket.
If we keep the bucket size relatively small, there won't be a huge penalty for accessing a single read. In fact, a performance improvement is likely, since disk usage, I/O, and index sizes would all shrink.
This issue affects both the pipeline and the webapp, as the pipeline writes the data and the webapp reads it. So both the Python and Java sides need to agree on the data storage, compression, and encryption formats. A rough sketch of the packet format is below. See: capsid/capsid-pipeline#8
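A minimal sketch of what the bucketing could look like on the Python side, assuming zlib for compression and AES-GCM (via the `cryptography` package) for encryption; neither algorithm is decided in this issue, they are just examples with standard Java counterparts (`java.util.zip.Deflater` and `javax.crypto` with `AES/GCM/NoPadding`). The names `pack_reads`, `unpack_packet`, and `BUCKET_SIZE` are illustrative, not part of any existing API.

```python
import json
import os
import zlib

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

BUCKET_SIZE = 5000  # reads per packet, per the proposal above


def pack_reads(reads, key):
    """Group reads into buckets of BUCKET_SIZE, then compress and encrypt each bucket.

    Returns a list of (nonce, ciphertext) pairs, one per bucket.
    """
    aesgcm = AESGCM(key)
    packets = []
    for start in range(0, len(reads), BUCKET_SIZE):
        bucket = reads[start:start + BUCKET_SIZE]
        plaintext = zlib.compress(json.dumps(bucket).encode("utf-8"))
        nonce = os.urandom(12)  # 96-bit nonce, standard for AES-GCM
        packets.append((nonce, aesgcm.encrypt(nonce, plaintext, None)))
    return packets


def unpack_packet(nonce, ciphertext, key):
    """Decrypt and decompress one packet back into its list of reads."""
    aesgcm = AESGCM(key)
    plaintext = aesgcm.decrypt(nonce, ciphertext, None)
    return json.loads(zlib.decompress(plaintext).decode("utf-8"))


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)
    reads = [{"id": i, "seq": "ACGT"} for i in range(12000)]
    packets = pack_reads(reads, key)
    print(len(packets), "packets")               # 3 packets for 12000 reads
    print(len(unpack_packet(*packets[0], key)))  # 5000 reads in the first packet
```

The point of keeping the bucket small is visible here: fetching one read only requires decrypting and decompressing the single packet that contains it, not the whole dataset.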