Currently, both the pipeline and the webapp access reads at the record level. This is good for fine-grained access, but not ideal. We really should move to a bucketed/compressed/encrypted model, with (say) packets of 5k reads per bucket.
If we keep the bucket size relatively small, there won't be a huge penalty for accessing a single read. In fact, a performance improvement is likely, since disk usage, I/O, and index sizes would all shrink.
This issue affects both the pipeline and the webapp, as the pipeline writes the data and the webapp reads it. So both the Python and Java sides need to agree on the data storage, compression, and encryption formats. A rough sketch of the packet format is below. See: capsid/capsid-pipeline#8
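A minimal sketch of what the bucketing could look like on the Python side, assuming zlib for compression and AES-GCM (via the `cryptography` package) for encryption; neither algorithm is decided in this issue, they are just examples with standard Java counterparts (`java.util.zip.Deflater` and `javax.crypto` with `AES/GCM/NoPadding`). The names `pack_reads`, `unpack_packet`, and `BUCKET_SIZE` are illustrative, not part of any existing API.

```python
import json
import os
import zlib

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

BUCKET_SIZE = 5000  # reads per packet, per the proposal above


def pack_reads(reads, key):
    """Group reads into buckets of BUCKET_SIZE, then compress and encrypt each bucket.

    Returns a list of (nonce, ciphertext) pairs, one per bucket.
    """
    aesgcm = AESGCM(key)
    packets = []
    for start in range(0, len(reads), BUCKET_SIZE):
        bucket = reads[start:start + BUCKET_SIZE]
        plaintext = zlib.compress(json.dumps(bucket).encode("utf-8"))
        nonce = os.urandom(12)  # 96-bit nonce, standard for AES-GCM
        packets.append((nonce, aesgcm.encrypt(nonce, plaintext, None)))
    return packets


def unpack_packet(nonce, ciphertext, key):
    """Decrypt and decompress one packet back into its list of reads."""
    aesgcm = AESGCM(key)
    plaintext = aesgcm.decrypt(nonce, ciphertext, None)
    return json.loads(zlib.decompress(plaintext).decode("utf-8"))


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)
    reads = [{"id": i, "seq": "ACGT"} for i in range(12000)]
    packets = pack_reads(reads, key)
    print(len(packets), "packets")               # 3 packets for 12000 reads
    print(len(unpack_packet(*packets[0], key)))  # 5000 reads in the first packet
```

The point of keeping the bucket small is visible here: fetching one read only requires decrypting and decompressing the single packet that contains it, not the whole dataset.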