
Fault Tolerance vs Performance

Bill Katz edited this page Jul 8, 2016 · 4 revisions

DVID is being developed as a tool for scientific research at Janelia. Our workflows can be tailored to achieve good performance while avoiding issues that DVID sidesteps in order to keep its code simpler or faster. This page documents tradeoffs made during the development of DVID, both to flag items for review when we have more manpower and to be explicit about edge cases where DVID may behave unexpectedly.

In the near future, we plan on implementing a mutation log where each mutation gets a unique mutation ID. Denormalizations will be driven off this mutation log using the internal pub/sub system; each data instance can subscribe to changes in other data instances, allowing an eventually consistent view of data.
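As a rough illustration of the planned design, the sketch below shows a log that assigns each mutation a unique ID and notifies subscribers of the instance that changed. It is not DVID code: the `Mutation`, `MutationLog`, `Subscribe`, and `Append` names are hypothetical, the log lives in memory only, and delivery is synchronous for brevity, whereas an eventually consistent system would likely deliver asynchronously.

```go
package main

import (
	"fmt"
	"sync"
)

// Mutation is a hypothetical log entry; the fields are illustrative, not DVID's.
type Mutation struct {
	ID       uint64 // unique, increasing mutation ID
	Instance string // name of the data instance that was mutated
	Op       string // free-form description of the change
}

// MutationLog assigns IDs, records entries in order, and notifies subscribers.
type MutationLog struct {
	mu      sync.Mutex
	nextID  uint64
	entries []Mutation
	subs    map[string][]func(Mutation) // keyed by source instance name
}

func NewMutationLog() *MutationLog {
	return &MutationLog{nextID: 1, subs: make(map[string][]func(Mutation))}
}

// Subscribe registers fn to be called for every mutation of the source instance.
func (l *MutationLog) Subscribe(source string, fn func(Mutation)) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.subs[source] = append(l.subs[source], fn)
}

// Append records the mutation first, then delivers it to subscribers.
// Delivery happens outside the lock and synchronously, for simplicity.
func (l *MutationLog) Append(instance, op string) uint64 {
	l.mu.Lock()
	m := Mutation{ID: l.nextID, Instance: instance, Op: op}
	l.nextID++
	l.entries = append(l.entries, m)
	subs := append([]func(Mutation){}, l.subs[instance]...)
	l.mu.Unlock()

	for _, fn := range subs {
		fn(m)
	}
	return m.ID
}

func main() {
	mlog := NewMutationLog()
	var seen []uint64
	// A labelvol-like instance subscribing to changes in "labelblk".
	mlog.Subscribe("labelblk", func(m Mutation) { seen = append(seen, m.ID) })

	mlog.Append("labelblk", "put block")
	mlog.Append("imageblk", "put block") // no subscriber for this instance
	mlog.Append("labelblk", "put block")
	fmt.Println(seen) // prints [1 3]
}
```

Because every entry carries its own ID, a subscriber that remembers the last ID it processed could in principle resume from that point after a restart.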

Properties of data instances

Data instances may have mutable properties such as "spatial extents" of a labelblk or "maximum label" of a labelvol. When putting new data, it can be quite expensive to save these properties on every mutation of the underlying data instance, or to use transactions to guarantee that property updates are committed atomically with the data changes.

  • imageblk and labelblk Extents are not updated transactionally with new puts.

Possible solutions:

  • Use same approach as MaxLabel in labelvol, i.e. thread-safe cache and persistence batched with associated mutations.
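A minimal sketch of that idea applied to extents, assuming a mutex-guarded in-memory value with a dirty flag that is written out once per batch of mutations. All names here (`ExtentsCache`, `PropertyStore`, `countingStore`) are hypothetical and do not reflect how DVID actually implements MaxLabel.

```go
package main

import (
	"fmt"
	"sync"
)

// Point and Extents are illustrative types, not DVID's own definitions.
type Point [3]int32

type Extents struct {
	Min, Max Point
}

// PropertyStore is a hypothetical stand-in for the persistence layer.
type PropertyStore interface {
	SaveExtents(e Extents) error
}

// ExtentsCache holds extents in memory and remembers whether they changed.
type ExtentsCache struct {
	mu    sync.Mutex
	ext   Extents
	set   bool // false until the first point is seen
	dirty bool // true if ext changed since the last successful Flush
}

// Extend grows the cached extents to include p. It does no I/O.
func (c *ExtentsCache) Extend(p Point) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.set {
		c.ext = Extents{Min: p, Max: p}
		c.set, c.dirty = true, true
		return
	}
	for i := range p {
		if p[i] < c.ext.Min[i] {
			c.ext.Min[i] = p[i]
			c.dirty = true
		}
		if p[i] > c.ext.Max[i] {
			c.ext.Max[i] = p[i]
			c.dirty = true
		}
	}
}

// Flush persists the extents only if they changed; call it once per batch.
func (c *ExtentsCache) Flush(s PropertyStore) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.dirty {
		return nil
	}
	if err := s.SaveExtents(c.ext); err != nil {
		return err // still dirty, so a later Flush retries
	}
	c.dirty = false
	return nil
}

// countingStore records how many writes reach the store.
type countingStore struct {
	writes int
	last   Extents
}

func (s *countingStore) SaveExtents(e Extents) error {
	s.writes++
	s.last = e
	return nil
}

func main() {
	cache, store := &ExtentsCache{}, &countingStore{}
	for _, p := range []Point{{0, 0, 0}, {63, 31, 15}, {-32, 8, 4}} {
		cache.Extend(p) // three puts in one batch
	}
	_ = cache.Flush(store) // one write for the whole batch
	_ = cache.Flush(store) // nothing changed, so no write
	fmt.Println(store.writes, store.last.Min, store.last.Max)
	// prints 1 [-32 0 0] [63 31 15]
}
```

The tradeoff is the one this page describes: a crash between `Extend` and `Flush` loses the cached update, so the persisted extents can lag the data unless the flush is tied to the same batch as the mutations.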

Synchronization of different data views

Data instances can be synchronized and provide different views of the same data. For example, a labelblk, labelvol, and labelsz instance could be synchronized so that changes to label voxels will set off changes in label sparse volumes and size indices. These synchronizations are currently not transactional, so a server failure while writing synchronized data, before the associated data is updated, can leave these views in an inconsistent state.

Possible solutions:

  • Redo changes that were made before server failure. (Current approach.)
  • Keep a log that allows consistency to be reestablished on server restart.
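One way the log-based option could work is sketched below, under stated assumptions: an intent is recorded before any view is touched, marked committed once every synchronized view is updated, and uncommitted entries are reapplied on restart. The `Entry` and `SyncLog` types are hypothetical, the log is held in memory where a real one would need to be durable, and the entries store absolute values so that reapplying them is idempotent.

```go
package main

import "fmt"

// Entry is a hypothetical log record for one synchronized change.
type Entry struct {
	ID        uint64
	Label     uint64 // label whose size changes
	NewSize   uint64 // absolute value, so reapplying is idempotent
	Committed bool   // true once every synced view has been updated
}

// SyncLog stands in for a durable, append-only log; here it is memory only.
type SyncLog struct {
	entries []*Entry
}

// Begin records the intent before any view is touched.
func (l *SyncLog) Begin(label, newSize uint64) *Entry {
	e := &Entry{ID: uint64(len(l.entries) + 1), Label: label, NewSize: newSize}
	l.entries = append(l.entries, e)
	return e
}

// Recover reapplies every uncommitted entry, in order, and returns how many
// entries it replayed.
func (l *SyncLog) Recover(apply func(Entry) error) (int, error) {
	n := 0
	for _, e := range l.entries {
		if e.Committed {
			continue
		}
		if err := apply(*e); err != nil {
			return n, err
		}
		e.Committed = true
		n++
	}
	return n, nil
}

func main() {
	primary := map[uint64]uint64{} // e.g. sizes held by a labelvol-like view
	derived := map[uint64]uint64{} // e.g. a labelsz-like size index
	slog := &SyncLog{}

	// Simulated crash: the primary view is written, the derived view is not,
	// and the entry is never marked committed.
	slog.Begin(7, 120)
	primary[7] = 120

	// On restart, replay brings both views back into agreement.
	n, err := slog.Recover(func(e Entry) error {
		primary[e.Label] = e.NewSize
		derived[e.Label] = e.NewSize
		return nil
	})
	fmt.Println(n, err, primary[7] == derived[7]) // prints 1 <nil> true
}
```

Compared with redoing changes by hand, the log makes recovery automatic, at the cost of one extra write per synchronized mutation.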