Hitenjain14 edited this page Mar 18, 2025 · 2 revisions

Blobber Repair Protocol

Overview

The repair process ensures that all blobbers in a decentralized storage allocation maintain a consistent version of data. This is necessary when:

  • A blobber misses a commit, leading to data inconsistency.
  • A user adds or replaces a blobber in the storage allocation.

Since data is erasure encoded using Reed-Solomon coding, the original data can be recovered as long as at least data_shards blobbers hold the correct data. The repair process synchronizes all blobbers to the same allocation root, ensuring consistency and integrity.
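The recoverability condition can be written as a one-line check. This is a minimal sketch: the function name and its parameters are illustrative and not part of any SDK. It only encodes the rule stated above, that at least data_shards blobbers must hold correct data.

```python
def can_recover(data_shards: int, parity_shards: int, consistent_blobbers: int) -> bool:
    """Return True if enough blobbers hold correct data to rebuild the original.

    Hypothetical helper: with Reed-Solomon coding, any data_shards of the
    data_shards + parity_shards shards are enough to reconstruct the data.
    """
    total = data_shards + parity_shards
    if not 0 <= consistent_blobbers <= total:
        raise ValueError("consistent_blobbers must be between 0 and the total shard count")
    return consistent_blobbers >= data_shards
```

For a 4 data + 2 parity allocation, `can_recover(4, 2, 4)` is true and `can_recover(4, 2, 3)` is false, so at most two blobbers can be out of date at once.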


Problem Statement

Decentralized storage relies on multiple independent blobbers to store data redundantly using erasure encoding. However, due to failures, inconsistencies arise:

  1. A blobber may have missed a commit, leading to data mismatches.
  2. A blobber that is newly added, or that replaces another, starts with an empty or outdated state.
  3. Data integrity needs to be enforced by ensuring all blobbers maintain the same allocation root, representing the latest version of stored data.

To resolve this, a structured repair process is required to restore all blobbers to the same version.


Repair Process

1. Allocation Root Consensus

  • The client fetches the allocation roots from all participating blobbers.
  • The client groups blobbers into sets based on their allocation roots.
  • The set with at least data_shards blobbers that share the same allocation root is considered the master set.
  • Blobbers not in the master set are secondary blobbers that require repair.
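The grouping step above can be sketched as follows. The function name, the shape of the `roots` mapping, and the tie-break (prefer the largest qualifying set) are assumptions made for illustration; the source only requires that the master set contain at least data_shards blobbers sharing one root.

```python
from collections import defaultdict


def form_sets(roots, data_shards):
    """Group blobbers by allocation root and pick the master set.

    roots: dict mapping a blobber ID to the allocation root it reported
    (hypothetical input shape). Returns (master_root, master_set, secondary_sets).
    """
    groups = defaultdict(list)
    for blobber_id, root in roots.items():
        groups[root].append(blobber_id)

    # Only roots shared by at least data_shards blobbers can form the master set.
    candidates = {r: m for r, m in groups.items() if len(m) >= data_shards}
    if not candidates:
        raise ValueError("no allocation root is shared by at least data_shards blobbers")

    # Assumption: if more than one root qualifies, prefer the largest set.
    master_root = max(candidates, key=lambda r: len(candidates[r]))
    master_set = groups[master_root]

    # Every other group is a secondary set that needs repair.
    secondary_sets = {r: m for r, m in groups.items() if r != master_root}
    return master_root, master_set, secondary_sets
```

With four blobbers reporting root `"A"`, two reporting root `"B"`, and `data_shards=4`, the first four form the master set and the remaining two form a single secondary set.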

2. File Synchronization Using a Lead Blobber

  • A lead blobber is chosen from each set to act as a representative.
  • The lead blobber lists all files in a paginated manner.
  • The client processes file lists using a diff function to determine:
    1. Missing Files: Files present in the master set but absent in secondary blobbers.
    2. Extra Files: Files present in secondary blobbers but missing in the master set.
    3. Modified Files: Files with mismatched file hashes, indicating a need for update.
  • Based on this analysis, file operations are queued for execution.
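A minimal sketch of the diff step, assuming each listing has already been collected into a dict that maps a file path to its file hash. The operation names (`upload`, `update`, `delete`) are hypothetical labels, and treating extra files as deletions is an assumption; the source only says that file operations are queued. Pagination is left out to keep the classification itself visible.

```python
def diff_listings(master_files, secondary_files):
    """Classify files into missing, modified, and extra, then queue operations.

    master_files / secondary_files: dict of path -> file hash (assumed shape).
    Returns a list of (operation, path) tuples with hypothetical operation names.
    """
    missing = sorted(p for p in master_files if p not in secondary_files)
    extra = sorted(p for p in secondary_files if p not in master_files)
    modified = sorted(
        p for p in master_files
        if p in secondary_files and master_files[p] != secondary_files[p]
    )

    operations = [("upload", p) for p in missing]
    operations += [("update", p) for p in modified]
    # Assumption: files unknown to the master set are removed from the secondary.
    operations += [("delete", p) for p in extra]
    return operations
```

Files whose hashes match on both sides produce no operation, so an already synchronized secondary set yields an empty queue.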

3. Repair Execution

  • Batch processing is used for high throughput.
  • Files requiring repair are downloaded from the master set and uploaded to secondary blobbers.
  • Pipelining: Data is streamed from the master set directly to secondary blobbers, avoiding intermediate disk writes and maximizing throughput.
  • The repair process iterates until all files are processed.
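Batching and pipelining can be illustrated with two small helpers. Both names are hypothetical, and the streams are any file-like objects standing in for a download from the master set and uploads to secondary blobbers. The sketch treats file content as an opaque byte stream and leaves out any erasure decoding or re-encoding a real client would perform.

```python
from itertools import islice


def batches(operations, batch_size):
    """Yield successive lists of at most batch_size operations."""
    it = iter(operations)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def pipe(source, sinks, chunk_size=64 * 1024):
    """Copy source to every sink chunk by chunk and return the bytes copied.

    Only one chunk is held in memory at a time, so nothing is written to
    local disk between the download and the upload.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return total
        for sink in sinks:
            sink.write(chunk)
        total += len(chunk)
```

A repair loop built on these would iterate over `batches(operations, n)` and call `pipe` once per file in the batch, which bounds memory use by the chunk size rather than the file size.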

4. Ensuring Synchronization

  • Once all files are synchronized, every blobber should report the same allocation root as the master set.
  • A matching root on every blobber confirms that the allocation is fully synchronized and consistent.
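The final check can be sketched as a comparison of freshly fetched roots against the master root. The function name and input shape are assumptions for illustration.

```python
def out_of_sync(roots, master_root):
    """Return the sorted IDs of blobbers whose root differs from master_root.

    roots: dict of blobber ID -> allocation root fetched after repair
    (hypothetical input shape). An empty result means repair is complete.
    """
    return sorted(b for b, root in roots.items() if root != master_root)
```

Any blobber returned here would go through another repair pass; an empty list means every blobber matches the master set.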

UML Diagram: Blobber Repair Process (4 Data + 2 Parity, 2 Sets)

sequenceDiagram
    participant SDK
    participant LeadBlobberMaster(Blobber1) as Lead Blobber (Master Set - Root A)
    participant LeadBlobberSecondary(Blobber5) as Lead Blobber (Secondary Set - Root B)
    participant Blobber2 as Blobber 2 (Master Set - Root A)
    participant Blobber3 as Blobber 3 (Master Set - Root A)
    participant Blobber4 as Blobber 4 (Master Set - Root A)
    participant Blobber6 as Blobber 6 (Secondary Set - Root B)
    participant DiffFunction as Diff Function
    participant Executor as Operation Executor

    %% Step 1: Fetch Allocation Roots and Consensus
    SDK->>LeadBlobberMaster(Blobber1): Fetch Allocation Root (A)
    SDK->>Blobber2: Fetch Allocation Root (A)
    SDK->>Blobber3: Fetch Allocation Root (A)
    SDK->>Blobber4: Fetch Allocation Root (A)
    SDK->>LeadBlobberSecondary(Blobber5): Fetch Allocation Root (B)
    SDK->>Blobber6: Fetch Allocation Root (B)

    SDK->>SDK: Take Consensus (4 Data Shards)
    SDK->>SDK: Form Master Set (Blobber1-4, Root A) & Secondary Set (Blobber5-6, Root B)

    loop Paginated File Listing
        SDK->>LeadBlobberMaster(Blobber1): List Files (Paginated)
        SDK->>LeadBlobberSecondary(Blobber5): List Files (Paginated)

        SDK->>DiffFunction: Compare Files Across Master (A) & Secondary (B)
        DiffFunction->>SDK: Return Batch of Repair Operations
        SDK->>Executor: Process Batch of Operations
    end

    SDK->>SDK: Repeat Until All Files Are Processed
    SDK->>SDK: All Blobbers Now Have Same Allocation Root (A)