
Add an API to fetch blobs from a given batch header hash #688

Merged: 8 commits merged into Layr-Labs:master on Aug 14, 2024

Conversation

@dmanc (Contributor) commented Aug 7, 2024

Why are these changes needed?

Adds a new endpoint to the Data API server that fetches blob metadata from a batch header hash. This API should enable others to build blob explorers on top of our data.

The endpoint is paginated and returns a next_token field, a base64-encoded token representing the last evaluated key in the page.

Sample query:

 curl -X 'GET' \
  'http://localhost:8080/api/v1/feed/batches/6E2EFA6EB7AE40CE7A65B465679DE5649F994296D18C075CF2C490564BBF7CA5/blobs?limit=1' \
  -H 'accept: application/json'

Sample response

{"meta":{"size":1,"next_token":"eyJCbG9iSGFzaCI6ImRkMjZjYTk3YmNlNzhhYTk2ZTAxMGUwMzNhY2ZjNDZmOWRhYTEwODVjZDVhMzg1MTRiNTAyNzUwZjU5YjlkMTMiLCJNZXRhZGF0YUhhc2giOiIzMTM3MzIzMjMzMzgzODM4MzkzMzMwMzkzNzMyMzMzMzM3MzczNjJmMzAyZjMzMzMyZjMxMmYzMzMzMmZlM2IwYzQ0Mjk4ZmMxYzE0OWFmYmY0Yzg5OTZmYjkyNDI3YWU0MWU0NjQ5YjkzNGNhNDk1OTkxYjc4NTJiODU1IiwiQmF0Y2hIZWFkZXJIYXNoIjoiYmk3NmJyZXVRTTU2WmJSbFo1M2xaSitaUXBiUmpBZGM4c1NRVmt1L2ZLVT0iLCJCbG9iSW5kZXgiOjB9"},"data":[{"blob_key":"dd26ca97bce78aa96e010e033acfc46f9daa1085cd5a38514b502750f59b9d13-313732323338383839333039373233333737362f302f33332f312f33332fe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855","batch_header_hash":"6e2efa6eb7ae40ce7a65b465679de5649f994296d18c075cf2c490564bbf7ca5","blob_index":0,"signatory_record_hash":"3dfc70eafe5215ca1051a74e3609bb1d160e1f99005fc13d8c86c539d9114f7b","reference_block_number":2035851,"batch_root":"d9a658a5646d8d41fcea70afd9339d8d8b9041050c1ad2b2f4abe36f45613d2a","blob_inclusion_proof":"64bcc95c806f5d656192872c558988e9ec83d22ec1e000c5a1f13e31e100c8aa700baf3003f5453f2dbf13a9d5d12024d2a55c705ea225c81693145c2cc13fd78216d5479cda9f844af7864a92a42c23333290122fbc811bd604d9f9a9784a66dcaf7c45e683d49118884ebbb397d59b01e2dee2fc97b902c33ad712a276cfb5003ab3aa93c3ae8ab3b4132e0be40d87fb4f58cc5e49d4186e8b8831c014a6e2618bd5f8fdbbb6bccc68df2c04e1db82e8be1fadde0450cfdaf1c34263b92c7fe11f108590db529ec088a0a7ed08f9b2edbb799c85da41654e2f28317411032644096cdc49dd590f3f89d99ade73cb7c2347feb09d49c9450dc26a2fdc37c713c17eab2130c9bc5f84734ea6ccca24ce6f0339017d2877661858f2808b41aca94c86b4810e9fad5f48deb82578e6c2caa165fa64d3c4e49875bd0d91096997987bc6c3d811864bf293dbb728f9f436c5db38eec6ee6008ed82801254eabd178816f60734844264a7f53675b60d0ed1872987854c4d98ec09b9facfe59239bdff495d297de4fdccf4cc0e6dba7772c2880feee533993c1defe1c3d39bce42534cdca435e988da10ec5384b3ba21d2863a9b8f701a9405b7fa631f817f3326d04f32d4f17265cd8c17fe4edf564f7d2818c5c97baf1b39f30316b3d2f3ee53d051","blob_commitment":{"commitment":{"X":"10654988599870587770787393200552110054687298615607967764937097592038258411754","Y":"18734666437462819052584172468786552309751381314655805698672660714753147273453"},"length_commitment":{"X":{"A0":"5061480280063149801552916064155196935940417917920806083877037326616821363934","A1":"14062936678877092061763909390797389513408830393357964905007718131886911533835"},"Y":{"A0":"5745003969301824632454161959921145214502865986519377269450276040077026980010","A1":"6074252507891203505580637063317121427461816738102068256590025258852140830928"}},"length_proof":{"X":{"A0":"21399264865984950761164626890540180632049418151916889200045365655764199070011","A1":"17196068782888244615236694895883064928167854490786996353102920417690165216790"},"Y":{"A0":"17448100457771843994069640177381667809841110272430530510830195951164391042871","A1":"10444118721190278594390330487379515821747120086963435285720236784527226967237"}},"length":54},"batch_id":27961,"confirmation_block_number":2036016,"confirmation_txn_hash":"0xdfa70b7fbcedb568a792b495f0c74ba84c89519399e31287c839fb0b36871337","fee":"00","security_params":[{"QuorumID":0,"AdversaryThreshold":33,"ConfirmationThreshold":55,"QuorumRate":3072},{"QuorumID":1,"AdversaryThreshold":33,"ConfirmationThreshold":55,"QuorumRate":3072}],"requested_at":1722388893,"blob_status":3}]}

next_token decoded:

{"BlobHash":"dd26ca97bce78aa96e010e033acfc46f9daa1085cd5a38514b502750f59b9d13","MetadataHash":"313732323338383839333039373233333737362f302f33332f312f33332fe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855","BatchHeaderHash":"bi76breuQM56ZbRlZ53lZJ+ZQpbRjAdc8sSQVku/fKU=","BlobIndex":0}

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@dmanc dmanc requested review from jianoaix and ian-shim August 7, 2024 05:48
@dmanc dmanc marked this pull request as ready for review August 7, 2024 05:54
@ian-shim (Contributor) left a comment:

overall lgtm

disperser/dataapi/server.go:
@@ -35,6 +35,23 @@ func (s *server) getBlobs(ctx context.Context, limit int) ([]*BlobMetadataRespon
return s.convertBlobMetadatasToBlobMetadataResponse(ctx, blobMetadatas)
}

func (s *server) getBlobsFromBatchHeaderHash(ctx context.Context, batcherHeaderHash string, limit int, exclusiveStartKey *disperser.BatchIndexExclusiveStartKey) ([]*BlobMetadataResponse, *disperser.BatchIndexExclusiveStartKey, error) {
Contributor:

It looks like this can be used to re-implement (and hopefully simplify) the getBlobs function above?

}
}

queryResult, err := s.dynamoDBClient.QueryIndexWithPagination(
Contributor:

Does this return an error if the 1 MB limit is reached?

Contributor (Author):

No, it simply returns fewer items than the limit. From the DynamoDB documentation: "A single Query operation will read up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data."

Value: batchHeaderHash[:],
},
},
limit,
Contributor:

If the limit is large and the results exceed 1 MB, can this be broken down into multiple calls so it gets the desired number of blobs?

Contributor (Author):

I'm just following the existing pattern for this API, which is to have the client keep track of the limit and the number of items returned, and continually request more pages.

var allMetadata []*disperser.BlobMetadata
var nextKey *disperser.BatchIndexExclusiveStartKey = exclusiveStartKey

const maxLimit int32 = 1000
Contributor:

A batch can easily have more than 1000 blobs (just 2 blobs/s).

@dmanc (Contributor, Author) commented Aug 8, 2024:

Yeah, I've found a batch with 16,497 blobs in testnet. I'm setting a reasonable max limit on the page size. Do you think 10,000 is a better upper bound?

Contributor (Author):

Trying with 10,000 takes around 13 seconds with a 27 MB response, while 1000 takes less than 2 seconds with a 2 MB response.

I think smaller pages are better, since fetching can also be parallelized.

@dmanc dmanc requested review from jianoaix, ian-shim and anupsv August 8, 2024 06:51
@ian-shim (Contributor) left a comment:

lgtm. Can you make sure the cloudflare cache is enabled for all endpoints?

@dmanc (Contributor, Author) commented Aug 14, 2024:

> lgtm. Can you make sure the cloudflare cache is enabled for all endpoints?

I'll track it as a separate item

@dmanc dmanc merged commit e7b916f into Layr-Labs:master Aug 14, 2024
6 checks passed