@lidel lidel commented Dec 9, 2025

This PR closes #8794, but imo is just a formality.

In practice this is unlikely to affect anyone: DAGs in which the same block is addressed multiple times with different codecs virtually never occur.

Rationale

Since Kubo v0.12.0, blocks are stored by multihash, so identical data with different CIDs (e.g., CIDv0 vs CIDv1) is stored once. dag stat now reflects actual storage by deduplicating on multihash instead of on CID.

Updated help text to clarify deduplication behavior and note that CAR export (dag export) uses CID-based keying and may include duplicates. This addresses the concern raised in #8843 (comment)

Added regression test for multihash deduplication.

@lidel lidel mentioned this pull request Dec 9, 2025
@lidel lidel self-assigned this Dec 10, 2025

Development

Successfully merging this pull request may close these issues.

ipfs dag stat improperly return size of dags that include the same blocks multiple time as different codecs