Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chainstore: Fix raw blocks getting scanned for links during snapshots #10684

Merged
merged 1 commit into from
Apr 21, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 21 additions & 6 deletions chain/store/snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -371,6 +371,9 @@ func (s *walkScheduler) enqueueIfNew(task walkTask) {
//log.Infow("ignored", "cid", todo.c.String())
return
}

// This lets through RAW and CBOR blocks, the only two types that we
// end up writing to the exported CAR.
if task.c.Prefix().Codec != cid.Raw && task.c.Prefix().Codec != cid.DagCBOR {
//log.Infow("ignored", "cid", todo.c.String())
return
Expand Down Expand Up @@ -442,10 +445,19 @@ func (s *walkScheduler) processTask(t walkTask, workerN int) error {
b: blk,
}

// We exported the ipld block. If it wasn't a CBOR block, there's nothing
// else to do and we can bail out early as it won't have any links
// etc.
if t.c.Prefix().Codec != cid.DagCBOR || t.c.Prefix().MhType == mh.IDENTITY {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it likely doesn't matter (at the moment, we should scan for links in "identity" blocks. In theory, I can inline dag-cbor into an identity hash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien I never knew the reasons to skip identity hashes, it was something inherited from @frrist code. I can't say if there's an instance of an identity hash in the whole chain history (? genesis or something?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. So, we shouldn't include blocks for identity hashes, but we should technically be traversing them and including their children, if any.

On the other hand, I can guarantee that this won't make a difference in practice (at the moment, at least). So we should be good to merge this as-os.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, happy to send follow-up in a few days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw I inherited it from:

// Don't include identity CIDs.

return nil
}

rawData := blk.RawData()

// extract relevant dags to walk from the block
if t.taskType == blockTask {
var b types.BlockHeader
if err := b.UnmarshalCBOR(bytes.NewBuffer(blk.RawData())); err != nil {
if err := b.UnmarshalCBOR(bytes.NewBuffer(rawData)); err != nil {
return xerrors.Errorf("unmarshalling block header (cid=%s): %w", blk, err)
}
if b.Height%1_000 == 0 {
Expand Down Expand Up @@ -521,11 +533,7 @@ func (s *walkScheduler) processTask(t walkTask, workerN int) error {
}

// Not a chain-block: we scan for CIDs in the raw block-data
return cbg.ScanForLinks(bytes.NewReader(blk.RawData()), func(c cid.Cid) {
if t.c.Prefix().Codec != cid.DagCBOR || t.c.Prefix().MhType == mh.IDENTITY {
return
}

err = cbg.ScanForLinks(bytes.NewReader(rawData), func(c cid.Cid) {
s.enqueueIfNew(walkTask{
c: c,
taskType: dagTask,
Expand All @@ -534,6 +542,13 @@ func (s *walkScheduler) processTask(t walkTask, workerN int) error {
epoch: t.epoch,
})
})

if err != nil {
return xerrors.Errorf(
"ScanForLinks(%s). Task: %s. Block: %s (%s). Epoch: %d. Err: %w",
t.c, t.taskType, t.topLevelTaskType, t.blockCid, t.epoch, err)
}
return nil
}

func (cs *ChainStore) ExportRange(
Expand Down