
Conversation

@noel2004 (Member) commented Oct 3, 2025

Fix an issue where the new checkCompressedDataCompatibilityV7 did not consider RLE-type blocks, which can occur now that we have switched to the official zstd implementation for data compression.

Summary by CodeRabbit

  • Bug Fixes
    • Improved validation of compressed data blocks to reduce decoding failures.
    • Correctly interprets block types, enforcing proper size for RLE blocks and rejecting reserved types with clear errors.
    • Provides more accurate data length checks and error reporting, improving reliability when processing V7-compressed inputs.


coderabbitai bot commented Oct 3, 2025

Walkthrough

Adds blkType derivation in checkCompressedDataCompatibilityV7 and branches handling by block type: forces blkSize = 1 for RLE (type 1), errors on reserved (type 3), and otherwise computes blkSize as before. Data-length validation and block advancement are adjusted based on blkType.
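The branching above follows the zstd block-header layout (RFC 8878): a 3-byte little-endian header packing Last_Block in bit 0, Block_Type in bits 1–2, and Block_Size in bits 3–23. The sketch below is a hypothetical, simplified version of such a check — the function name and loop structure are assumptions, not the actual encoding/da.go code:

```go
package main

import "fmt"

// checkBlocks walks zstd block headers: each is 3 little-endian bytes
// packing Last_Block (bit 0), Block_Type (bits 1-2), Block_Size (bits 3-23).
func checkBlocks(data []byte) error {
	for {
		if len(data) < 3 {
			return fmt.Errorf("%d bytes left, not enough for a block header", len(data))
		}
		hdr := uint32(data[0]) | uint32(data[1])<<8 | uint32(data[2])<<16
		lastBlock := hdr&1 == 1
		blkType := (hdr >> 1) & 3
		blkSize := hdr >> 3
		data = data[3:]

		switch blkType {
		case 1:
			// RLE block: Block_Size is the *regenerated* size; only a
			// single byte is actually stored after the header.
			blkSize = 1
		case 3:
			// Reserved block type: invalid in any conforming stream.
			return fmt.Errorf("encounter reserved block type")
		}

		if uint32(len(data)) < blkSize {
			return fmt.Errorf("block size %d exceeds remaining %d bytes", blkSize, len(data))
		}
		data = data[blkSize:]
		if lastBlock {
			return nil
		}
	}
}

func main() {
	// An RLE last-block: header 0x2B 0x00 0x00 plus the single stored byte.
	fmt.Println(checkBlocks([]byte{0x2B, 0x00, 0x00, 0xAA})) // <nil>
}
```

Note how an RLE block claiming a large regenerated size still consumes only one payload byte, which is exactly the case the old check missed.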

Changes

  • encoding/da.go (block-type handling in V7 compatibility check): Derive blkType from the header; special-case RLE (blkSize = 1) and the reserved type (error); retain last-block logic; adjust validation and advancement accordingly.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Caller
  participant F as checkCompressedDataCompatibilityV7
  participant H as Header Parser

  C->>F: validate compressed data (v7)
  F->>H: read header
  H-->>F: blkType, header fields

  alt blkType == 1 (RLE)
    note right of F: Force blkSize = 1
    F->>F: validate data length
    F->>F: advance to next block / check last-block
  else blkType == 3 (reserved)
    F-->>C: return error (reserved block type)
  else other types
    F->>F: compute blkSize from header
    F->>F: validate data length
    F->>F: advance / check last-block
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Suggested reviewers

  • colinlyguo
  • Thegaram
  • jonastheis

Poem

I nibble bytes with tidy care,
RLE gets one—I’m well aware.
Type three? I thump: “Beware, verboten!”
The stream hops on, no bits forgotten.
With whiskered logic, swift and neat,
This bunny makes your headers sweet. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Title Check — ⚠️ Warning: The title uses bracket notation instead of the required conventional-commit prefix, contains a typo in “Compabilitycheck,” and does not clearly convey the main fix in a concise, conventional-commit format. Resolution: Rename the title to follow conventional commits and clearly describe the change, for example: “fix: support RLE block type in checkCompressedDataCompatibilityV7”.
  • Description Check — ⚠️ Warning: The description does not adhere to the repository’s template: it lacks the “Purpose or design rationale” section with clear what/why/how answers, the conventional-commit formatted title, and the breaking-change label. Resolution: Populate the description using the template by adding a detailed “Purpose or design rationale” section, including a conventional-commit PR title, and indicating whether the change is breaking.
✅ Passed checks (1 passed)
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which is sufficient (required threshold: 80.00%).


@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
encoding/da.go (1)

498-507: LGTM! RLE and reserved block handling added correctly.

The logic correctly handles RLE blocks (type 1) by setting blkSize = 1, and properly rejects reserved blocks (type 3). The block type extraction and branching logic are accurate.

However, consider this minor improvement to the error message at line 504:

-			return fmt.Errorf("encounter reserved block type at %v", data)
+			return fmt.Errorf("encounter reserved block type (3) at offset %d", len(originalData)-len(data))

This would show the position rather than dumping potentially large data arrays. To implement this, you'd need to track the original data length before the loop.
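That offset-tracking idea can be sketched as follows (hypothetical function and names, not the actual encoding/da.go code — it simply records the full length before the loop and subtracts the remaining length when reporting):

```go
package main

import "fmt"

// findReservedBlock scans 3-byte zstd block headers and reports the byte
// offset of the first reserved-type (3) block, demonstrating offset-based
// error reporting instead of dumping the remaining data slice.
func findReservedBlock(data []byte) error {
	originalLen := len(data) // captured once, before the loop
	for len(data) >= 3 {
		hdr := uint32(data[0]) | uint32(data[1])<<8 | uint32(data[2])<<16
		blkType := (hdr >> 1) & 3
		if blkType == 3 {
			// originalLen - len(data) is the offset of this header.
			return fmt.Errorf("encounter reserved block type (3) at offset %d", originalLen-len(data))
		}
		blkSize := hdr >> 3
		if blkType == 1 { // RLE stores a single byte
			blkSize = 1
		}
		data = data[3:]
		if uint32(len(data)) < blkSize {
			break
		}
		data = data[blkSize:]
		if hdr&1 == 1 { // last block
			break
		}
	}
	return nil
}

func main() {
	// A non-last raw block of 1 byte, followed by a reserved-type header.
	stream := []byte{0x08, 0x00, 0x00, 0xCC, 0x07, 0x00, 0x00}
	fmt.Println(findReservedBlock(stream))
	// prints: encounter reserved block type (3) at offset 4
}
```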

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b4cce5c and 4e9428c.

📒 Files selected for processing (1)
  • encoding/da.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-18T03:40:09.800Z
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.

Applied to files:

  • encoding/da.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests
🔇 Additional comments (1)
encoding/da.go (1)

500-507: Raw blocks (type 0) are valid per the zstd frame spec. Raw (uncompressed) blocks are explicitly defined in the format and must be supported, so accepting type 0 here is correct.

@noel2004 noel2004 requested a review from Thegaram October 3, 2025 01:13
@Thegaram Thegaram requested a review from jonastheis October 3, 2025 06:37
@jonastheis (Contributor) left a comment

I think this change should be relatively low risk as we decode batches before submitting them in the rollup relayer anyway.

As discussed in Slack, we should probably replace this check long-term with a decompression with the underlying library to avoid such compatibility issues and be better maintainable.

