Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix log range filter #445

Merged
merged 26 commits into from
Aug 16, 2024
Merged

Bugfix log range filter #445

merged 26 commits into from
Aug 16, 2024

Conversation

sideninja
Copy link
Contributor

@sideninja sideninja commented Aug 16, 2024

Closes: #439

Description

The storage implementation for bloom filters incorrectly defined the range which resulted in additional logs being returned.

Because we've used big.Int type for heights, more specifically big.Int.Bytes() which returned byte array that only contained non-empty values in big-endian encoding, it allowed for multiple other values to match the pebble itteration upper/lower bound filter.

Simply said, if we wanted to get height 1 when converted using Bytes() we got a slice []byte{0x1} and that was used as boundary in iterations, but that also matched other values starting with same byte (like []byte{0x01, 0x02} or []byte{0x01, 0xf, 0xa}). We changed usage of big.Int to uint64 since it doesn't really make sense in storage layer to use big.Int, and we changed to encoding of uint64 to fixed length of 8 bytes. So the above example of height 1 would be defined as []byte{0x1, 0x0, 0x0, 0x0} thus no other value would match it.


For contributor use:

  • Targeted PR against master branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features

    • Improved handling of block heights and log retrieval with simplified data types (from *big.Int to uint64).
    • Enhanced filtering logic in the logs package to optimize receipt fetching based on block heights.
  • Bug Fixes

    • Ensured consistent return types in log retrieval methods, preventing potential nil reference errors.
  • Documentation

    • Updated method signatures and parameter names for clarity and consistency across interfaces.
  • Tests

    • Revised test cases to reflect changes in data types, enhancing robustness and performance of unit tests.

Copy link
Contributor

coderabbitai bot commented Aug 16, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The recent changes enhance the blockchain API's log retrieval functionality by improving type handling and ensuring more robust error management. Key modifications include transitioning from big.Int to uint64 for block height parameters, streamlining the codebase for better performance and clarity. Additionally, the handling of response structures ensures consistent outputs, reducing the risk of errors when retrieving logs, especially for early block numbers.

Changes

Files Change Summary
api/api.go, api/pull.go Updated GetLogs and getLogs methods to improve type handling for block heights, transitioning from big.Int to uint64, simplifying logic and enhancing performance.
models/receipt.go Adjusted BloomsHeight struct to use uint64 instead of *big.Int for height, improving memory efficiency and performance.
services/logs/filter.go, filter_test.go Changed RangeFilter to utilize uint64 for start and end, improving filtering logic and performance. Adjusted tests accordingly to reflect new parameter types.
storage/index.go, index_testsuite.go Updated BlockIndexer and ReceiptIndexer methods to replace *big.Int with uint64, enhancing clarity and consistency in method signatures.
storage/mocks/*.go Modified mocks to align with changes in parameter types from *big.Int to uint64, ensuring consistency across the mocked functions.
storage/pebble/*.go Transitioned from using big.Int to uint64 for height-related operations, streamlining code and improving performance across multiple methods.

Assessment against linked issues

Objective Addressed Explanation
Return only the logs from the specified range in the results (#[439])

🐰🌟 In the blockchain glade, where transactions hop,
Types simplified, and errors drop.
From big integers, we've made a flight,
Now uint64 shines, so clear and bright!
Logs will filter as they should, oh what a delight! 🌟🐰


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

PostState []byte `json:"root"`
Status uint64 `json:"status"`
CumulativeGasUsed uint64 `json:"cumulativeGasUsed"`
// todo we could skip bloom to optimize storage and dynamically recalculate it
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this todo is not really true since we store blooms in storage

for _, bloomHeight := range bloomsHeight {
for _, bloom := range bloomHeight.Blooms {
if !bloomMatch(*bloom, r.criteria) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was broken, probably got broken after changes, because it did nothing, so it didn't optimize next step.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0a40149 and 40d2a43.

Files selected for processing (13)
  • api/api.go (2 hunks)
  • api/pull.go (2 hunks)
  • models/receipt.go (2 hunks)
  • services/logs/filter.go (4 hunks)
  • services/logs/filter_test.go (4 hunks)
  • storage/index.go (2 hunks)
  • storage/index_testsuite.go (3 hunks)
  • storage/mocks/BlockIndexer.go (4 hunks)
  • storage/mocks/ReceiptIndexer.go (6 hunks)
  • storage/mocks/mocks.go (2 hunks)
  • storage/pebble/accounts.go (1 hunks)
  • storage/pebble/receipts.go (9 hunks)
  • storage/pebble/storage_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
  • storage/mocks/BlockIndexer.go
Additional context used
golangci-lint
storage/index_testsuite.go

342-342: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)


343-343: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)

Additional comments not posted (26)
storage/mocks/mocks.go (1)

65-65: Consider the impact of randomness on test determinism.

The addition of the Bloom field with random bytes may lead to non-deterministic test results. Ensure that tests using this mock are robust against such variability or consider using a fixed seed for randomness.

storage/pebble/accounts.go (1)

59-59: LGTM! Verify the correctness of the new logic.

The change to use txHeight for encoding the nonce aligns with the transaction context and seems appropriate. Ensure that this change does not introduce any unintended side effects in the nonce handling logic.

Verification successful

Verified: Consistent usage of txHeight in nonce encoding logic.

The use of txHeight is consistent within the storage/pebble/accounts.go file, and its role in encoding the nonce appears to be correctly implemented. There are no apparent side effects or inconsistencies in its usage.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of `txHeight` in nonce encoding logic.

# Test: Search for the usage of `txHeight` in the codebase. Expect: Consistent usage in nonce encoding logic.
rg --type go 'txHeight'

Length of output: 209

storage/mocks/ReceiptIndexer.go (1)

Line range hint 20-119: LGTM! Verify the impact of type changes.

The transition from *big.Int to uint64 for block heights simplifies the code and should enhance performance. Ensure that all related code is updated to handle uint64 correctly and that there are no lingering dependencies on *big.Int.

Verification successful

No lingering dependencies on *big.Int for block heights found.

The transition from *big.Int to uint64 for block heights appears to be successfully implemented, as there are no remaining dependencies on *big.Int for this specific purpose in the codebase. Other uses of *big.Int pertain to different functionalities and do not impact the block height changes.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of changing block height types to `uint64`.

# Test: Search for any lingering dependencies on `*big.Int` for block heights. Expect: No dependencies remain.
rg --type go 'big\.Int' --context 5

Length of output: 18936

models/receipt.go (1)

136-136: Transition Height to uint64 in BloomsHeight.

The change from *big.Int to uint64 for the Height field simplifies the handling of block heights and aligns with the PR objectives. Ensure that all usages of BloomsHeight are updated to handle uint64 correctly.

Verification successful

Transition to uint64 for Height in BloomsHeight is correctly implemented.

The change from *big.Int to uint64 for the Height field in BloomsHeight is handled correctly across all usages. The values are derived from uint64 sources, ensuring proper handling and alignment with the PR objectives.

  • services/logs/filter_test.go: Height is set using r.BlockNumber.Uint64().
  • storage/pebble/receipts.go: Height is set using binary.BigEndian.Uint64(height).

These usages confirm that the transition to uint64 is properly managed.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `BloomsHeight` to ensure they handle `uint64` correctly.

# Test: Search for the `BloomsHeight` struct usage. Expect: Correct handling of `uint64`.
rg --type go 'BloomsHeight'

Length of output: 1218


Script:

#!/bin/bash
# Extract code snippets around the usage of BloomsHeight to verify correct handling of uint64.

# Check usage in storage/index.go
rg --type go -A 5 'BloomsHeight' storage/index.go

# Check usage in services/logs/filter_test.go
rg --type go -A 5 'BloomsHeight' services/logs/filter_test.go

# Check usage in storage/pebble/receipts.go
rg --type go -A 5 'BloomsHeight' storage/pebble/receipts.go

# Check usage in storage/mocks/ReceiptIndexer.go
rg --type go -A 5 'BloomsHeight' storage/mocks/ReceiptIndexer.go

Length of output: 2110

storage/index.go (3)

48-48: Parameter renaming in SetLatestCadenceHeight.

The parameter height is renamed to cadenceHeight for clarity. This change improves the readability of the method signature.


54-61: Standardize parameter naming and type in interface methods.

The parameter height now consistently uses uint64 across methods, improving clarity and aligning with the PR objectives. Ensure that all implementations of these interfaces are updated accordingly.

Verification successful

Parameter Type Consistency Verified for Interface Methods

The uint64 type is consistently used for the height parameter in all implementations and mocks of the BlockIndexer and ReceiptIndexer interfaces. This aligns with the review comment's objective to standardize parameter naming and type. No further action is needed.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all implementations of `BlockIndexer` and `ReceiptIndexer` to ensure they handle `uint64` correctly.

# Test: Search for the implementations of `BlockIndexer` and `ReceiptIndexer`. Expect: Correct handling of `uint64`.
rg --type go 'BlockIndexer' -A 10
rg --type go 'ReceiptIndexer' -A 10

Length of output: 51989


Line range hint 69-86:
Update ReceiptIndexer methods to use uint64.

The methods Store, GetByBlockHeight, and BloomsForBlockRange now use uint64 for block heights, simplifying the code and improving performance. Verify that all method implementations handle uint64 correctly.

Verification successful

Verification Successful: ReceiptIndexer Methods Handle uint64 Correctly

The methods Store, GetByBlockHeight, and BloomsForBlockRange in the ReceiptIndexer interface have been verified to correctly handle uint64 for block heights in both their implementations and associated test cases.

  • Implementations in storage/pebble/receipts.go use uint64 for block heights.
  • Test cases in storage/index_testsuite.go also use uint64, ensuring proper testing of these methods.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all implementations of `ReceiptIndexer` methods to ensure they handle `uint64` correctly.

# Test: Search for the implementations of `ReceiptIndexer` methods. Expect: Correct handling of `uint64`.
rg --type go 'func (r \*Receipts) Store' -A 10
rg --type go 'func (r \*Receipts) GetByBlockHeight' -A 10
rg --type go 'func (r \*Receipts) BloomsForBlockRange' -A 10

Length of output: 171


Script:

#!/bin/bash
# Description: Search for the `ReceiptIndexer` interface and its method implementations to verify `uint64` usage.

# Search for the `ReceiptIndexer` interface definition.
rg --type go 'type ReceiptIndexer interface' -A 20

# Search for any implementation of `Store` method that might be related to `ReceiptIndexer`.
rg --type go 'Store\(' -A 10

# Search for any implementation of `GetByBlockHeight` method that might be related to `ReceiptIndexer`.
rg --type go 'GetByBlockHeight\(' -A 10

# Search for any implementation of `BloomsForBlockRange` method that might be related to `ReceiptIndexer`.
rg --type go 'BloomsForBlockRange\(' -A 10

Length of output: 39942

storage/pebble/receipts.go (3)

47-51: Simplify block height handling in Store.

The block height is now directly retrieved as a uint64, simplifying the code. Ensure that all related logic correctly handles uint64.


Line range hint 152-217:
Optimize BloomsForBlockRange with uint64.

The method now uses uint64 for range queries, simplifying the logic and improving performance. Verify that all related logic correctly handles uint64.


113-117: Update GetByBlockHeight to use uint64.

The method now accepts uint64, improving performance by reducing type conversion overhead. Verify that all related logic correctly handles uint64.

Verification successful

All usages of GetByBlockHeight correctly handle uint64.

The method GetByBlockHeight is used in various parts of the codebase, and all instances correctly handle uint64 parameters. Specifically:

  • In services/logs/filter.go, height values are derived from uint64 sources.
  • The blk.Height is effectively uint64, as it corresponds to the Number field in the api/models.go Block struct, which uses hexutil.Uint64.

No further changes are necessary.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `GetByBlockHeight` to ensure they handle `uint64` correctly.

# Test: Search for the method usage. Expect: Correct handling of `uint64`.
rg --type go 'GetByBlockHeight'

Length of output: 1169


Script:

#!/bin/bash
# Verify the types of `height` and `blk.Height` in `services/logs/filter.go` to ensure they are `uint64`.
rg --type go 'height' services/logs/filter.go -A 5
rg --type go 'blk' services/logs/filter.go -A 5

Length of output: 1290


Script:

#!/bin/bash
# Verify the type of `blk.Height` by searching for the definition or assignment of `blk`.
rg --type go 'blk' services/logs/filter.go -B 5

Length of output: 268


Script:

#!/bin/bash
# Search for the definition or implementation of the `GetByID` method to determine the type of `blk`.
rg --type go 'GetByID' -A 5

Length of output: 5132


Script:

#!/bin/bash
# Search for the definition of the `models.Block` struct to verify the type of the `Height` field.
rg --type go 'type Block struct' -A 10

Length of output: 1197

services/logs/filter.go (4)

51-51: LGTM! Simplified block height handling.

Changing start and end from *big.Int to uint64 simplifies the code and improves performance.


57-66: LGTM! Improved clarity and efficiency.

The use of uint64 for start and end enhances clarity and performance by allowing direct comparisons.


87-106: LGTM! Optimized filtering process.

The use of uint64 for bloomHeightMatches and direct fetching of receipts improves efficiency. Consider addressing the TODO for concurrent processing.


158-158: LGTM! Simplified receipt fetching.

Using blk.Height directly aligns with the new type handling and simplifies the logic.

storage/pebble/storage_test.go (1)

40-40: LGTM! Enhanced test coverage for larger block heights.

Changing the block height from 30 to 300 broadens the test scenario to ensure robustness.

services/logs/filter_test.go (3)

121-125: LGTM! Simplified receipt storage logic.

Switching to uint64 for block height comparisons enhances performance and simplifies the logic.


138-148: LGTM! Simplified blooms range logic.

Using uint64 for start and end parameters aligns with the new type handling and simplifies range checks.


288-341: LGTM! Updated test cases for new type handling.

Using uint64 for start and end in test cases ensures alignment with updated function signatures.

api/pull.go (1)

480-504: LGTM! But verify the function usage in the codebase.

The transition from big.Int to uint64 for block heights simplifies the logic and improves performance. The error handling for range filter creation is clear and concise.

However, ensure that all function calls to getLogs correctly handle the new uint64 type.

Verification successful

Verify the Usage of getLogs with uint64 Parameters

The LatestEVMHeight function, which is used to retrieve block heights, returns a uint64 type. This ensures compatibility with the getLogs function, which now expects uint64 parameters. Therefore, the use of current as an argument in getLogs is correctly handled as it is derived from LatestEVMHeight.

No changes are needed for the getLogs function call.

  • Function: getLogs
  • Parameters: uint64 (from LatestEVMHeight)

The transition to using uint64 is correctly implemented in the codebase.

Action:: No action required. The code is already correctly using uint64 for block heights.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `getLogs` correctly handle the new `uint64` type.

# Test: Search for the function usage. Expect: Correct handling of `uint64` type.
rg --type go -A 5 $'getLogs'

Length of output: 979


Script:

#!/bin/bash
# Description: Verify the type of `current` used in `getLogs` calls to ensure it is `uint64`.

# Test: Search for the definition and type of `current` in `api/pull.go`.
rg --type go 'current' api/pull.go

Length of output: 643


Script:

#!/bin/bash
# Description: Verify the return type of `LatestEVMHeight()` to ensure it is `uint64`.

# Test: Search for the definition of `LatestEVMHeight` to check its return type.
rg --type go 'func LatestEVMHeight' --context 5

Length of output: 49


Script:

#!/bin/bash
# Description: Search for the definition of `LatestEVMHeight` across all Go files to determine its return type.

# Test: Search for `LatestEVMHeight` function definition in all Go files.
rg --type go 'func LatestEVMHeight' --context 5

Length of output: 49


Script:

#!/bin/bash
# Description: Search for `LatestEVMHeight` as a method on any struct or interface to determine its return type.

# Test: Search for `LatestEVMHeight` method definitions in all Go files.
rg --type go 'LatestEVMHeight' --context 5

Length of output: 13618

storage/index_testsuite.go (6)

193-193: LGTM!

The use of uint64 for block heights in the test case improves type safety and readability.


230-236: LGTM!

The use of uint64 for block heights in the test case improves type safety and readability.


245-362: LGTM!

The use of uint64 for block heights in the test case improves type safety and readability.

Tools
golangci-lint

342-342: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)


343-343: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)


283-329: LGTM!

The use of uint64 for block heights in the test case improves type safety and readability.


353-362: LGTM!

The use of uint64 for block heights in the test case improves type safety and readability.


332-349: Verify the necessity of big.Int usage.

The use of big.Int for block heights in this test case seems inconsistent with other changes that use uint64. Verify if this is required for specific test coverage.

Verification successful

Verified: Use of big.Int is consistent and appropriate.

The use of big.Int for block heights in the single height range test case is consistent with its usage throughout the codebase. This suggests it is a deliberate design choice, likely to handle large numbers or ensure consistency across tests and implementations. No changes are necessary.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify if the use of `big.Int` for block heights in test cases is necessary.

# Test: Search for `big.Int` usage in test cases. Expect: Consistent usage across test cases.
rg --type go -A 5 $'big.NewInt'

Length of output: 24150

Tools
golangci-lint

342-342: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)


343-343: SA4010: this result of append is never used, except maybe in other appends

(staticcheck)

api/api.go (1)

Line range hint 652-666: LGTM!

The use of Uint64() for block heights and ensuring consistent return types in the GetLogs method improves type safety and robustness.

Copy link
Collaborator

@m-Peter m-Peter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 💯💯

@sideninja sideninja merged commit f79ab5b into main Aug 16, 2024
2 checks passed
@m-Peter m-Peter deleted the gregor/log-filter-fix branch August 22, 2024 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

eth_getLogs returns incorrect records when filtering by block number
2 participants