-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem: version mismatch happen occasionally #1759
base: main
Are you sure you want to change the base?
Conversation
Solution: - make sure don't load iavl version ahead of versiondb
WalkthroughThis pull request adds a changelog entry documenting a bug fix for an occasional version mismatch issue and updates the application’s initialization logic. The Changes
Suggested reviewers
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Signed-off-by: yihuang <huang@crypto.com>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1759 +/- ##
===========================================
+ Coverage 16.87% 35.56% +18.69%
===========================================
Files 72 127 +55
Lines 6163 11966 +5803
===========================================
+ Hits 1040 4256 +3216
- Misses 5000 7282 +2282
- Partials 123 428 +305
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/app.go (1)
1044-1068
: Consider adding explicit logging before exiting on version mismatchRight now, the app silently calls
tmos.Exit(...)
when discovering version mismatches. Logging bothqmsVersion
andiavlVersion
right before forcing an exit would improve debugging when such mismatches happen, providing clearer insights into the root cause of the failure.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
CHANGELOG.md
(1 hunks)app/app.go
(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms (16)
- GitHub Check: integration_tests (gas)
- GitHub Check: integration_tests (slow)
- GitHub Check: integration_tests (upgrade)
- GitHub Check: integration_tests (gov)
- GitHub Check: integration_tests (ica)
- GitHub Check: integration_tests (ibc_update_client)
- GitHub Check: integration_tests (ibc_timeout)
- GitHub Check: integration_tests (ibc_rly_gas)
- GitHub Check: integration_tests (ibc_rly_evm)
- GitHub Check: integration_tests (ibc)
- GitHub Check: integration_tests (unmarked)
- GitHub Check: build (macos-14)
- GitHub Check: build (macos-latest)
- GitHub Check: build (ubuntu-latest)
- GitHub Check: Run golangci-lint
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
app/app.go (1)
1506-1512
: LGTMThis utility function cleanly encapsulates the store loading mechanism for a given version. The approach aligns well with ensuring the IAVL store is loaded at the correct version when versiondb is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
memiavl/db.go (1)
563-574
: Added version safety checks to prevent WAL corruption.The change adds necessary validation to ensure that an old version is not written to the WAL, which helps prevent version mismatch issues. This is a good improvement for data integrity.
Consider adding a debug log or metric counter to track when these conditions occur, as they may indicate other issues in the system that need addressing.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
memiavl/db.go
(2 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
memiavl/db.go
[high] 616-616: G115: integer overflow conversion int -> uint64
(gosec)
[high] 617-617: G115: integer overflow conversion int -> uint64
(gosec)
⏰ Context from checks skipped due to timeout of 90000ms (17)
- GitHub Check: build (macos-14)
- GitHub Check: build (macos-latest)
- GitHub Check: build (ubuntu-latest)
- GitHub Check: unittest
- GitHub Check: integration_tests (gas)
- GitHub Check: integration_tests (slow)
- GitHub Check: integration_tests (upgrade)
- GitHub Check: integration_tests (gov)
- GitHub Check: integration_tests (ica)
- GitHub Check: integration_tests (ibc_update_client)
- GitHub Check: integration_tests (ibc_timeout)
- GitHub Check: integration_tests (ibc_rly_gas)
- GitHub Check: integration_tests (ibc_rly_evm)
- GitHub Check: integration_tests (ibc)
- GitHub Check: integration_tests (unmarked)
- GitHub Check: Run golangci-lint
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
memiavl/db.go (1)
603-621
:✅ Verification successful
Enhanced WAL entry validation in async commit path.
This change correctly prevents outdated entries from being written to the WAL in the asynchronous commit path, which aligns with the fix for version mismatch problems mentioned in the PR objectives.
The static analysis tool flagged potential integer overflow issues in the type conversion at lines 616-617. Let's verify this risk:
While this is likely a false positive since
i
comes from the loop overentries
which is unlikely to reach the maximum integer value, it's good to be aware of this edge case.
🏁 Script executed:
#!/bin/bash # Check if the entries slice could ever be large enough to cause integer overflow rg -A 10 "channelBatchRecv" memiavl/db.goLength of output: 488
Review Approved: WAL entry validation safely handles asynchronous commits and integer conversions.
The current implementation correctly prevents outdated WAL entries. Our investigation confirms that:
- The static analysis flag for potential integer overflow (in the conversion at lines 616–617) is a false positive, as the loop counter
i
comes from theentries
slice—which, populated viachannelBatchRecv
, is never expected to grow large enough to trigger an overflow.- The verified behavior of
channelBatchRecv
confirms that the slice size remains within safe boundaries.No modifications are required here.
🧰 Tools
🪛 golangci-lint (1.62.2)
[high] 616-616: G115: integer overflow conversion int -> uint64
(gosec)
[high] 617-617: G115: integer overflow conversion int -> uint64
(gosec)
convert to draft for now, probably no existing integration test cover this. |
Signed-off-by: yihuang <huang@crypto.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (10)
memiavl/db_test.go (2)
532-532
: Fix typo in comment"intermidiate" should be "intermediate"
- // reload db from disk at an intermidiate version + // reload db from disk at an intermediate version🧰 Tools
🪛 golangci-lint (1.62.2)
532-532:
intermidiate
is a misspelling ofintermediate
(misspell)
🪛 GitHub Check: Run golangci-lint
[failure] 532-532:
intermidiate
is a misspelling ofintermediate
(misspell)
509-521
: Consider parameterizing the test with different data patternsThe test currently uses a single pattern for generating test data. For more thorough testing, consider parameterizing with different data patterns or randomized values.
- for i := 0; i < 10; i++ { + numChangeSets := 10 + for i := 0; i < numChangeSets; i++ {memiavl/db.go (2)
446-452
: Fix typo in error message"propogate" should be "propagate"
- // background snapshot rewrite don't success, but no error to propogate, ignore it. + // background snapshot rewrite don't success, but no error to propagate, ignore it.🧰 Tools
🪛 golangci-lint (1.62.2)
451-451:
propogate
is a misspelling ofpropagate
(misspell)
🪛 GitHub Check: codecov/patch
[warning] 446-449: memiavl/db.go#L446-L449
Added lines #L446 - L449 were not covered by tests
[warning] 452-452: memiavl/db.go#L452
Added line #L452 was not covered by tests🪛 GitHub Check: Run golangci-lint
[failure] 451-451:
propogate
is a misspelling ofpropagate
(misspell)
609-613
: Add test coverage for WAL error handlingThe error handling path for WAL LastIndex failures in async commits is not covered by tests according to the code coverage report. Consider adding tests that simulate these failures.
Would you like me to help draft tests that would cover these error scenarios?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 610-612: memiavl/db.go#L610-L612
Added lines #L610 - L612 were not covered by testsapp/storeloader.go (2)
27-32
: Add comments explaining the upgrade logic flowThe upgrade logic could benefit from more detailed comments explaining the different paths and conditions. This would make maintenance easier for future developers.
return func(ms storetypes.CommitMultiStore) error { if upgradeHeight == ms.LastCommitID().Version+1 { // Check if the current commit version and upgrade height matches + // If there are store upgrades, we need to apply them regardless of the version cap if len(storeUpgrades.Renamed) > 0 || len(storeUpgrades.Deleted) > 0 || len(storeUpgrades.Added) > 0 { return ms.LoadLatestVersionAndUpgrade(storeUpgrades) } } + // No upgrades to apply, or not at the upgrade height, so just load the capped version // Otherwise load default store loader return MaxVersionStoreLoader(version)(ms)
1-38
: Add unit tests for the store loadersThese functions lack test coverage according to the coverage report. Since they're critical for version control and upgrades, they should have dedicated unit tests.
Would you like me to help draft unit tests for these store loader functions?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 15-17: app/storeloader.go#L15-L17
Added lines #L15 - L17 were not covered by tests
[warning] 21-24: app/storeloader.go#L21-L24
Added lines #L21 - L24 were not covered by tests
[warning] 26-31: app/storeloader.go#L26-L31
Added lines #L26 - L31 were not covered by tests
[warning] 35-35: app/storeloader.go#L35
Added line #L35 was not covered by testsintegration_tests/test_versiondb.py (2)
40-42
: Consider making the block wait count configurable.
Using a hard-coded wait of 5 blocks here may be too rigid. In environment-specific or slower chains, you might need a different number of blocks to ensure consistent test results.
52-54
: Fix minor spelling error in comment.
There's a typo in "intermidiate". Please apply the below fix:- # only restore to an intermidiate version to test version mismatch behavior + # only restore to an intermediate version to test version mismatch behaviorapp/app.go (2)
962-965
: Handle setupVersionDB errors more gracefully.
Currently, any setup error leads to a panic. Depending on your requirements, consider logging and terminating with a more descriptive message or permitting fallback behavior to avoid abrupt termination.
978-982
: Correct the variable name spelling.
“storeLoaderOverritten” should ideally be “storeLoaderOverridden” to avoid typos and maintain clarity.- storeLoaderOverritten := app.RegisterUpgradeHandlers(app.appCodec, qmsVersion) + storeLoaderOverridden := app.RegisterUpgradeHandlers(app.appCodec, qmsVersion)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.md
(1 hunks)app/app.go
(2 hunks)app/storeloader.go
(1 hunks)app/upgrades.go
(2 hunks)integration_tests/shell.nix
(1 hunks)integration_tests/test_versiondb.py
(3 hunks)memiavl/db.go
(6 hunks)memiavl/db_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
🧰 Additional context used
🪛 golangci-lint (1.62.2)
memiavl/db_test.go
532-532: intermidiate
is a misspelling of intermediate
(misspell)
memiavl/db.go
451-451: propogate
is a misspelling of propagate
(misspell)
🪛 GitHub Check: Run golangci-lint
memiavl/db_test.go
[failure] 532-532:
intermidiate
is a misspelling of intermediate
(misspell)
memiavl/db.go
[failure] 451-451:
propogate
is a misspelling of propagate
(misspell)
🪛 GitHub Check: codecov/patch
app/storeloader.go
[warning] 15-17: app/storeloader.go#L15-L17
Added lines #L15 - L17 were not covered by tests
[warning] 21-24: app/storeloader.go#L21-L24
Added lines #L21 - L24 were not covered by tests
[warning] 26-31: app/storeloader.go#L26-L31
Added lines #L26 - L31 were not covered by tests
[warning] 35-35: app/storeloader.go#L35
Added line #L35 was not covered by tests
app/upgrades.go
[warning] 58-58: app/upgrades.go#L58
Added line #L58 was not covered by tests
[warning] 64-65: app/upgrades.go#L64-L65
Added lines #L64 - L65 were not covered by tests
memiavl/db.go
[warning] 446-449: memiavl/db.go#L446-L449
Added lines #L446 - L449 were not covered by tests
[warning] 452-452: memiavl/db.go#L452
Added line #L452 was not covered by tests
[warning] 574-575: memiavl/db.go#L574-L575
Added lines #L574 - L575 were not covered by tests
[warning] 610-612: memiavl/db.go#L610-L612
Added lines #L610 - L612 were not covered by tests
[warning] 770-771: memiavl/db.go#L770-L771
Added lines #L770 - L771 were not covered by tests
[warning] 1119-1120: memiavl/db.go#L1119-L1120
Added lines #L1119 - L1120 were not covered by tests
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build (ubuntu-latest)
🔇 Additional comments (14)
memiavl/db_test.go (2)
502-547
: Good test for idempotent writing behavior!This test effectively validates that the database can produce the same state when reloaded at an intermediate version and then replayed with changes. This is crucial for ensuring consistency during recovery scenarios.
🧰 Tools
🪛 golangci-lint (1.62.2)
532-532:
intermidiate
is a misspelling ofintermediate
(misspell)
🪛 GitHub Check: Run golangci-lint
[failure] 532-532:
intermidiate
is a misspelling ofintermediate
(misspell)
533-541
:❓ Verification inconclusive
Verify edge case scenarios for version loading
The test loads at version 5 and replays from there, but doesn't test edge cases like loading at the first version or at the latest version minus one. These scenarios could reveal different behaviors.
🏁 Script executed:
#!/bin/bash # Description: Check if the test handles all edge cases for version loading. # Test additional version scenarios in other tests rg -A 3 "TargetVersion" memiavl/db_test.goLength of output: 255
Revisit Version Loading Edge Cases
Inmemiavl/db_test.go
(lines 533–541), the test only loads a fixed version (5) before replaying changes. However, there aren’t any tests covering edge scenarios such as loading the very first version or one just before the latest. Testing these cases could uncover subtle differences in behavior.
- Consider adding a test for loading the initial version (e.g. version 0 or 1, per system design).
- Consider adding a test for loading near the latest version (e.g. latest version - 1) to ensure robustness.
memiavl/db.go (3)
84-85
: Good optimization with reusable write batchAdding a reusable write batch is a good optimization that avoids frequent allocations during write operations.
567-578
: Improved batch-based commit mechanismThe previous implementation was writing directly to the WAL. The new batch-based approach provides better atomicity and error handling.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 574-575: memiavl/db.go#L574-L575
Added lines #L574 - L575 were not covered by tests
1116-1128
: Good implementation of idempotent writesThe
writeEntry
function now properly handles idempotent writes by checking if the entry's index is less than or equal to the last index. This prevents duplicate entries and logs an appropriate error message.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 1119-1120: memiavl/db.go#L1119-L1120
Added lines #L1119 - L1120 were not covered by testsapp/storeloader.go (2)
10-18
: Good abstraction for version-controlled store loadingThis function provides a clean abstraction for loading stores with version control, which is essential for addressing the version mismatch issue mentioned in the PR objectives.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 15-17: app/storeloader.go#L15-L17
Added lines #L15 - L17 were not covered by tests
21-37
: Well-structured upgrade store loader with version controlThe upgrade store loader correctly handles the different scenarios: using default upgrade loader when version is 0, checking for store upgrades when at the upgrade height, and falling back to max version loading.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 21-24: app/storeloader.go#L21-L24
Added lines #L21 - L24 were not covered by tests
[warning] 26-31: app/storeloader.go#L26-L31
Added lines #L26 - L31 were not covered by tests
[warning] 35-35: app/storeloader.go#L35
Added line #L35 was not covered by testsintegration_tests/shell.nix (1)
20-20
: Setting TMPDIR is a good practiceSetting the TMPDIR environment variable to a standard location ensures consistent behavior across different development environments and helps prevent issues where temporary directories might be created in unexpected locations.
integration_tests/test_versiondb.py (1)
8-8
: Import usage looks good.
No immediate concerns. The neww3_wait_for_new_blocks
utility function is correctly imported and used below.app/upgrades.go (4)
20-21
: Docstring alignment with return value looks good.
The updated function now returns a boolean to indicate whether the store loader is overridden, matching the new comment.
58-58
: Confirm test coverage for store loader logic.
Setting the store loader here is critical, yet lines involvingMaxVersionUpgradeStoreLoader
may lack coverage. Ensure there's a test path that triggers this call under upgrade conditions.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 58-58: app/upgrades.go#L58
Added line #L58 was not covered by tests
64-64
: Early return from upgrade handler.
Returningtrue
here is straightforward, but be aware that it could remain untested if no test scenario explicitly checks an upgrade plan with this name. Consider adding or verifying coverage.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 64-65: app/upgrades.go#L64-L65
Added lines #L64 - L65 were not covered by tests
69-69
: Store loader default path.
Returningfalse
here ensures the default loader is used when the plan name does not match. Confirm there's a test scenario covering this branch.app/app.go (1)
971-974
: Initializing qmsVersion is straightforward.
This block correctly falls back to zero ifapp.qms
is not enabled. No issues identified.
if entry.index <= lastIndex { | ||
logger.Error("commit old version idempotently", "lastIndex", lastIndex, "version", entry.index) | ||
} else { | ||
batch.Write(entry.index, bz) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
❓ Verification inconclusive
Consider adding test coverage for error paths
The error handling path for outdated entries is not covered by tests according to the code coverage report. Consider adding tests that specifically exercise this condition.
🏁 Script executed:
#!/bin/bash
# Description: Check test coverage for writeEntry error paths
# See if there are tests for commit with outdated versions
rg -A 5 "lastIndex.*version" memiavl/db_test.go
Length of output: 49
Test Coverage Missing for Outdated Entry Handling
The search in memiavl/db_test.go
did not reveal any tests that exercise the error path when entry.index <= lastIndex
. Please add tests to simulate this condition to ensure the error handling is properly verified.
- Location:
memiavl/db.go
(Lines 1122-1126)
if qmsVersion > 0 { | ||
// it should not happens since we constraint the loaded iavl version to not exceed the versiondb version, | ||
// still keep the check for safety. | ||
iavlVersion := app.LastBlockHeight() | ||
if qmsVersion < iavlVersion { | ||
// try to prevent gap being created in versiondb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider more user-friendly handling of version mismatch.
Forcing an exit on mismatch provides immediate safety but may hinder operation in production environments. Logging the discrepancy and offering a guided resolution could improve the troubleshooting experience.
Solution:
👮🏻👮🏻👮🏻 !!!! REFERENCE THE PROBLEM YOUR ARE SOLVING IN THE PR TITLE AND DESCRIBE YOUR SOLUTION HERE !!!! DO NOT FORGET !!!! 👮🏻👮🏻👮🏻
PR Checklist:
make
)make test
)go fmt
)golangci-lint run
)go list -json -m all | nancy sleuth
)Thank you for your code, it's appreciated! :)
Summary by CodeRabbit
Summary by CodeRabbit
Bug Fixes
Refactor
New Features
Tests