-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: versioned timeouts #3882
Conversation
ahh good catch #3981 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] now that TimeoutPropose
and TimeoutCommit
are versioned constants, should they be removed from consensus constants here?
test/e2e/major_upgrade_v3.go
Outdated
if b.dur > appconsts.GetTimeoutCommit(b.block.Version.App) { | ||
return fmt.Errorf( | ||
"block was too slow for corresponding version: version %v duration %v upgrade height %v height %v", | ||
b.block.Version.App, | ||
b.dur, | ||
preciseUpgradeHeight, | ||
b.block.Height, | ||
) | ||
} | ||
|
||
if b.dur < appconsts.GetTimeoutCommit(b.block.Version.App) { | ||
return fmt.Errorf( | ||
"block was too fast for corresponding version: version %v duration %v upgrade height %v height %v", | ||
b.block.Version.App, | ||
b.dur, | ||
preciseUpgradeHeight, | ||
b.block.Height, | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] it looks like this will throw an error if b.dur doesn't exactly match the timeout commit which seems extremely fragile. I would expect this test to pass if blocks times are ~6 seconds which is > the timeout commit of 3.5 seconds. Maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh whoops yeah good point, will fix in a flup
knuu isn't working still so this test was only manually performed and this wasn't hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (4)
pkg/appconsts/versioned_consts.go (2)
59-68
: LGTM: GetTimeoutPropose function implementation.The function correctly implements versioned timeout propose values, aligning with the PR objectives. It handles different versions appropriately and provides a default case for unspecified versions.
Consider adding a comment explaining the default case behavior for future maintainability.
func GetTimeoutPropose(v uint64) time.Duration { switch v { case v1.Version: return v1.TimeoutPropose case v2.Version: return v2.TimeoutPropose default: + // For v3 and any future versions, use v3 timeout return v3.TimeoutPropose } }
70-79
: LGTM: GetTimeoutCommit function implementation.The function correctly implements versioned timeout commit values, consistent with the GetTimeoutPropose function and aligning with the PR objectives.
Consider adding a comment explaining the default case behavior, similar to the suggestion for GetTimeoutPropose.
func GetTimeoutCommit(v uint64) time.Duration { switch v { case v1.Version: return v1.TimeoutCommit case v2.Version: return v2.TimeoutCommit default: + // For v3 and any future versions, use v3 timeout return v3.TimeoutCommit } }
app/app.go (2)
461-461
: Grammar Correction in CommentAdd a comma after "v1 only" and hyphenate "agreed upon" to "agreed-upon" for clarity.
Apply this diff to improve the comment:
-// For v1 only we upgrade using an agreed upon height known ahead of time +// For v1 only, we upgrade using an agreed-upon height known ahead of time
548-548
: Complete the Incomplete CommentThe comment seems incomplete. It currently reads: "baseapp's InitChain so that we can take the app version and setup the multicommit". Consider completing the sentence for clarity.
Apply this diff to fix the comment:
-// baseapp's InitChain so that we can take the app version and setup the multicommit +// baseapp's InitChain so that we can take the app version and set up the multicommit store.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (11)
- app/app.go (5 hunks)
- app/module/configurator_test.go (1 hunks)
- app/test/upgrade_test.go (7 hunks)
- pkg/appconsts/global_consts.go (0 hunks)
- pkg/appconsts/v1/app_consts.go (1 hunks)
- pkg/appconsts/v2/app_consts.go (1 hunks)
- pkg/appconsts/v3/app_consts.go (1 hunks)
- pkg/appconsts/versioned_consts.go (2 hunks)
- x/signal/integration_test.go (1 hunks)
- x/signal/keeper.go (3 hunks)
- x/signal/keeper_test.go (4 hunks)
💤 Files with no reviewable changes (1)
- pkg/appconsts/global_consts.go
🚧 Files skipped from review as they are similar to previous changes (4)
- app/test/upgrade_test.go
- pkg/appconsts/v1/app_consts.go
- pkg/appconsts/v2/app_consts.go
- pkg/appconsts/v3/app_consts.go
🧰 Additional context used
🔇 Additional comments (11)
pkg/appconsts/versioned_consts.go (2)
7-8
: LGTM: New imports for versioned constants.The addition of imports for
v1
andv2
packages is appropriate for accessing version-specific constants. This approach supports the versioned logic implemented in the new functions.
Line range hint
1-99
: Overall LGTM: Versioned constants implementation meets PR objectives.The changes in this file successfully implement versioned timeout and upgrade height delay values, aligning with the PR objectives. The new functions
GetTimeoutPropose
,GetTimeoutCommit
, andUpgradeHeightDelay
provide a clean and consistent approach to handling different versions.Minor suggestions have been made to improve code clarity and error handling, particularly in the
UpgradeHeightDelay
function. These improvements, while not critical, would enhance the overall quality and maintainability of the code.Great job on implementing this feature!
x/signal/integration_test.go (1)
80-80
: Approved change, but clarification needed onversion
variableThe change from
appconsts.DefaultUpgradeHeightDelay
toappconsts.UpgradeHeightDelay(version)
aligns with the PR objectives of implementing version-specific timeouts. This modification allows for more flexible upgrade scheduling based on the current version.However, a few points need attention:
- Please clarify where the
version
variable is defined and what its value is at this point in the test.- Consider adding a comment explaining the reason for this change, e.g., "Use version-specific upgrade delay for more accurate testing".
- Ensure that the
UpgradeHeightDelay
function is correctly implemented to handle all possible version values.To verify the implementation of
UpgradeHeightDelay
, please run the following script:✅ Verification successful
Change Verified:
UpgradeHeightDelay
Function Implements Version-Based LogicThe
UpgradeHeightDelay
function correctly incorporates version-specific logic using aswitch
statement on the version variable. This ensures that the upgrade delay is dynamically determined based on the current version, aligning with the PR objectives.No issues found with the implemented changes.
- Verified that
UpgradeHeightDelay
is defined inpkg/appconsts/versioned_consts.go
.- Confirmed the presence of version-based conditionals within the function.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the implementation of UpgradeHeightDelay function # Test: Search for the UpgradeHeightDelay function definition echo "Searching for UpgradeHeightDelay function definition:" ast-grep --lang go --pattern 'func UpgradeHeightDelay($version int64) int64 { $$$ }' # Test: Check for any switch statements or conditionals based on version echo "Checking for version-based logic in UpgradeHeightDelay:" rg --type go -A 10 'func UpgradeHeightDelay.*\{' | rg 'switch|if.*version'Length of output: 449
x/signal/keeper.go (3)
8-8
: LGTM: New import added for appconsts package.The addition of this import is necessary for using
appconsts.UpgradeHeightDelay(version)
in theTryUpgrade
method. This change centralizes the upgrade height calculation logic, which is a good practice for maintainability.
107-107
: LGTM: Upgrade height calculation updated in TryUpgrade method.The use of
appconsts.UpgradeHeightDelay(version)
centralizes the upgrade height delay logic and potentially allows for version-specific delay values. This change improves the flexibility and maintainability of the upgrade process.Please ensure that the
UpgradeHeightDelay
function in theappconsts
package is correctly implemented. Run the following script to verify:#!/bin/bash # Description: Check the implementation of UpgradeHeightDelay function # Test: Display the UpgradeHeightDelay function implementation ast-grep --lang go --pattern $'func UpgradeHeightDelay(version uint64) int64 { $$$ }'
54-56
: LGTM: NewKeeper function signature updated.The removal of the
upgradeHeightDelayBlocks
parameter aligns with the changes in theKeeper
struct and simplifies the initialization process.Please ensure that all calls to
NewKeeper
across the codebase have been updated to reflect this change. Run the following script to verify:x/signal/keeper_test.go (3)
186-186
: Version-specific upgrade height delayThe
UpgradeHeightDelay
function is now called with aversion
parameter. This change allows for version-specific upgrade height delays, providing more flexibility in the upgrade process.To ensure this change is applied consistently, please run the following script:
#!/bin/bash # Description: Check for usage of UpgradeHeightDelay function # Search for all calls to UpgradeHeightDelay rg --type go "UpgradeHeightDelay\(" # Check if there are any remaining uses of DefaultUpgradeHeightDelay rg --type go "DefaultUpgradeHeightDelay"
429-429
: Updated assertion for upgrade heightThe assertion for
UpgradeHeight
now usesappconsts.UpgradeHeightDelay(v2.Version)
, which is consistent with the version-specific upgrade height delay approach.To ensure this change is applied consistently across all test cases, please run the following script:
#!/bin/bash # Description: Check for assertions related to UpgradeHeight in test files # Search for assertions related to UpgradeHeight rg --type go "assert.*UpgradeHeight.*UpgradeHeightDelay" test/
72-72
: Keeper initialization updatedThe
NewKeeper
function call has been modified to remove theappconsts.DefaultUpgradeHeightDelay
parameter. This change reflects an update in the keeper's initialization process.To ensure this change is consistent across the codebase, please run the following script:
app/app.go (2)
283-283
: SimplifySignalKeeper
InitializationThe
SignalKeeper
is correctly initialized without theUpgradeHeightDelay()
parameter, aligning with the updated design and simplifying the instantiation.
541-542
: EnsureAppVersion
Handles Zero ValueAs previously noted, if this method is called when a node first starts up,
resp.AppVersion
may be zero. Ensure thatGetTimeoutPropose
andGetTimeoutCommit
can handle a zeroAppVersion
without causing errors.
Yeah they probably can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (5)
test/e2e/major_upgrade_v3.go (5)
42-43
: LGTM: Dynamic version retrieval implementedReplacing the hardcoded version with
testnet.GetLatestVersion()
is a good improvement. It makes the test more maintainable and ensures it always uses the latest version.Consider adding a log statement to print the retrieved version for better debugging:
logger.Printf("Using latest version: %s", latestVersion)
77-83
: LGTM: Adjusted timer and ticker for upgrade detectionThe changes to the timer and ticker are good improvements:
- Increasing the timer to 20 minutes allows more time for the upgrade to complete on slower systems.
- Decreasing the ticker interval to 3 seconds enables more frequent checks, potentially detecting the upgrade completion earlier.
Consider making these durations configurable constants at the top of the file for easier adjustment:
const ( upgradeTimeout = 20 * time.Minute checkInterval = 3 * time.Second )Then use these constants in the timer and ticker initialization.
Line range hint
84-102
: LGTM: Improved upgrade detection with height trackingThe addition of
upgradedHeight
tracking is a good improvement:
- It allows for precise identification of the upgrade point.
- The logic correctly sets
upgradedHeight
only for the first node that completes the upgrade.Consider adding a log statement when the upgrade is detected:
if upgradedHeight == 0 { upgradedHeight = resp.Header.Height logger.Printf("Upgrade to v3 detected at height: %d", upgradedHeight) }This will provide clearer logging of when the upgrade occurs.
110-142
: LGTM: Added comprehensive timeout verification after upgradeThe new logic for verifying timeouts after the upgrade is a valuable addition to the test:
- It checks block times around the upgrade height, ensuring the upgrade's effect on timeouts is correctly implemented.
- The
versionDuration
struct efficiently stores both duration and block information.- The logic correctly handles the transition from v2 to v3 by checking app versions.
Consider adding error handling for the
client.Block(ctx, &h)
call:resp, err := client.Block(ctx, &h) if err != nil { return fmt.Errorf("failed to get block at height %d: %w", h, err) }This will provide more specific error information if block retrieval fails.
144-176
: LGTM: Comprehensive post-upgrade block analysis implementedThe new logic for analyzing block times and versions after the upgrade significantly enhances the test:
- It correctly identifies the precise upgrade height.
- The handling of blocks with multiple rounds accounts for real-world scenarios.
- The check for appropriate block durations ensures the upgrade correctly affects block timing.
Consider adding more detailed logging for better visibility into the test progress:
if b.block.Version.App == v3.Version && preciseUpgradeHeight == 0 { preciseUpgradeHeight = int(b.block.Height) logger.Printf("Precise upgrade height detected: %d", preciseUpgradeHeight) continue } if b.block.LastCommit.Round > 0 { multipleRounds++ logger.Printf("Multiple rounds detected at height %d", b.block.Height) continue }This will provide more insight into the upgrade process and any anomalies detected during the test.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- pkg/appconsts/consensus_consts.go (1 hunks)
- test/e2e/major_upgrade_v3.go (5 hunks)
🧰 Additional context used
🔇 Additional comments (5)
pkg/appconsts/consensus_consts.go (2)
Line range hint
1-12
: Verify impact of removing TimeoutPropose and TimeoutCommitThe constants
TimeoutPropose
andTimeoutCommit
have been removed from this file. This change could have significant implications for the consensus mechanism.Let's verify the impact of removing these constants:
#!/bin/bash # Description: Check for references to removed constants and potential replacements # Search for TimeoutPropose and TimeoutCommit usage echo "Removed constants usage:" rg --type go "TimeoutPropose|TimeoutCommit" # Search for potential replacements or new timeout management echo "\nPotential new timeout management:" rg --type go "(?i)timeout.*(?:propose|commit)" # Search for changes in consensus-related files echo "\nRecent changes in consensus-related files:" git log -n 5 --pretty=format:"%h - %s" --grep="consensus"Please review the script output to ensure that the removal of these constants is properly handled throughout the codebase, and that any new timeout management is correctly implemented.
10-10
: Verify impact of reduced GoalBlockTimeThe
GoalBlockTime
has been significantly reduced from 15 seconds to 6 seconds. This change aligns with the PR objectives but may have wide-ranging effects on the system's performance and behavior.Let's verify the usage and impact of this change:
Please review the script output to ensure that this change is consistent with other parts of the codebase and doesn't introduce any conflicts or unexpected behavior.
test/e2e/major_upgrade_v3.go (3)
10-15
: LGTM: New imports added for version-specific constants and Tendermint typesThe addition of
appconsts
,v2
,v3
, andtmtypes
imports is appropriate for the upgrade test. These imports will allow the use of version-specific constants and Tendermint block types, which are necessary for implementing and verifying the upgrade process.
52-52
: LGTM: Consistent use of dynamically retrieved versionThe changes in lines 52 and 58 correctly use the
latestVersion
variable for adding the Docker image and creating genesis nodes. This ensures consistency with the earlier change and maintains the flexibility of the test.Also applies to: 58-58
69-69
: LGTM: Consistent use of dynamically retrieved version for tx clientThe change in line 69 correctly uses the
latestVersion
variable for creating the transaction client. This maintains consistency with the earlier changes and ensures that the correct version is used throughout the test setup.
pkg/appconsts/consensus_consts.go
Outdated
GoalBlockTime = time.Second * 15 | ||
GoalBlockTime = time.Second * 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can technically update this, as it is only ever updated in the genesis state, but it correctly causes the app hash tests to fail so its staying 15s for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbc, this change has been reverted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea +1 on not modifying this rn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks for pushing this through. 116 commits 🤯
- I left a comment about modifying GoalBlockTime that we can address in a FLUP
- In the future, would be nice if PR description contained a testing section so reviewers know how this change was manually tested
TimeoutCommit = time.Millisecond * 4200 | ||
// UpgradeHeightDelay is the number of blocks after a quorum has been | ||
// reached that the chain should upgrade to the new version. Assuming a block | ||
// interval of 12 seconds, this is 7 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// interval of 12 seconds, this is 7 days. | |
// interval of 6 seconds, this is 7 days. |
return v2.UpgradeHeightDelay | ||
default: | ||
return v3.UpgradeHeightDelay | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/appconsts/consensus_consts.go
Outdated
// GoalBlockTime is the target time interval between blocks. Since the block | ||
// interval isn't enforced at consensus, the real block interval isn't | ||
// guaranteed to exactly match GoalBlockTime. GoalBlockTime is currently targeted | ||
// through static timeouts (i.e. TimeoutPropose, TimeoutCommit). | ||
GoalBlockTime = time.Second * 15 | ||
GoalBlockTime = time.Second * 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifying this constant will impact some of the defaults in default_overrides. If we do this, there are comments that need to be updated. For example:
also we prob need to update specs with the new defaults even though the old defaults are the ones that are running on mainnet.
So we should prob reconsider modifying this constant. IMO the easiest thing to do would be to delete this constant and modify the defaults that use it to hard-coded values that match the current implementation (so based on 15 seconds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good catch, I realized directly after I asked for more reviews and snuck in a reversion and it doesn't need to be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
test/e2e/major_upgrade_v3.go (4)
42-43
: Improved version managementThe use of
testnet.GetLatestVersion()
to dynamically retrieve the latest version is a significant improvement. It makes the test more flexible and easier to maintain. The consistent use oflatestVersion
for Docker image addition and genesis node creation is also good.Consider adding error handling for the case where
GetLatestVersion()
returns an empty string or an invalid version:if latestVersion == "" { return fmt.Errorf("failed to retrieve a valid latest version") }This would make the test more robust against potential issues with version retrieval.
Also applies to: 52-52, 58-58
77-77
: Enhanced upgrade waiting logicThe increase in the timer duration to 20 minutes and the introduction of
upgradedHeight
are good improvements. These changes make the test more robust and provide valuable information for subsequent checks.Consider adding a log statement when the upgrade is detected:
if upgradeComplete { logger.Printf("Upgrade to v3 detected at height %d", upgradedHeight) }This would provide more visibility into the upgrade process during test execution.
Also applies to: 83-102
110-187
: Comprehensive block time verificationThe addition of detailed block time checks is an excellent improvement to the test. It verifies that the upgrade actually changes the block times as expected, which is crucial for ensuring the correctness of the upgrade process.
The checks for multiple rounds and the precise upgrade height show good attention to detail and account for edge cases. The use of
appconsts.GetTimeoutCommit()
for version-specific timeout checks is also a good practice.Consider adding a summary log at the end of the checks:
logger.Printf("Block time checks completed. Precise upgrade height: %d, Multiple rounds: %d", preciseUpgradeHeight, multipleRounds)This would provide a quick overview of the test results.
Line range hint
18-189
: Overall excellent improvements to the upgrade testThe
MajorUpgradeToV3
function has been significantly enhanced. Key improvements include:
- Dynamic version retrieval for better maintainability.
- Increased upgrade waiting time for improved reliability.
- Precise tracking of the upgrade height.
- Comprehensive block time checks to verify the effects of the upgrade.
These changes align well with the PR objectives and greatly improve the test's ability to verify the correct implementation of timeout overrides during the v3 upgrade.
Consider adding more detailed logging throughout the function to improve observability during test execution. This would make debugging easier if issues arise in the future.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- pkg/appconsts/consensus_consts.go (0 hunks)
- test/e2e/major_upgrade_v3.go (5 hunks)
💤 Files with no reviewable changes (1)
- pkg/appconsts/consensus_consts.go
🧰 Additional context used
🔇 Additional comments (1)
test/e2e/major_upgrade_v3.go (1)
10-12
: Improved imports and upgrade heightThe addition of version-specific imports and the increase of
upgradeHeightV3
to 40 are good improvements. This allows for more precise version checking and gives the network more time to stabilize before the upgrade.Also applies to: 21-21
@rootulp good idea, added a link to the comment and we can also test in knuu now. it appears to pass if a I manually log it and check, but there's an rpc that keeps failing that seems unrelated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ship it!
(But the override no longer works so we need to fix that in a later PR)
UpgradeHeight: sdkCtx.BlockHeader().Height + k.upgradeHeightDelayBlocks, | ||
UpgradeHeight: sdkCtx.BlockHeader().Height + appconsts.UpgradeHeightDelay(version), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we can't override it :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does in that function afaiu, but I did not test
celestia-app/pkg/appconsts/versioned_consts.go
Lines 83 to 89 in fbbd9e6
if OverrideUpgradeHeightDelayStr != "" { | |
parsedValue, err := strconv.ParseInt(OverrideUpgradeHeightDelayStr, 10, 64) | |
if err != nil { | |
panic("Invalid OverrideUpgradeHeightDelayStr value") | |
} | |
return parsedValue | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah I think you might be right (I had skipped over that part)
We could probably move the string parsing to an init
function
@@ -33,6 +34,7 @@ type Context struct { | |||
goContext context.Context | |||
client.Context | |||
apiAddress string | |||
tmNode *node.Node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be removed (but you can do it as a FLUP)
@@ -33,6 +33,7 @@ func NewNetwork(t testing.TB, config *Config) (cctx Context, rpcAddr, grpcAddr s | |||
}) | |||
|
|||
cctx = NewContext(ctx, config.Genesis.Keyring(), config.TmConfig, config.Genesis.ChainID, config.AppConfig.API.Address) | |||
cctx.tmNode = tmNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
Closes #3859
manually tested in #3882 (comment)
also tested in knuu