-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: enhanced reliability of eth RPC methods with null checks and retry mechanisms #3349
base: main
Are you sure you want to change the base?
Conversation
Test Results 19 files - 4 261 suites - 88 33m 6s ⏱️ - 16m 6s For more details on these failures, see this check. Results for commit 8687322. ± Comparison against base commit 13d09e9. This pull request removes 5 and adds 1 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
675eb4a
to
d265305
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
Some considerations
@@ -334,7 +335,7 @@ export class CommonService implements ICommonService { | |||
new Log({ | |||
address: log.address, | |||
blockHash: toHash32(log.block_hash), | |||
blockNumber: numberTo0x(log.block_number), | |||
blockNumber: nullableNumberTo0x(log.block_number), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Why would a block number be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is one of the strange cases where block_number
, transaction_index
, or blockHash
are falsy, leading to 500 errors. The retry mechanism should address the issue. Should I revert this change and throw a more informative error message even if those fields remain falsy after the retry? Hmm maybe that would be better so the clients can resubmit the requests themselves other than having null in the response. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, if a retry is ensured to resolve the issue then maybe we can manage one iteration in the short term.
We should open up an issue in the mirror node repo to investigate this issue so this can be resolved.
@@ -620,7 +620,10 @@ export class MirrorNodeClient { | |||
requestDetails, | |||
); | |||
|
|||
await this.cacheService.set(cachedLabel, block, MirrorNodeClient.GET_BLOCK_ENDPOINT, requestDetails); | |||
if (block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if I understand correctly, but the errors described in the issue were because of caching null blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an issue I noticed and an attempt securing the code. While it may or may not be directly related to the problem, we should ensure the block is not null before storing it in the cache. Otherwise, we risk repeatedly retrieving null from the cache, which likely has a TTL.
936f1e9
to
b959fc7
Compare
@@ -2531,11 +2544,12 @@ export class EthImpl implements Eth { | |||
if (blockResponse == null) return null; | |||
const timestampRange = blockResponse.timestamp; | |||
const timestampRangeParams = [`gte:${timestampRange.from}`, `lte:${timestampRange.to}`]; | |||
const contractResults = await this.mirrorNodeClient.getContractResults( | |||
const contractResults = await this.mirrorNodeClient.getContractResultWithRetry( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we just being cautious or does this case actually need to have retry support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The contractResults
here was a contributing factor to the 500 errors. In the rare cases mentioned, contractResults
contains null fields. When passed to buildReceiptRootHashes
, the function does not handle these null values correctly, resulting in 500 errors. Therefore, this scenario requires a retry if such a rare case occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these cases occur only on the mainnet and at the time of the transaction, I plan to release these updates that include additional debugging information for improved analysis. If the issues persist, it will be easier to pinpoint the root causes and escalate them to the mainnet team more effectively.
19d9838
to
e1c8fd0
Compare
3b96fc4
to
aad493c
Compare
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
a5a0e1f
to
78fa644
Compare
…Contract MN methods Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
bac3c74
to
d7f5829
Compare
this.logger.warn( | ||
e, | ||
`${requestDetails.formattedRequestId} Transaction failed while executing transaction via the SDK: transactionId=${transaction.transactionId}, callerName=${callerName}, txConstructorName=${txConstructorName}`, | ||
); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for Rereviewers:
This is an effort to enhance debugging.
await relay.pollForValidTransactionReceipt(transactionHash); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for Rereviewers:
This is an effort to stablize the flaky test.
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com> Revert "fix: reverted licenses" This reverts commit d3c860a. Reapply "fix: reverted licenses" This reverts commit 50a9acd5031490f3f805042a70bfdae7c589146d.
Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
d6756f0
to
8687322
Compare
Quality Gate failedFailed conditions |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3349 +/- ##
==========================================
- Coverage 85.04% 84.99% -0.06%
==========================================
Files 65 69 +4
Lines 4508 4717 +209
Branches 1023 1061 +38
==========================================
+ Hits 3834 4009 +175
- Misses 330 394 +64
+ Partials 344 314 -30
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Description:
Numerous requests for
eth_getBlockByNumber
andeth_getBlockByHash
have been reported as failing with errors such asCannot read properties of null
andReceived an invalid integer type: null
.Additionally,
eth_getTransactionReceipt
is also reported to be failing withError invoking RPC: Invalid parameter: hashOrNumber
.These issues stem from regressions in the
getBlock()
method, which lacks adequate null checks in certain areas. This pull request addresses the problem by implementing proper null checks and enhancing fallback mechanisms and error handling.Partially, the results of contracts and log responses from the Mirror Node occasionally contain undefined fields, such as transaction_index, block_hash, block_number, and log_index. This issue may arise because the Mirror Node database does not have sufficient time to save all the information before it is fetched. This PR adds a retry mechanism to the fetching methods that checks for these cases, waits for a specified duration, and then retries the fetch operation.
Related issue(s):
Fixes #3345
Fixes #3351
Notes for reviewer:
Checklist