Improve off-chain worker logic #932
Comments
Update: see tfchain/substrate-node/pallets/pallet-smart-contract/src/billing.rs, lines 727 to 738 at 8dfb941.
Every validator account has an associated validator ID; in some cases this is simply the same as the account ID, while in other cases the validator ID is the account ID of the controller. I tracked the validators that have this issue (author account != validator account), marked below with **:
solution 1:
solution 2:
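The author-account vs. validator-ID mismatch can be sketched as a simple lookup: if the block author's account is registered as a controller, it must be mapped back to the validator ID it controls. This is a hypothetical model with illustrative names, not the tfchain implementation:

```rust
use std::collections::HashMap;

// Hypothetical model of the mismatch described above: the runtime tracks
// validator IDs, while the block author is identified by an account that,
// for the ** validators, is a separate controller account. Mapping the
// author account through a controller lookup recovers the validator ID.
fn resolve_validator_id<'a>(
    author_account: &'a str,
    controller_to_validator: &HashMap<&str, &'a str>,
) -> &'a str {
    controller_to_validator
        .get(author_account)
        .copied()
        .unwrap_or(author_account)
}

fn main() {
    let mut controller_to_validator = HashMap::new();
    // "controller-b" controls "validator-b": the problematic ** case.
    controller_to_validator.insert("controller-b", "validator-b");

    // Healthy case: author account and validator ID coincide.
    assert_eq!(resolve_validator_id("validator-a", &controller_to_validator), "validator-a");
    // Affected (**) case: the author account is a controller, so billing
    // is skipped unless the lookup above is applied.
    assert_eq!(resolve_validator_id("controller-b", &controller_to_validator), "validator-b");
    println!("ok");
}
```

Under this model, solution 1 (making the keys coincide) empties the lookup table, while solution 2 would add the mapping step to the check itself.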
I already discussed solution 1 with @coesensbert, as it is the fastest one, given that about half of our mainnet validators skip billing at the moment. It would also be good to communicate that there will be a spike in billing: all affected contracts will suddenly be charged the large due amounts once this gets fixed. @xmonader I will also update the relevant operations ticket.
We went with solution 1 and set a new session key for the affected validators on devnet, so that the assumption that the aura key is the same as the controller key holds when detecting the next author.
Update: But I want to delve into some thoughts I have regarding this. The offchain API does not support checking whether the offchain worker is running on the next block author. Dylan's code serves as a workaround (a good one indeed), but it's essential to be cautious about key setup:
Furthermore, even if we ensure that only the offchain worker running on the next block author submits these transactions, there is (I think) still a possibility that they will not be included in the transaction set chosen by that validator for its block, especially under busy/congested network conditions where many pending transactions already sit in the TX pool, so they can theoretically end up in another block. The main question is: what benefit do we gain by checking that this validator becomes the next block author? Currently this remains unclear to me, and to determine the right approach I need clear requirements. Can you share your motivation behind this check, @DylanVerstraete? Consider the following scenarios:
As soon as I get a clearer response on why we initially decided to implement this check, I can revisit the issue to verify whether it truly meets the requirements at this time, whether it is still valid, and how it can be further improved.
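For context, a next-author check like the one under discussion relies on Aura's round-robin slot assignment (the author of slot s is authorities[s % n]). A minimal model under that assumption, with illustrative names rather than the tfchain code:

```rust
// Minimal model of predicting the next block author under Aura's
// round-robin slot assignment: author of slot s is authorities[s % n].
fn author_for_slot<'a>(authorities: &'a [&'a str], slot: u64) -> &'a str {
    authorities[(slot % authorities.len() as u64) as usize]
}

// The offchain worker compares the author predicted for the next slot
// against its own locally held aura key.
fn is_next_block_author(authorities: &[&str], current_slot: u64, local_key: &str) -> bool {
    author_for_slot(authorities, current_slot + 1) == local_key
}

fn main() {
    let auths = ["alice", "bob", "charlie"];
    assert_eq!(author_for_slot(&auths, 0), "alice");
    assert_eq!(author_for_slot(&auths, 4), "bob");
    // At slot 3 (author "alice"), the next slot's author is "bob".
    assert!(is_next_block_author(&auths, 3, "bob"));
    assert!(!is_next_block_author(&auths, 3, "charlie"));
    println!("ok");
}
```

This also illustrates why key setup matters: the comparison only works if the local key really is the one listed in the authority set.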
The main motivation was that a validator submitting the extrinsics would be refunded the fees. Otherwise this could lead to the validator account being drained. I don't believe this extrinsic can be made unsigned; there were some issues with that approach.
=> there is no way to prevent this extrinsic from being abused by anyone other than validators, since the transaction is unsigned and you cannot know who is calling it. To me, this was only a temporary implementation anyway. If billing triggers could be implemented in Zos nodes I think that would be better, so validators (read: the network) would only need to check the validity of that transaction. (This has some drawbacks, since ip4 would not be billed periodically if a node is down.)
If all validators send transactions and all fees go every time to one of them (the block author), how can the validators' accounts be drained?
You can sign a payload, submit it with the unsigned transaction, and verify it in the runtime to know whether it was signed with a key belonging to a validator.
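That signed-payload idea can be sketched as follows: the extrinsic stays unsigned, but carries a payload plus proof of which key produced it, and the runtime only accepts it when that key belongs to the validator set. Cryptographic signature verification is reduced here to a signer-id lookup, and all names are illustrative, not the tfchain API:

```rust
use std::collections::HashSet;

// Sketch: an unsigned extrinsic carrying a payload and the identity of the
// key that signed it. A real implementation would verify an actual
// cryptographic signature over the payload; here that step is modeled as a
// membership check against the known validator set.
struct BillPayload {
    contract_id: u64,
    signer: String,
}

fn validate_unsigned(payload: &BillPayload, validators: &HashSet<String>) -> Result<(), &'static str> {
    // Reject payloads whose signing key is not a current validator.
    if validators.contains(&payload.signer) {
        Ok(())
    } else {
        Err("signer is not a validator")
    }
}

fn main() {
    let validators: HashSet<String> = ["alice", "bob"].iter().map(|s| s.to_string()).collect();
    let ok = BillPayload { contract_id: 42, signer: "alice".to_string() };
    let bad = BillPayload { contract_id: 42, signer: "mallory".to_string() };
    assert!(validate_unsigned(&ok, &validators).is_ok());
    assert!(validate_unsigned(&bad, &validators).is_err());
    println!("ok");
}
```

This addresses the abuse concern above: the transaction is unsigned (so no fees are charged to the submitter), yet only payloads signed by validator keys pass validation.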
The distribution of contracts to be billed is uneven
Thank you, Dylan. It's much clearer now!
Update: Let me explain why. Second, Dylan mentioned that the reason for this check was to prevent validators from paying TX fees. However, submitting the transaction when the validator is the next block author does not guarantee that the same validator will pick it from the TX pool. Under congested network conditions, many TXs are submitted to the validators' TX pools. As a result, a transaction submitted by the offchain worker running on one validator can end up in another validator's block, resulting in TX fees being paid to that other validator. Suggestion: Outcome for the suggested flow:
Update: To prevent duplicate transactions, billed contracts are tracked in SeenContracts storage. This storage is cleared in on_finalize to prevent unnecessary use of storage space. Summary:
These changes ensure a more robust and efficient billing process. Here is an example that demonstrates the unreliability of the current approach: you can see that different validators submitted the billContractForBlock transaction at different billing indexes, and these ended up in the same block. Unfortunately, only one of them belongs to the current author; the others incurred fees.
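The SeenContracts de-duplication described above can be sketched as a per-block set: the first billing call for a contract in a block succeeds, duplicates within the same block are rejected, and on_finalize clears the set so storage does not grow. A toy model with illustrative names, not the pallet's actual code:

```rust
use std::collections::HashSet;

// Toy model of per-block billing de-duplication via a SeenContracts-style
// storage item that is cleared at block finalization.
struct Billing {
    seen_contracts: HashSet<u64>,
}

impl Billing {
    fn new() -> Self {
        Self { seen_contracts: HashSet::new() }
    }

    fn bill_contract(&mut self, contract_id: u64) -> Result<(), &'static str> {
        // `insert` returns false when the id was already present this block.
        if !self.seen_contracts.insert(contract_id) {
            return Err("contract already billed in this block");
        }
        // ... actual billing logic would run here ...
        Ok(())
    }

    fn on_finalize(&mut self) {
        // Clear per-block tracking so the storage does not accumulate.
        self.seen_contracts.clear();
    }
}

fn main() {
    let mut billing = Billing::new();
    assert!(billing.bill_contract(7).is_ok());
    assert!(billing.bill_contract(7).is_err()); // duplicate in the same block
    billing.on_finalize(); // block boundary
    assert!(billing.bill_contract(7).is_ok()); // billable again next block
    println!("ok");
}
```

Note that in this scheme the duplicate transactions still reach the runtime and are only rejected at dispatch time; the later comments explore rejecting them earlier, at the pool level.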
Update: Also, as long as at least one validator is running an offchain worker, billing should work fine. A potential blocker that needs more research: I am currently looking into implementing the validation before the execution phase, specifically at the transaction pool checking stage. This validation occurs early, when the transaction is examined for factors such as a valid origin; if the transaction fails it, it will not be included in the transaction pool or shared with other nodes. Researching this would extend the time required for this fix, but if successful it would better meet our requirements.
Update: I experimented with an alternative approach that introduces a SignedExtension to the pallet-smart-contract module that ensures unique transaction processing for the
This allows me to achieve the de-duplication early, at the TX pool level, instead of implementing it in the runtime dispatchable. Key Changes
Benefits
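The pool-level de-duplication can be sketched as follows, in the spirit of a SignedExtension that attaches a unique tag per (contract_id, billing_index) so the pool refuses a second transaction with the same tag before it ever reaches the dispatchable. This conceptually mirrors Substrate's `provides` tags on transaction validity; the types and names below are illustrative, not the actual extension:

```rust
use std::collections::HashSet;

// Sketch: a transaction pool that tracks the tags "provided" by pooled
// billing transactions and rejects duplicates on submission, before any
// runtime dispatch happens.
struct TxPool {
    provided_tags: HashSet<(u64, u64)>,
}

impl TxPool {
    fn new() -> Self {
        Self { provided_tags: HashSet::new() }
    }

    fn submit(&mut self, contract_id: u64, billing_index: u64) -> Result<(), &'static str> {
        // One tag per (contract, billing index); a second transaction with
        // the same tag never enters the pool.
        if !self.provided_tags.insert((contract_id, billing_index)) {
            return Err("duplicate: tag already provided by a pooled transaction");
        }
        Ok(())
    }
}

fn main() {
    let mut pool = TxPool::new();
    // Two validators submit billing for the same contract and index:
    assert!(pool.submit(7, 100).is_ok());
    assert!(pool.submit(7, 100).is_err()); // rejected at the pool level
    // A different billing index is a different tag and is accepted:
    assert!(pool.submit(7, 101).is_ok());
    println!("ok");
}
```

Compared with runtime-side tracking, this keeps duplicate transactions out of blocks entirely, so no validator pays fees for a transaction that would fail at dispatch.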
Describe the bug
tfchain/substrate-node/pallets/pallet-smart-contract/src/billing.rs, lines 66 to 69 at 8dfb941
I believe that this method sometimes returns silently even though the worker runs on a validator that is supposed to author the next block.
I found that there have been instances where a validator has multiple aura keys in its keystore (keys were rotated), which caused this issue since we use `any_account()`. To mitigate this, I asked ops to ensure that the nodes' local keystores contain only the relevant keys, and things have improved since then. However, we can also revise this part of the code and research whether there is a better way to select the appropriate key, and make sure `is_next_block_author()` is reliable enough. This issue also serves as a reference for a known issue in the current implementation.