This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Rpc with --no-ancient-blocks fails with truffle-based deployment on 2.7.2+ #11645

Open
illya-havsiyevych opened this issue Apr 21, 2020 · 29 comments
Labels
F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue.

Comments

@illya-havsiyevych

illya-havsiyevych commented Apr 21, 2020

After the upgrade, truffle seems unable to detect mined transactions and the migration gets stuck.

  • parity: 2.7.2 and OpenEthereum: v3.0.0-alpha.1-nightly-d4b5720-20200401
  • Operating system: Linux, Ubuntu 18
  • Installation: binary install
  • Fully synchronized: no, gateway mode
  • Network: custom PoA
  • Restarted: yes

Expected behavior (works fine with parity: 2.5.13)

node_modules/.bin/truffle migrate --network test
You can improve web3's peformance when running Node.js versions older than 10.5.0 by installing the (deprecated) scrypt package in your project

Compiling your contracts...
===========================
> Everything is up to date, there is nothing to compile.

Starting migrations...
======================
> Network name:    'test'
> Network id:      43
> Block gas limit: 67108864 (0x4000000)

1_initial_migration.js
======================

   Deploying 'Migrations'
   ----------------------
   > transaction hash:    0x81630621c7a822e21c345abeb97c1fb5b6526f1f6e04b1b1662dfb2f2aec1dc3
   > Blocks: 2            Seconds: 4
   > contract address:    0x73E65c8B69B13564b96F5033A8a6C57052F042dA
   > block number:        9054936
   > block timestamp:     1587489850
   > account:             0x5C93042C9f3a18059C19Fb253376911AAA984C1F
   > balance:             0
   > gas used:            263677 (0x405fd)
   > gas price:           0 gwei
   > value sent:          0 ETH
   > total cost:          0 ETH


   > Saving migration to chain.
   > Saving artifacts
   -------------------------------------
   > Total cost:                   0 ETH
...
all migrated fine

After the upgrade, truffle seems unable to detect mined transactions and the migration gets stuck.

parity: 2.7.2 and OpenEthereum: v3.0.0-alpha.1-nightly-d4b5720-20200401

node_modules/.bin/truffle migrate --network test
You can improve web3's peformance when running Node.js versions older than 10.5.0 by installing the (deprecated) scrypt package in your project

Compiling your contracts...
===========================
> Everything is up to date, there is nothing to compile.

Starting migrations...
======================
> Network name:    'test'
> Network id:      43
> Block gas limit: 67108864 (0x4000000)

1_initial_migration.js
======================

   Deploying 'Migrations'
   ----------------------
   > transaction hash:    0x5b6526f1f6e04b1b1662dfb2f2aec1dc381630621c7a822e21c345abeb97c1fb
   > Blocks: <INFINITY>            Seconds: <INFINITY>
...
the migration is stuck at the step above

Command line:

 parity --base-path=./node --chain=./config/chain.json --config=./config/gateway.toml --bootnodes="<>" --no-ancient-blocks --unlock=5c93...4c1f --password=./node/password.txt

Config:

[parity]
chain = "./config/chain.json"
mode = "active"
auto_update_delay = 1000
auto_update_check_frequency = 1000
release_track = "stable"

[network]
port = 30303
discovery = true
allow_ips = "all"
reserved_only = false

[rpc]
disable = false
port = 8501
interface = "all"
cors = ["all"]
apis = ["web3", "eth", "pubsub", "net", "parity", "parity_set", "parity_pubsub", "rpc", "personal"]
hosts = ["all"]

[websockets]
disable = false
port = 8502
interface = "all"
apis = ["web3", "eth", "pubsub", "net", "parity", "parity_set", "parity_pubsub", "rpc"]
hosts = ["all"]

[ipc]
disable = false
apis = ["web3", "eth", "pubsub", "net", "parity", "parity_set", "parity_pubsub", "rpc"]

[dapps]
disable = true

[secretstore]
disable = true

[mining]
force_sealing = false
reseal_on_txs = "all"
reseal_min_period = 700
reseal_max_period = 900
work_queue_size = 1024
relay_set = "lenient"
usd_per_tx = "0"
usd_per_eth = "0"
price_update_period = "hourly"
gas_floor_target = "0x4000000"
gas_cap = "0"
tx_queue_size = 16384
tx_queue_per_sender = 4096
tx_queue_mem_limit = 0
tx_gas_limit = "0x4000000"
tx_time_limit = 500

[footprint]
tracing = "auto"
pruning = "auto"
pruning_history = 128
pruning_memory = 64
cache_size_db = 1024
cache_size_blocks = 64
cache_size_queue = 256
cache_size_state = 256
db_compaction = "ssd"
fat_db = "auto"
scale_verifiers = true
num_verifiers = 2

[snapshots]
disable_periodic = true

chain:
PoA network with 5 validators and the following params:

  "params": {
    "networkID": "0x2b",
    "maximumExtraDataSize": "0x20",
    "minGasLimit": "0x1388",
    "gasLimitBoundDivisor": "0x400",
    "wasmActivationTransition": 0,
    "maxTransactionSize": "0x4b000",
    "eip150Transition": 0,
    "eip160Transition": 0,
    "eip161abcTransition": 0,
    "eip161dTransition": 0,
    "eip98Transition": 0,
    "eip658Transition": 0,
    "eip155Transition": 0,
    "validateReceiptsTransition": 0,
    "validateChainIdTransition": 0,
    "eip140Transition": 0,
    "eip211Transition": 0,
    "eip214Transition": 0,
    "eip145Transition": 0,
    "eip1014Transition": 0,
    "eip1052Transition": 0,
    "maxCodeSizeTransition": 0,
    "maxCodeSize": "0x6000"
  },
@adria0

adria0 commented Apr 23, 2020

@illya-havsiyevych, I checked and it works for me on 2.5.13, 2.7.2 and 3.0.0-alpha with the same configuration. Please take a look at https://github.com/adria0/oe_stuff/tree/master/issue_11645: first start the chain with oe/start.sh (you need to put the node path there), then run the migration with truffle/start.sh.

@illya-havsiyevych
Author

Thanks for your effort, I will definitely check by the end of the day.
I might push some changes to your config.

@illya-havsiyevych
Author

illya-havsiyevych commented Apr 29, 2020

  • modified the configs to be a bit closer to our env: config update adria0/oe_stuff#1
  • still not able to reproduce the issue on a new network, but will probably be able to once a snapshot has been created
  • stay tuned

@illya-havsiyevych
Author

illya-havsiyevych commented Apr 29, 2020

OK, another update: we hit the issue when a node starts to respond in the following way,
i.e. "Looks like you disabled ancient block download, unfortunately the information you're trying to fetch doesn't exist in the db and is probably in the ancient blocks."

   ⠹ Blocks: 0            Seconds: 0   > {
   >   "jsonrpc": "2.0",
   >   "id": 18,
   >   "method": "eth_getTransactionReceipt",
   >   "params": [
   >     "0x742889b32908fe1c89ba30084bbc8001be2c9f01c06289c5f5b8dc84312805c4"
   >   ]
   > }
 <   {
 <     "jsonrpc": "2.0",
 <     "error": {
 <       "code": -32000,
 <       "message": "Looks like you disabled ancient block download, unfortunately the information you're trying to fetch doesn't exist in the db and is probably in the ancient blocks."
 <     },
 <     "id": 18
 <   }

I still don't know how long it will take to reproduce on this test env,
probably once a warp sync has taken place.

@illya-havsiyevych
Author

illya-havsiyevych commented Apr 30, 2020

@adria0
The issue can be reproduced now.
Please check the pull request.
You need node data for 35k+ blocks.

> ./clean.up   - unzip the nodes' data
> ./start.sh   - start 3 PoA nodes in the background

wait until the nodes are in sync

./gw.sh         - start `gw-node` in the foreground

and in another console, try the truffle migration

@adria0

adria0 commented May 4, 2020

OK, yes, there is a regression from 2.5.13 to 2.7.2.

@adria0 adria0 added the F2-bug 🐞 The client fails to follow expected behavior. label May 4, 2020
@adria0 adria0 changed the title from "Truffle-based deployment starts to fail when upgrading parity to 2.7.2+" to "Rpc with --no-ancient-blocks --warp-barrier fails with truffle-based deployment on 2.7.2+" May 4, 2020
@illya-havsiyevych
Author

illya-havsiyevych commented May 5, 2020

--warp-barrier is only needed to reproduce the issue faster. In normal operation warp is always on, so we may simply hit the issue sooner or later.

@adria0

adria0 commented May 5, 2020

@illya-havsiyevych, so it seems that the problem is with --no-ancient-blocks, then?

@illya-havsiyevych
Author

illya-havsiyevych commented May 5, 2020

Yes, the issue is as follows:

given

  • a network with warp sync and periodic snapshots enabled

when

  • some node is started / restarted and a snapshot-based warp sync takes place

and

  • the --no-ancient-blocks param is used

and

  • the version is `2.7.2+`

then

  • we have an issue

@adria0 adria0 changed the title from "Rpc with --no-ancient-blocks --warp-barrier fails with truffle-based deployment on 2.7.2+" to "Rpc with --no-ancient-blocks fails with truffle-based deployment on 2.7.2+" May 5, 2020
@adria0

adria0 commented May 5, 2020

Yes, the issue is as follows:

given

  • a network with warp sync and periodic snapshots enabled

when

  • some node is started / restarted and a snapshot-based warp sync takes place

and

  • the --no-ancient-blocks param is used

and

  • the version is `2.7.2+`

then

  • we have an issue

@dvdplm, any idea?

@adria0 adria0 closed this as completed May 5, 2020
@adria0 adria0 reopened this May 5, 2020
@adria0

adria0 commented May 5, 2020

OK, I checked the RPC logs on the gateway: the transaction is sent and mined successfully.

Truffle tries to retrieve the transaction receipt before it is mined, and the following error is returned:

2020-05-05 11:33:28 UTC http.worker30 DEBUG rpc  Response: {"jsonrpc":"2.0","error":{"code":-32000,"message":"Looks like you disabled ancient block download, unfortunately the information you're trying to fetch doesn't exist in the db and is probably in the ancient blocks."},"id":17}.

It tries twice and then stops trying. I checked this error message: it was introduced in 2.6, in https://github.com/openethereum/openethereum/pull/10608
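
A minimal sketch of why the migration hangs, assuming a client that polls for the receipt and gives up after a small number of RPC errors. The `Poll` and `Outcome` types, the `wait_for_receipt` function and the error budget of one are all hypothetical, chosen only to mirror the "tries twice and stops" behaviour seen in the logs; this is not Truffle's or web3's actual code.

```rust
/// One `eth_getTransactionReceipt` reply, reduced to the three shapes seen
/// in this thread. Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Poll {
    /// `"result": {...}`: the transaction is mined (carries the block number).
    Receipt(u64),
    /// `"result": null`: the transaction is still pending.
    NotYetMined,
    /// `"error": {...}`: for example code -32000 from the block-gap check.
    RpcError(i64),
}

/// What the hypothetical client concludes.
#[derive(Debug, PartialEq)]
enum Outcome {
    Mined(u64),
    GaveUp { after_polls: usize },
    TimedOut,
}

/// Consumes poll results until a receipt arrives, tolerating at most
/// `max_errors` RPC errors (an assumed retry budget).
fn wait_for_receipt<I>(polls: I, max_errors: usize) -> Outcome
where
    I: IntoIterator<Item = Poll>,
{
    let mut errors = 0;
    for (i, poll) in polls.into_iter().enumerate() {
        match poll {
            Poll::Receipt(block) => return Outcome::Mined(block),
            // A null result is harmless: just poll again.
            Poll::NotYetMined => continue,
            // An error eats into the retry budget.
            Poll::RpcError(_) => {
                errors += 1;
                if errors > max_errors {
                    return Outcome::GaveUp { after_polls: i + 1 };
                }
            }
        }
    }
    Outcome::TimedOut
}
```

With `max_errors = 1`, two consecutive `-32000` replies end the wait before the receipt is ever seen, whereas a `null` result would simply have been polled again.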

@illya-havsiyevych
Author

illya-havsiyevych commented May 5, 2020

OK, from a quick check it looks like it DID help.

But can we check the code logic itself? I'm looking at https://github.com/openethereum/openethereum/blob/master/rpc/src/v1/impls/eth.rs#L836,
which handles the transaction receipt and for some reason calls .and_then(errors::check_block_gap(&*self.client, self.options));
without passing the receipt, so it only checks the block gap.
You'll need to help me with the Rust magic here, as I'm a newcomer: how does it get the response?

	move |response| {
		if response.is_none() {

https://github.com/openethereum/openethereum/blob/d778558dcccf25cd6a0654f16c73bde7758b0c32/rpc/src/v1/helpers/errors.rs#L257

or in master
https://github.com/openethereum/openethereum/blob/master/rpc/src/v1/helpers/errors.rs#L290

So I guess that using check_block_gap to validate a transaction_receipt response (https://github.com/openethereum/openethereum/blob/master/rpc/src/v1/impls/eth.rs#L826)
might be invalid.

@illya-havsiyevych
Author

... also, we (and others) use more than just rpc: we also use ws and ipc. But the config option allow_missing_blocks is available only for rpc.

@adria0

adria0 commented May 5, 2020

	move |response| {
		if response.is_none() {

check_block_gap is a higher-order generic function, i.e. a function that returns a parametrized function. AFAICS, in this case the variable response is of type T.
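
To make the "function that returns a function" point concrete, here is a simplified sketch of the pattern. `Options`, `RpcError`, `check_gap` and `lookup_receipt` are illustrative stand-ins rather than the real OpenEthereum definitions, and the exact condition inside the closure is an assumption based on the snippet quoted above.

```rust
/// Stand-in for the RPC options; only the flag discussed in this thread.
#[derive(Clone, Copy)]
struct Options {
    allow_missing_blocks: bool,
}

/// Stand-in for a JSON-RPC error.
#[derive(Debug, PartialEq)]
struct RpcError {
    code: i64,
    message: &'static str,
}

/// Higher-order generic function: returns a closure that can be handed to
/// `Result::and_then`. The closure works for any `T` (block, receipt, ...),
/// so all it can inspect is whether the lookup produced `Some` or `None`.
fn check_gap<T>(
    has_block_gap: bool,
    options: Options,
) -> impl Fn(Option<T>) -> Result<Option<T>, RpcError> {
    move |response| {
        // Assumed condition: a missing item plus a gap in the database
        // is reported as an error unless the flag says otherwise.
        if response.is_none() && has_block_gap && !options.allow_missing_blocks {
            return Err(RpcError {
                code: -32000,
                message: "item may be in the ancient blocks",
            });
        }
        Ok(response)
    }
}

/// Hypothetical receipt lookup: `None` means no receipt is stored (yet).
fn lookup_receipt(mined: bool) -> Result<Option<&'static str>, RpcError> {
    Ok(if mined { Some("receipt") } else { None })
}
```

A call such as `lookup_receipt(false).and_then(check_gap(true, options))` shows the effect: because the closure only sees `Option<T>`, a pending transaction's missing receipt and an item that sits in the block gap both arrive as `None` and are treated alike.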

@adria0 adria0 added F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. and removed F2-bug 🐞 The client fails to follow expected behavior. labels May 5, 2020
@illya-havsiyevych
Author

   > {
   >   "jsonrpc": "2.0",
   >   "id": 63,
   >   "method": "eth_getTransactionReceipt",
   >   "params": [
   >     "0x158c9d7827ecfd10f243d35c1a2186a1eee1ebed5b973b81910d06573c8d09a8"
   >   ]
   > }
 <   {
 <     "jsonrpc": "2.0",
 <     "result": {
 <       "blockHash": "0x25be28b3c382ffeb146e5e7589856169fac6c94d77b06a8165ad10b717b9ef9f",
 <       "blockNumber": "0x9077",
 <       "contractAddress": "0xd9096d2473506e7aa0686d3b95dc9ab33e684bc6",
 <       "cumulativeGasUsed": "0x2e043",
 <       "from": "0xcfa3ae1840e38d1e54b0ef6300d6e91b22964a75",
 <       "gasUsed": "0x2e043",
 <       "logs": [],
 <       "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
 <       "status": "0x1",
 <       "to": null,
 <       "transactionHash": "0x74293cb611600f3361bd7ec43db6d1acb8a27b76ae1db1bc6e282f18d366250c",
 <       "transactionIndex": "0x0"
 <     },
 <     "id": 60
 <   }

Here is a log (with allow_missing_blocks = true forced).

I'm assuming some compatible representation of transaction_receipt is passed to check_block_gap as response at https://github.com/openethereum/openethereum/blob/master/rpc/src/v1/helpers/errors.rs#L290

So why is response.is_none() true?
Is it really a transaction_receipt?

@illya-havsiyevych
Author

... and again, if response is expected to be the block response, then using check_block_gap is incorrect in this case and it has to be removed from https://github.com/openethereum/openethereum/blob/master/rpc/src/v1/impls/eth.rs#L826

@adria0

adria0 commented May 5, 2020

Here is a log (if I forced allow_missing_blocks = true).

How did you do it? Via configuration parameters or by modifying the code? Does the Truffle deployment work now? Using IPC/ws?

@adria0 adria0 closed this as completed May 5, 2020
@adria0 adria0 reopened this May 5, 2020
@illya-havsiyevych
Author

PS. any reason to close / reopen ?

@illya-havsiyevych
Author

illya-havsiyevych commented May 5, 2020

sec, will send a pull

@adria0

adria0 commented May 5, 2020

PS. any reason to close / reopen ?

Touchpad problems under Sway, mainly 😅

@illya-havsiyevych
Author

illya-havsiyevych commented May 5, 2020

@adria0 Please check adria0/oe_stuff#2
So using allow_missing_blocks = true avoids the issue with HTTP-based RPC ONLY.
All other providers (the most important ones for us) still have the issue.

@illya-havsiyevych
Author

PS. We have no plans to fork and fix this in the code ourselves.

@illya-havsiyevych
Author

Any updates here, please?

@adria0

adria0 commented May 19, 2020

@illya-havsiyevych, in the weekly call (it is open, you can participate) we talk about which issues to work on. (Very) generally speaking, I think that we prioritize

  • Critical issues
  • EIPs to implement for the next network upgrade
  • Issues that have a general impact

Some of the issues take weeks to solve, and the queue of items grows day by day, so we are asking the community for help with this.

Is it possible for you just to use the node without --no-ancient-blocks?

@illya-havsiyevych
Author

illya-havsiyevych commented May 19, 2020

In our use case we often assume that a node in gw-mode is lightweight, easy and fast to boot, and --no-ancient-blocks has a huge impact here, i.e.:

  • with it, a node in a network with 17m blocks is up, ready, and stops all background sync within minutes
  • without it, that takes hours

Community help: we might do it at some point, but Rust is not in our skill set, so it might take us some time to step in.

In any case, knowing a fair ETA is key. If F3-annoyance really means "you're never going to fix that", it will be a signal for us to react faster.

@adria0

adria0 commented May 19, 2020

In any case, knowing a fair ETA is key. If F3-annoyance really means "you're never going to fix that", it will be a signal for us to react faster.

I agree with you that the wiki should explain how tasks are chosen; that would help.
F3-annoyance also means "I'm going to fix it because there are no critical tasks to do", but I am not in that situation.
