Stateless/pure transaction processing #195

Closed
4 tasks done
pipermerriam opened this issue Dec 5, 2017 · 9 comments

@pipermerriam
Member

pipermerriam commented Dec 5, 2017

We need a pure function for applying transactions.

As inputs it would take:

  • the current state (including state root, tx index, etc)
  • the transaction to be processed
  • a witness proof for all of the parts of the state that the transaction can/will touch.

It would output the resulting state.

Implementation is up in the air, but the current leading idea is to process the transaction as normal, modifying the state as we go, then discard the state changes and return only the updated state. The underlying assumption is that our state database engine supports lazy fetching of parts of the overall state.
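
A minimal sketch of the shape such a function could take, assuming a toy model where the state is a mapping of account to balance and the witness is simply the pre-transaction balances of the touched accounts. The names `Transaction` and `apply_transaction` and the balance-only state are hypothetical illustrations, not py-evm APIs; a real witness would be a Merkle proof checked against the state root.

```python
from types import MappingProxyType
from typing import Mapping, NamedTuple


class Transaction(NamedTuple):
    # Hypothetical, simplified transaction: a plain value transfer.
    sender: str
    recipient: str
    value: int


def apply_transaction(
    state: Mapping[str, int],
    transaction: Transaction,
    witness: Mapping[str, int],
) -> Mapping[str, int]:
    """Return the state after `transaction` without mutating `state`."""
    touched = (transaction.sender, transaction.recipient)

    # The witness must cover every account the transaction touches.
    for account in touched:
        if account not in witness:
            raise KeyError(f"witness is missing account {account!r}")

    # Work on a scratch copy holding only the touched accounts.
    scratch = {account: witness[account] for account in touched}
    if scratch[transaction.sender] < transaction.value:
        raise ValueError("insufficient balance")
    scratch[transaction.sender] -= transaction.value
    scratch[transaction.recipient] += transaction.value

    # Merge the touched accounts into a fresh, read-only result.
    return MappingProxyType({**state, **scratch})


old_state = MappingProxyType({"alice": 10, "bob": 1, "carol": 5})
witness = {"alice": 10, "bob": 1}
new_state = apply_transaction(old_state, Transaction("alice", "bob", 3), witness)
```

The input mapping is never written to, so the caller decides whether to keep or drop the returned state.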


Tasks

@pipermerriam pipermerriam mentioned this issue Dec 5, 2017
15 tasks
@hwwhww hwwhww added the eth2.0 label Dec 5, 2017
@pipermerriam pipermerriam changed the title STUB: Stateless/pure transaction processing Stateless/pure transaction processing Dec 5, 2017
@pipermerriam
Member Author

Thoughts on implementation. I suspect that one of the first things this will need is a database wrapper which keeps track of all of the parts of the state trie which get touched as part of transaction processing. I think it would be good to do this as a subclass of evm.db.state.StateDB. It may also require some changes to the ChainDB class since it is the entry point for interacting with the state database.

  • The StateDB would be responsible for collecting all the touched keys.
  • The ChainDB would be responsible for persisting the set of all of the touched keys each time the statedb is used.
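
As a rough illustration of the tracking idea, here is a minimal sketch of a key-value wrapper that records which keys are read and written. `TrackingDB` and its method names are hypothetical and only loosely modelled on a get/set/exists style database interface; it is not the actual `evm.db.state.StateDB` API.

```python
from typing import Dict, Set


class TrackingDB:
    """Hypothetical key-value wrapper that records every key it touches."""

    def __init__(self, wrapped: Dict[bytes, bytes]) -> None:
        self._wrapped = wrapped
        self.reads: Set[bytes] = set()
        self.writes: Set[bytes] = set()

    def get(self, key: bytes) -> bytes:
        # Record the key before the lookup so failed reads are tracked too.
        self.reads.add(key)
        return self._wrapped[key]

    def set(self, key: bytes, value: bytes) -> None:
        self.writes.add(key)
        self._wrapped[key] = value

    def exists(self, key: bytes) -> bool:
        # An existence check also reveals state, so it counts as a read.
        self.reads.add(key)
        return key in self._wrapped

    @property
    def touched(self) -> Set[bytes]:
        return self.reads | self.writes


backing = {b"acct-1": b"\x05"}
db = TrackingDB(backing)
db.get(b"acct-1")
db.set(b"acct-2", b"\x01")
```

Keeping reads and writes in separate sets leaves room for the caller to build a witness from the reads and a state delta from the writes.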

@hwwhww
Contributor

hwwhww commented Dec 7, 2017

@pipermerriam

The ChainDB would be responsible for persisting the set of all of the touched keys each time the statedb is used.

Q1: Does ChainDB here mean the BaseChainDB class? Or do we have to abstract BaseChainDB and create a new ChainDB subclass?

Q2: Is the pure function for applying transactions meant for both the shard chain and the main chain?

If we give both types of chain a purified add_transaction function, one difference is that it returns State on the main chain and StateObj on the shard chain, where StateObj only contains properties.

Q3: What would the purified function need?

In our previous Q&A, you mentioned the minimal changes to the current apply_transaction API: "That statement was in reference to these pieces of functionality and anything that allows it to access external APIs directly like the chaindb. This would not be a significant architecture change."

Do you mean we modify evm.vm.base.VM.apply_transaction?
^^^^ I may be completely mistaken, but if so, that is for the non-stateless case:
evm.vm.base.VM.apply_transaction: py-evm/base.py at master · ethereum/py-evm · GitHub

+  def apply_transaction(self, prev_state, transaction, db):
+       self.chaindb = db

        # ....somehow update it to be pure function....

        computation = self.execute_transaction(transaction)
        self.clear_journal()
+       state = self.block.add_transaction(transaction, computation, db)
        return computation, state
# Usage: use VM as a disposable container object?
vm = <latest_fork_vm>(header=header, chaindb=chaindb)
db = chaindb.clone()
statedb = chaindb.get_state_db(header.state_root, read_only=False, stateless=True)
prev_state = State(statedb, root_hash=prev_state_obj.root_hash, read_only=False)
# apply_transaction returns a (computation, state) pair
computation, state = vm.apply_transaction(prev_state, transaction, db)

On the other hand, note that @vbuterin suggested that:

  1. A chain object should NOT be necessary to process a block
  2. Recent headers and hashes should be part of the state object

I am trying to combine both of your suggestions and hope we can get a clearer picture of the ultimate goal, the true pure function:

  1. Update evm.vm.rlp.block.BaseBlock
    • Remove BaseBlock.add_transaction function
  2. Update evm.vm.forks.frontier.blocks.FrontierBlock
    • Remove FrontierBlock.add_transaction function
  3. For main chain, add
    • Pure evm.vm.base.apply_transaction function
    • Pure evm.vm.forks.byzantium.blocks.add_transaction function
  4. For shard chain, add
    • Pure evm.vm.base.apply_shard_transaction function
    • Pure evm.vm.sharding.collations.add_transaction function
    • Pure evm.vm.base.apply_shard_transaction_stateless function
    • Pure evm.vm.sharding.collations.add_transaction_stateless function
  5. Description of add_transaction_stateless
    1. Design of add_transaction:

    2. apply_transaction triggers add_transaction

      def apply_transaction(prev_state_obj, chaindb, tx):
          # Open a stateless (tracking) state db at the previous state root
          statedb = chaindb.get_state_db(prev_state_obj.root_hash, read_only=False, stateless=True)
          state = State(statedb, root_hash=prev_state_obj.root_hash, read_only=False)
          add_transaction(state, tx)
          state_obj = StateObj(root=state.root_hash.....)

          # Return the new state object plus the read set and the write set
          return state_obj, statedb.get_reads(), statedb.get_writes()
    3. How to call apply_transaction

      # In apply_collation
      # ......
      for tx in block.transactions:
          state_obj, _reads, _writes = apply_transaction(state_obj, db, tx)
          db = union(db, _writes)
      # .....

Q4: I want to make sure when ChainDB would need to store the set of all of the touched keys.

  1. Main chain client: it doesn't need a witness to apply a transaction
  2. Shard archival node:
    • Cache recent tx / collation witness data
      • ^^^^^ Is this the only case where ChainDB stores the touched keys?
    • [TBD] Does it need to cache visited account key-value pairs?
  3. Shard stateless client:
    • The touched keys (read and write sets) would be unioned while processing txs, so the ChainDB may only need to store the latest union set.

Q5: If the scenario in Q4 is right, then regarding updating BaseChainDB, my instinct for the direction is:

Updating BaseChainDB.get_state_db(state_root, read_only)

    def get_state_db(self, state_root, read_only, stateless=False):
        db = StateDB(self.db) if stateless else self.db
        return State(db=db, root_hash=state_root, read_only=read_only)

And only the shard client would call this function.


Sorry for so many questions! Thank you for your time.

@pipermerriam
Member Author

pipermerriam commented Dec 8, 2017

Q1: Does ChainDB here mean the BaseChainDB class?

My bad, anywhere you see ChainDB I mean BaseChainDB.

@pipermerriam
Member Author

pipermerriam commented Dec 8, 2017

Q2: Is the pure function for applying transactions meant for both the shard chain and the main chain?

This makes sense but I didn't see a question. If there is one can you clarify?

@pipermerriam
Member Author

pipermerriam commented Dec 8, 2017

Q3: What would the purified function need?

Everything you state under this section looks good and in line with my thinking. Removing the transaction logic from the Block objects seems like a nice isolated first step that can be done independently. I would suggest moving that API up into the VM class as VM.add_transaction_to_block(block, transaction). This could even be implemented in a pure form such that it doesn't mutate the block object but rather initializes a new one and returns it.
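
A minimal sketch of what a pure `add_transaction_to_block` could look like, assuming a stripped-down immutable block. The `Block` type here is a hypothetical stand-in (transactions are plain strings), not the real RLP block class, and the function is written at module level rather than as a VM method to keep the example short.

```python
from typing import NamedTuple, Tuple


class Block(NamedTuple):
    # Hypothetical, stripped-down block: just a number and its transactions.
    number: int
    transactions: Tuple[str, ...] = ()


def add_transaction_to_block(block: Block, transaction: str) -> Block:
    """Return a new block with `transaction` appended; `block` is unchanged."""
    return block._replace(transactions=block.transactions + (transaction,))


empty = Block(number=1)
with_tx = add_transaction_to_block(empty, "tx-0")
```

Because the original block is left untouched, a failed or speculative transaction can be dropped by simply not using the returned block.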

RE: vbuterin's comment: A chain object should NOT be necessary to process a block

Yes and No.

  • Yes in that the Chain class is largely just a convenience wrapper around the VM class when it comes to applying transactions.
  • No, in that the list of previous headers may cross a fork boundary (early headers are under fork rules A, later headers are under fork rules B). In this case you'll need something above the VM to be able to retrieve the appropriate headers for the previous VM rules. This currently shouldn't be an issue since all of the VM classes share the same header RLP object, but I suspect that will change at some point, so we should be prepared for that.
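
To make the fork-boundary point concrete, here is a small sketch of the kind of lookup that would have to live above the VM. The schedule, the block numbers, and the names `FORK_SCHEDULE`, `rules_for_block`, and `rules_for_headers` are all made up for illustration and are not part of py-evm.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# Hypothetical fork schedule: (first block number, rule-set name), sorted.
FORK_SCHEDULE: Sequence[Tuple[int, str]] = (
    (0, "fork-rules-A"),
    (100, "fork-rules-B"),
)


def rules_for_block(block_number: int) -> str:
    """Return the name of the rule set active at `block_number`."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    starts = [start for start, _ in FORK_SCHEDULE]
    # Index of the last fork whose starting block is <= block_number.
    index = bisect_right(starts, block_number) - 1
    return FORK_SCHEDULE[index][1]


def rules_for_headers(block_numbers: Sequence[int]) -> List[Tuple[int, str]]:
    """Pair each ancestor block number with the rules needed to decode it."""
    return [(number, rules_for_block(number)) for number in block_numbers]


# Ancestors of block 101 that straddle the hypothetical boundary at 100.
ancestors = rules_for_headers([98, 99, 100])
```

A single VM only knows its own rules, so something with access to the whole schedule has to pick the right decoder for each ancestor header.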

@pipermerriam
Member Author

Q4: I want to make sure when ChainDB would need to store the set of all of the touched keys.

Everything you say here is in line with my understanding.

I'm not sure if this is the right approach, but it may be useful to

@pipermerriam
Member Author

Q5: If the scenario in Q4 is right, then regarding updating BaseChainDB, my instinct for the direction is:

This looks like a solid approach but I'll point the following out. The StateDB object is ephemeral in that it comes into existence as a context manager when accessing the state is necessary and then is discarded after. That means that the ChainDB will need to be responsible for persisting the touched keys. This may be fine if the db instance passed into the StateDB is where the tracking occurs, at which point the StateDB can be blissfully unaware that all of the keys it touches are being tracked.
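
A rough sketch of that arrangement, assuming the tracking lives in the db instance owned by the chain database while the state view is created and thrown away inside a context manager. `TrackingStore`, `EphemeralStateDB`, and `ToyChainDB` are hypothetical names for illustration, not the real BaseChainDB or StateDB classes.

```python
from contextlib import contextmanager
from typing import Dict, FrozenSet, Iterator, Set


class TrackingStore:
    """Hypothetical backing store that notes each key read or written."""

    def __init__(self, data: Dict[bytes, bytes]) -> None:
        self._data = dict(data)
        self.touched: Set[bytes] = set()

    def get(self, key: bytes) -> bytes:
        self.touched.add(key)
        return self._data[key]

    def set(self, key: bytes, value: bytes) -> None:
        self.touched.add(key)
        self._data[key] = value


class EphemeralStateDB:
    """Stand-in for a short-lived state view; unaware of any tracking."""

    def __init__(self, db: TrackingStore) -> None:
        self._db = db

    def get_balance(self, account: bytes) -> int:
        return int.from_bytes(self._db.get(account), "big")

    def set_balance(self, account: bytes, balance: int) -> None:
        self._db.set(account, balance.to_bytes(32, "big"))


class ToyChainDB:
    """Owns the tracking store, so touched keys outlive each state view."""

    def __init__(self, data: Dict[bytes, bytes]) -> None:
        self.db = TrackingStore(data)

    @contextmanager
    def state_db(self) -> Iterator[EphemeralStateDB]:
        # The view is created on entry and dropped when the block exits.
        yield EphemeralStateDB(self.db)

    @property
    def touched_keys(self) -> FrozenSet[bytes]:
        return frozenset(self.db.touched)


chaindb = ToyChainDB({b"alice": (10).to_bytes(32, "big")})
with chaindb.state_db() as state:
    state.set_balance(b"bob", state.get_balance(b"alice"))
```

After the `with` block the state view is gone, but `chaindb.touched_keys` still reports both accounts.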

@pipermerriam
Member Author

@hwwhww I think I answered everything you asked. Please follow up if you need clarification on anything.

@hwwhww
Contributor

hwwhww commented Jan 13, 2018

close via #247

@hwwhww hwwhww closed this as completed Jan 13, 2018