Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: add code to witness when state object is accessed #30698

Merged
merged 1 commit into from
Oct 31, 2024

Conversation

lightclient
Copy link
Member

@lightclient lightclient commented Oct 30, 2024

I think the core code should generally be agnostic about the witness and the statedb layer should determine what elements need to be included in the witness. Because code is accessed via GetCode, and GetCodeLength, the statedb will always know when it needs to add that code into the witness.

The edge case is block hashes, so we continue to add them manually in the implementation of BLOCKHASH.

It probably makes sense to refactor statedb so we have a wrapped implementation that accumulates the witness, but this is a simpler change that makes #30078 less aggressive.

Copy link
Contributor

@jwasinger jwasinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One edge case I can think of where this could unnecessarily include code in the witness is accessing the balance of an account via the BALANCE opcode. There may be others, but I'm not sure.

That being said, the motivation behind this makes sense to me, it just seems like it may require a bit more refactoring within the statedb to get it right.

@jwasinger
Copy link
Contributor

I guess we don't need to attach code unless it's executed or the length is needed. The codehash can be trusted because we already trust the parent root.

@karalabe
Copy link
Member

I'm not sure this change is all that good. The original code does the minimum work and gathers the minimum data. This PR on one hand collects more data than needed (e.g. @jwasinger's case) and also will hit this code load path on every storage slot modification (we "get" the account as the first step).

@karalabe
Copy link
Member

Note, I'm happy to move the collector methods form the opcodes into the respective statedb calls, just not this high up the stack imo.

@namiloh
Copy link

namiloh commented Oct 30, 2024

Did you try putting it in a separate layer? Curious how that would work...

@lightclient
Copy link
Member Author

Ah right I put the collection way too high up in the stack. I've move it to GetCode and GetCodeSize.

Did you try putting it in a separate layer? Curious how that would work...

I started doing this and then realized I would need to come up with a more comprehensive interface for statedb since core/blockchain simply operates on *state.StateDB. It still seems possible to make it as a wrapper, but at least to fix separation concerns that were affecting my other PR, this change here should be enough.

Copy link
Member

@rjl493456442 rjl493456442 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

@rjl493456442 rjl493456442 added this to the 1.14.12 milestone Oct 31, 2024
Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes looks good to me and I approve.

It's a bit annoying, however, that remaining little witness-invocation inside the instructions.

I took a stab at removing it, got this far:

diff --git a/core/evm.go b/core/evm.go
index 5d3c454d7c..f700c54c9c 100644
--- a/core/evm.go
+++ b/core/evm.go
@@ -22,6 +22,7 @@ import (
 	"github.com/ethereum/go-ethereum/common"
 	"github.com/ethereum/go-ethereum/consensus"
 	"github.com/ethereum/go-ethereum/consensus/misc/eip4844"
+	"github.com/ethereum/go-ethereum/core/stateless"
 	"github.com/ethereum/go-ethereum/core/tracing"
 	"github.com/ethereum/go-ethereum/core/types"
 	"github.com/ethereum/go-ethereum/core/vm"
@@ -65,7 +66,7 @@ func NewEVMBlockContext(header *types.Header, chain ChainContext, author *common
 	return vm.BlockContext{
 		CanTransfer: CanTransfer,
 		Transfer:    Transfer,
-		GetHash:     GetHashFn(header, chain),
+		GetHash:     GetHashFn(header, chain, nil),
 		Coinbase:    beneficiary,
 		BlockNumber: new(big.Int).Set(header.Number),
 		Time:        header.Time,
@@ -91,12 +92,11 @@ func NewEVMTxContext(msg *Message) vm.TxContext {
 }
 
 // GetHashFn returns a GetHashFunc which retrieves header hashes by number
-func GetHashFn(ref *types.Header, chain ChainContext) func(n uint64) common.Hash {
+func GetHashFn(ref *types.Header, chain ChainContext, witness *stateless.Witness) func(n uint64) common.Hash {
 	// Cache will initially contain [refHash.parent],
 	// Then fill up with [refHash.p, refHash.pp, refHash.ppp, ...]
 	var cache []common.Hash
-
-	return func(n uint64) common.Hash {
+	getter := func(n uint64) common.Hash {
 		if ref.Number.Uint64() <= n {
 			// This situation can happen if we're doing tracing and using
 			// block overrides.
@@ -127,6 +127,16 @@ func GetHashFn(ref *types.Header, chain ChainContext) func(n uint64) common.Hash
 		}
 		return common.Hash{}
 	}
+	// If we need to build a witness: wrap the getter and also invoke the witness.
+	// Note: this method is only invoked with santized values in the allowed
+	// range of block numbers.
+	if witness != nil {
+		return func(n uint64) common.Hash {
+			witness.AddBlockHash(n)
+			return getter(n)
+		}
+	}
+	return getter
 }
 
 // CanTransfer checks whether there are enough funds in the address' account to make a transfer.
diff --git a/core/vm/instructions.go b/core/vm/instructions.go
index 45139e8217..c5dda9e4d5 100644
--- a/core/vm/instructions.go
+++ b/core/vm/instructions.go
@@ -445,9 +445,6 @@ func opBlockhash(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) (
 	}
 	if num64 >= lower && num64 < upper {
 		res := interpreter.evm.Context.GetHash(num64)
-		if witness := interpreter.evm.StateDB.Witness(); witness != nil {
-			witness.AddBlockHash(num64)
-		}
 		num.SetBytes(res[:])
 	} else {
 		num.Clear()
diff --git a/core/vm/interface.go b/core/vm/interface.go
index 9229f4d2cd..3fb5dc6ae7 100644
--- a/core/vm/interface.go
+++ b/core/vm/interface.go
@@ -20,7 +20,6 @@ import (
 	"math/big"
 
 	"github.com/ethereum/go-ethereum/common"
-	"github.com/ethereum/go-ethereum/core/stateless"
 	"github.com/ethereum/go-ethereum/core/tracing"
 	"github.com/ethereum/go-ethereum/core/types"
 	"github.com/ethereum/go-ethereum/params"
@@ -94,8 +93,6 @@ type StateDB interface {
 	AddLog(*types.Log)
 	AddPreimage(common.Hash, []byte)
 
-	Witness() *stateless.Witness
-
 	// Finalise must be invoked at the end of a transaction
 	Finalise(bool)
 }

If we drop that one, we can drop Witness from the interface. However, what's lacking in my code above is that when we construct the GetHash function, we don't have access to the witness. We typically construct the block context prior to instantiating a statedb.

However, wouldn't it be logical to have the block witness be a part of the block context?

I don't want to just graft a thing onto another thing, to make it work, but actually make sure to place it where it is sensible. WDYT? Can we take this PR one step further?

@karalabe karalabe merged commit 9afb18d into ethereum:master Oct 31, 2024
3 checks passed
GrapeBaBa pushed a commit to optimism-java/shisui that referenced this pull request Nov 2, 2024
I think the core code should generally be agnostic about the witness and
the statedb layer should determine what elements need to be included in
the witness. Because code is accessed via `GetCode`, and
`GetCodeLength`, the statedb will always know when it needs to add that
code into the witness.

The edge case is block hashes, so we continue to add them manually in
the implementation of `BLOCKHASH`.

It probably makes sense to refactor statedb so we have a wrapped
implementation that accumulates the witness, but this is a simpler
change that makes ethereum#30078 less aggressive.
holiman pushed a commit that referenced this pull request Nov 19, 2024
I think the core code should generally be agnostic about the witness and
the statedb layer should determine what elements need to be included in
the witness. Because code is accessed via `GetCode`, and
`GetCodeLength`, the statedb will always know when it needs to add that
code into the witness.

The edge case is block hashes, so we continue to add them manually in
the implementation of `BLOCKHASH`.

It probably makes sense to refactor statedb so we have a wrapped
implementation that accumulates the witness, but this is a simpler
change that makes #30078 less aggressive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants