trie: refactor stacktrie #28233
This change refactors stacktrie to separate the stacktrie itself from the internal representation of nodes: a stacktrie is not a recursive structure of stacktries, rather, a framework for representing and operating upon a set of nodes.
LGTM
trie/stacktrie.go
Outdated
func (st *StackTrie) Reset() {
	st.owner = common.Hash{}
	st.writeFn = nil
func (st *stNode) Reset() *stNode {
	st.key = st.key[:0]
	st.val = nil
	for i := range st.children {
		st.children[i] = nil
I spent a long time stuck on this.
Since we just set the children to nil, we could lose a lot of *stNodes that are not in the pool. However, I finally came to the conclusion that this is not an issue, since we only ever return *stNodes to the pool on:
- hashRec of extNodes, which have no children themselves (extNode)
- hashRec of branchNodes, where we hashRec'd the children beforehand, and thus already returned the *stNodes
So we will not lose allocations here (afaict)
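The ordering argument above can be sketched in isolation. This is a simplified model, not the PR's code: `node`, `nodePool`, `buildTree`, and the `returned`/`dropped` counters are hypothetical names, and `hashRec` here only models the pool handling (no hashing). It counts any non-nil child that `reset` would silently discard.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a simplified, hypothetical stand-in for stNode.
type node struct {
	key      []byte
	val      []byte
	children [16]*node
}

var (
	nodePool = sync.Pool{New: func() any { return new(node) }}
	returned int // nodes handed back to the pool
	dropped  int // non-nil children discarded by reset without being pooled
)

// reset clears the node; it counts children that would be lost to the pool.
func (n *node) reset() *node {
	n.key = n.key[:0]
	n.val = nil
	for i := range n.children {
		if n.children[i] != nil {
			dropped++
		}
		n.children[i] = nil
	}
	return n
}

// hashRec models only the pool handling: each child is processed first,
// detached from its parent, and then returned to the pool.
func hashRec(n *node) {
	for i, child := range n.children {
		if child == nil {
			continue
		}
		hashRec(child)
		n.children[i] = nil
		returned++
		nodePool.Put(child.reset())
	}
}

// buildTree creates root -> mid -> leaf.
func buildTree() *node {
	root := nodePool.Get().(*node)
	mid := nodePool.Get().(*node)
	leaf := nodePool.Get().(*node)
	mid.children[0] = leaf
	root.children[3] = mid
	return root
}

func main() {
	hashRec(buildTree())
	fmt.Println("returned:", returned, "dropped:", dropped)
}
```

Under these assumptions, by the time a node reaches `reset` its children have already been detached, so the nil-ing loop is a safety net rather than a source of lost allocations.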
Yeah, this is probably not even needed. I tested this:
if st.children[i] != nil {
panic(fmt.Sprintf("Child was %T", st.children[i]))
}
But none of the tests triggered it. So the loop could probably be removed, but OTOH I think it's sane to have it there, to prevent potential mem leaks. It kind of makes sense that the pool handling is robust in itself, without trusting too much that the "other parts" are correct.
Actually
trie/stacktrie.go
Outdated
}
return nil
func (stack *StackTrie) Reset() {
	stack.owner = (common.Hash{})
I would prefer to keep the owner and writeFn unchanged.
I guess the semantics of "Reset" are something we want to debate or discuss. Basically, there are two options:
1. Reset the referenced nodes but keep the settings for this trie (owner, writeFn, etc).
2. Reset the referenced nodes and set all the settings to nil.
The first option sounds better to me, though.
the owner and writeFn unchanged.
I think that is dangerous. The idea of Reset is that it can be called before entering it into the pool, where the object may live indefinitely. So we need to clear out anything which may hold memory references, and IMO the writeFn is very much a likely culprit to have references to local variables in local scopes.
The owner we might do away with completely, as you pointed out below.
It's stackTrie, not stackNode.
Well, regardless: an object that is newly Reset should not hold external references, IMO
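To make the two Reset semantics discussed in this thread concrete, here is a minimal sketch. The `trie` type, `resetKeepSettings`, `resetAll`, and `newWithBuffer` are hypothetical names for illustration, not the PR's API. The callback deliberately captures a large local buffer, which is the kind of reference a pooled object could keep alive.

```go
package main

import "fmt"

// trie is a hypothetical stand-in for StackTrie: settings plus a root.
type trie struct {
	owner   [32]byte
	writeFn func(path, blob []byte)
	root    *struct{}
}

// resetKeepSettings models option 1: drop the nodes, keep owner and writeFn.
func (t *trie) resetKeepSettings() {
	t.root = nil
}

// resetAll models option 2: drop the nodes and clear every setting.
func (t *trie) resetAll() {
	t.owner = [32]byte{}
	t.writeFn = nil
	t.root = nil
}

// newWithBuffer returns a trie whose callback captures a local 1 MiB buffer.
// As long as writeFn is reachable, so is the buffer.
func newWithBuffer() *trie {
	buf := make([]byte, 0, 1<<20)
	return &trie{
		writeFn: func(path, blob []byte) { buf = append(buf, blob...) },
		root:    new(struct{}),
	}
}

func main() {
	a := newWithBuffer()
	a.resetKeepSettings()
	fmt.Println("option 1 still holds callback:", a.writeFn != nil)

	b := newWithBuffer()
	b.resetAll()
	fmt.Println("option 2 still holds callback:", b.writeFn != nil)
}
```

With option 1, an object sitting in a pool would keep the closure, and whatever the closure captured, reachable for as long as the object lives there; option 2 leaves nothing external to retain.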
I think it's a good refactor at first glance. I have a few nitpicks and will check the details tomorrow.
Btw, we should run stackTrie fuzzer a bit to ensure nothing is broken.
lgtm, but please run fuzzer/stackTrie and maybe snap sync to ensure nothing is broken
Originally I think the binary marshalling was implemented so that we could pause/resume during sync. We didn't make use of that. @karalabe do you have any preference, whether to keep the feature or remove it? I am a bit loath to keep maintaining behaviour that we don't use (and have no reason to believe that anyone else is using either)
Deployed it on 5/6, successfully finished the snap sync.
This change refactors stacktrie to separate the stacktrie itself from the internal representation of nodes: a stacktrie is not a recursive structure of stacktries, rather, a framework for representing and operating upon a set of nodes. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This reverts commit b5cf9ab.
This change refactors stacktrie to separate the stacktrie itself from the internal representation of nodes: a stacktrie is not a recursive structure of stacktries, rather, a framework for representing and operating upon a set of nodes.
I think it makes it easier to reason about, and also easier to deal with shared assets such as a hasher, the 'owner', and 'writeFunc' by having an envelope, rather than passing these things along at every constructor.
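A rough sketch of the envelope idea, under simplifying assumptions: the lowercase `stackTrie`, `node`, `addChild`, `walk`, and `collect` names are hypothetical, the owner is a plain string rather than a hash, and the callback signature is invented for the example. The point is only that nodes carry no shared settings; the envelope supplies them when operating on the nodes.

```go
package main

import "fmt"

// node holds only structural data; it knows nothing about owner or writeFn.
type node struct {
	key      []byte
	val      []byte
	children []*node
}

// stackTrie is the envelope: shared settings live here, once.
type stackTrie struct {
	owner   string
	writeFn func(owner string, path, val []byte)
	root    *node
}

func newStackTrie(owner string, writeFn func(owner string, path, val []byte)) *stackTrie {
	return &stackTrie{owner: owner, writeFn: writeFn, root: &node{}}
}

// addChild needs no owner or callback argument, unlike a recursive design
// where every sub-trie constructor would have to receive them.
func (n *node) addChild(key, val []byte) *node {
	child := &node{key: key, val: val}
	n.children = append(n.children, child)
	return child
}

// walk visits every node and reports valued nodes via the envelope's writeFn.
func (t *stackTrie) walk(n *node, prefix []byte) {
	path := append(append([]byte{}, prefix...), n.key...) // copy to avoid aliasing
	if t.writeFn != nil && n.val != nil {
		t.writeFn(t.owner, path, n.val)
	}
	for _, child := range n.children {
		t.walk(child, path)
	}
}

// collect builds a small trie and records what the callback receives.
func collect() []string {
	var writes []string
	t := newStackTrie("acct", func(owner string, path, val []byte) {
		writes = append(writes, fmt.Sprintf("%s/%s=%s", owner, path, val))
	})
	a := t.root.addChild([]byte("a"), []byte("1"))
	a.addChild([]byte("b"), []byte("2"))
	t.walk(t.root, nil)
	return writes
}

func main() {
	fmt.Println(collect())
}
```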
This way of doing it also opens up some new optimizations: e.g. we could do without a global pool for nodes, and use a local non-threadsafe pool which is only accessible internally. (Example: 8418248)
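For illustration, a local non-threadsafe pool could be as simple as a free list owned by the trie. This sketch is not taken from the referenced commit; `localPool`, `get`, `put`, and the lowercase `stackTrie` are hypothetical names.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for stNode.
type node struct {
	key      []byte
	val      []byte
	children [16]*node
}

// reset clears the node so it holds no references when it is reused.
func (n *node) reset() *node {
	n.key = n.key[:0]
	n.val = nil
	for i := range n.children {
		n.children[i] = nil
	}
	return n
}

// localPool is a plain free list. It has no locking, so it must only be
// used by the single trie that owns it.
type localPool struct {
	free []*node
}

// get pops a node from the free list, or allocates one if the list is empty.
func (p *localPool) get() *node {
	if last := len(p.free) - 1; last >= 0 {
		n := p.free[last]
		p.free[last] = nil
		p.free = p.free[:last]
		return n
	}
	return new(node)
}

// put resets the node and pushes it onto the free list.
func (p *localPool) put(n *node) {
	p.free = append(p.free, n.reset())
}

// stackTrie owns its pool; nothing outside the trie can reach it.
type stackTrie struct {
	pool localPool
	root *node
}

func main() {
	var t stackTrie
	n := t.pool.get()
	n.val = []byte("x")
	t.pool.put(n)
	reused := t.pool.get()
	fmt.Println("same node reused:", reused == n, "value cleared:", reused.val == nil)
}
```

The trade-off in this sketch is that nodes are no longer shared between tries, in exchange for avoiding synchronization on every get and put.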
This PR is not yet complete, and it also does away with the binary serialization. Not sure if it's something we want to keep: unused code has a tendency to bitrot.
The binary serialization has been added back, in its own file. However, since the 'owner' of the trie is now moved to the top level and not repeated in every node, the format was changed. Thus, I removed the original constructor so that this format change must cause a compiler failure. But in general, I don't know if this really is something we want to keep, since we're not using it ourselves.
We don't have an awful lot of benchmarks for the stacktrie, afaict only one really, which indicates a slight improvement in this PR (as for why the mem went up, that's curious, I can probably get that down again).