
StorageKey - why use WriteBytesWithGrouping()? #793

Closed
ixje opened this issue Jun 3, 2019 · 16 comments
Labels
question Used in questions

Comments

@ixje
Contributor

ixje commented Jun 3, 2019

I'm doing an audit between neo-python and neo-cli and encountered an issue with one of the storage keys. Specifically, block 131179 of TestNet reports a new storage item being added:

{"state":"Added",
        "key":"2170e2e4a128ba6b287b14c1ad4be8d03bbf04f031313131313131313131313131313131003131313131313131313100000000000006",
        "value":"000f32323232323232323232323232323200"}]},

The key can be broken down into:

  • Scripthash: 2170e2e4a128ba6b287b14c1ad4be8d03bbf04f0
  • key_value: 313131313131313131313131313131310031313131313131313131
  • padding: 00000000000006 (to create 16 byte alignment)

If you look closely at the key_value you'll see there is a 00 inserted after 16 bytes. The actual key (as can be seen in the screenshot) does not have this 00 byte.
[Screenshot: 2019-06-03 at 16 12 36]
This deviation is caused by WriteBytesWithGrouping() in the code below

void ISerializable.Serialize(BinaryWriter writer)
{
    writer.Write(ScriptHash);
    writer.WriteBytesWithGrouping(Key);
}
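For reference, the grouping behaviour can be sketched in Python (my reconstruction of the Neo 2.x helper, assuming 16-byte groups where each full group is followed by a 0x00 marker byte and the final group is zero-padded with a trailing pad-count byte; the function name mirrors the C# one, but the code is illustrative):

```python
GROUP_SIZE = 16

def write_bytes_with_grouping(value: bytes) -> bytes:
    """Illustrative Python sketch of Neo 2.x WriteBytesWithGrouping."""
    out = bytearray()
    index, remain = 0, len(value)
    while remain >= GROUP_SIZE:
        out += value[index:index + GROUP_SIZE]
        out.append(0)               # marker byte after every full group
        index += GROUP_SIZE
        remain -= GROUP_SIZE
    out += value[index:]            # last (partial) group
    padding = GROUP_SIZE - remain
    out += b"\x00" * padding        # zero-pad to 16-byte alignment
    out.append(padding)             # final byte records the pad count
    return bytes(out)

# A 26-byte key of 0x31 bytes gets a 0x00 inserted after the first
# 16 bytes, plus zero padding and a trailing pad-count byte:
print(write_bytes_with_grouping(b"\x31" * 26).hex())
```

This reproduces the shape of the dump above: the inserted 00 after the first 16 bytes and the zero padding with a count byte at the end.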

My question is: why are we using this method for serialization instead of a normal write?

It gives unexpected behaviour when you use it (as done here in the StatesDumper plugin):

state["key"] = trackable.Key.ToArray().ToHexString();

I believe the key should be the real key, not some modified form.

@shargon
Member

shargon commented Jun 3, 2019

I think the same as you; it should be a regular write.

@erikzhang
Member

erikzhang commented Jun 3, 2019

It is for "Find()". See #200
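Why a plain length-prefixed write breaks Find() can be shown with a toy example: a length prefix (as WriteVarBytes produces) destroys byte-prefix ordering, so a prefix scan in the underlying ordered key-value store no longer matches logically-prefixed keys. This is a hypothetical simplified 1-byte length prefix, not the actual Neo serialization (though WriteVarBytes does use a single length byte for short values):

```python
def var_bytes(value: bytes) -> bytes:
    # hypothetical simplified 1-byte length prefix; real WriteVarBytes
    # uses a variable-length integer, but short keys look the same
    return bytes([len(value)]) + value

logical = [b"a", b"ab", b"abc"]
raw_store = sorted(logical)                        # keys stored as-is
var_store = sorted(var_bytes(k) for k in logical)  # length-prefixed keys

# Prefix scan for everything under b"a", as Storage.Find needs:
raw_matches = [k for k in raw_store if k.startswith(b"a")]
var_matches = [k for k in var_store if k.startswith(var_bytes(b"a"))]
# raw_matches finds all three keys; var_matches finds only the exact
# key b"\x01a", because b"\x02ab" does not start with b"\x01a".
```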

@erikzhang erikzhang added the question Used in questions label Jun 3, 2019
@ixje
Contributor Author

ixje commented Jun 4, 2019

Thanks 😞

I don't see this changing anymore for 2.x, but should we maybe change the approach for 3.0 to remove the confusing behaviour of ToArray() for StorageKey?

This would mean updating Storage_Put/Delete/Get instead of using WriteBytesWithGrouping.

@igormcoelho
Contributor

I always wanted to know this! hahaha :)
Well, is this a storage-specific thing, or is it necessarily part of the protocol? I mean, could other clients skip implementing this and still get the same results with Find?

@ixje
Contributor Author

ixje commented Jun 5, 2019

It's a client thing. Neo-python doesn't use it, and Find works fine because we fixed Put/Delete/Get instead.

@jsolman
Contributor

jsolman commented Jun 6, 2019

It has to do with the way the length is encoded. I had to deal with this in some of the plugins I wrote. It should be possible to get the length from LevelDB, though, and not need to encode the length into the key. I didn't take a deep dive to fix it, but it should be fixable for NEO 3.0. As long as the storage engine can save and report the key length (which it can), there isn't a good reason to do what is being done.

@ixje ixje closed this as completed Jun 14, 2019
Thacryba pushed a commit to simplitech/neo that referenced this issue Feb 17, 2020
* fix issue neo-project#791

* follow up

* follow up

* add example
@roman-khimov
Contributor

May I ask to reopen this issue? I think it is still relevant for NEO 3, and now, with the #1383 changes, it has a deeper impact on node compatibility. In neo-go we have implemented a simpler key serialization scheme without grouping and it works fine (some conversion is needed to compare our 2.0 state with neo-storage-audit, but that's a minor annoyance affecting only the state change dumps). But now, if MPT is to be added using this scheme, we will have to either change the key serialization format or do conversions for the MPT calculations. Neither option makes me happy, as I think this should be trivial to fix in the C# code (and it would simplify the C# code at the same time).

@shargon
Member

shargon commented Apr 13, 2020

Yes, we should create a standard. Could you share your neo-go implementation?

@roman-khimov
Contributor

We just don't do any grouping; we append the byte representation of the key to the scripthash, like this:
https://github.com/nspcc-dev/neo-go/blob/405fa9c411637a1b77a98132139e177279598332/pkg/core/dao/dao.go#L466

And it works well with Find: if there are keys "ab" and "ac" in the DB, Find returns both when searching for prefix "a".
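The neo-go scheme described above can be illustrated with a toy ordered store (helper names are hypothetical; the real implementation is at the link):

```python
script_hash = bytes.fromhex("2170e2e4a128ba6b287b14c1ad4be8d03bbf04f0")

def make_key(script_hash: bytes, key: bytes) -> bytes:
    # neo-go style: plain concatenation, no grouping, no length prefix
    return script_hash + key

# simulate an ordered key space, as LevelDB/BoltDB keep their keys
db = sorted(make_key(script_hash, k) for k in [b"ab", b"ac", b"b"])

def find(prefix: bytes):
    # emulate an ordered-KV prefix scan (what Storage.Find relies on)
    return [k for k in db if k.startswith(prefix)]

# matches the keys "ab" and "ac", but not "b"
found = find(make_key(script_hash, b"a"))
```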

@shargon
Member

shargon commented Apr 13, 2020

Please check this thread: #199. The problem for us is WriteVarBytes, which could produce undesired results. I think that the store should return a fixed size, without using WriteVarBytes, but it was designed for working with streams.

@erikzhang
Member

but it was designed for working with streams.

Exactly!

@roman-khimov
Contributor

roman-khimov commented Apr 13, 2020

Please check this thread #199 the problem for us is WriteVarBytes

I've seen that, but we do not use WriteVarBytes for keys, and I still can't see anything preventing the C# code from doing the same thing.

but it was designed for working with streams.

Well, if it's just to make this StorageKey ISerializable, then yes, it does make some sense, but then the question is: do we really need ISerializable here?

Update: I see DataCache needs it, still there should be some way.

I mean, from a pure DB design perspective (especially given that after #1383 every implementation will have to deal with this format one way or another), this doesn't seem to be necessary to do the job, so making this transformation a standard required for a proper MPT implementation doesn't seem to be a good solution to me; there should be some room for more efficient implementations.

@roman-khimov
Contributor

OK, so we have the stream-based ISerializable and DataCache using it both for keys and values. All good, but we have a variable-length StorageKey that can't use our nice WriteVarBytes because of Storage.Find. The StorageKey is only used as a key for the DB.

What I'm thinking is that every stream has an end, one way or another. So I can't see anything preventing us from creating a ReadTillEOF helper to be used for StorageKey. Writing was never really a problem; Write will do it well, the only problem is reading. But as we never read a StorageKey from the network, only from our own DB, simply reading until EOF (or even EndOfStreamException) shouldn't be an issue.
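The proposal can be sketched in Python (helper names are hypothetical; the point is only that serialization is a plain write and deserialization consumes the remainder of the stream):

```python
import io

def read_till_eof(stream) -> bytes:
    # read whatever remains in the stream; fine for keys read back
    # from the DB, where one stream holds exactly one StorageKey
    return stream.read()

def serialize_key(script_hash: bytes, key: bytes) -> bytes:
    # plain concatenation: fixed-size scripthash, then the raw key
    return script_hash + key

def deserialize_key(data: bytes):
    stream = io.BytesIO(data)
    script_hash = stream.read(20)   # fixed-size UInt160 part
    key = read_till_eof(stream)     # variable-size remainder
    return script_hash, key
```

The round trip works because the scripthash part has a fixed size, so no length information for the key needs to be stored at all.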

Pros:

  • ~5 lines for a ReadTillEOF helper would allow removing some tens of lines of otherwise unused ReadBytesWithGrouping/WriteBytesWithGrouping code
  • shorter keys!
  • less processing required
  • simplifies MPT calculation

Cons:

  • sorry, I don't see any

What do you think?

@erikzhang
Member

What I'm thinking about is that every stream has an end, one way or another.

A network stream has no end.

@roman-khimov
Contributor

Luckily we're not dealing with networking in this particular case. I do understand that it makes StorageKey somewhat impure with respect to the general properties expected from an ISerializable thing (for example, you can't put two StorageKey values into one stream and then read them back with till-EOF encoding), but at the same time it's a very special case where our normal encoding principles (like VarBytes) just don't work, and one way or another we have to make something special for this particular case. I think the till-EOF solution is better in that it's simpler and solves the problem.

Also, again, it's one thing for an implementation to have whatever encoding scheme it wants, and it's completely another thing to standardize something at the protocol level. So what I'm really concerned about is the MPT calculation (which is soon to become part of the protocol); I think it'd be nice to have a simpler encoding used for it, even if the BytesWithGrouping scheme is to remain in the C# implementation.

@erikzhang
Member

#1566


6 participants