Rework decoding of `Box`es, `Rc`s, `Arc`s, arrays and enums (stack overflow fix) #426

koute · 2023-04-20T03:25:20Z

This PR reworks how we decode Boxes etc. and arrays; in particular it fixes three issues:

Trying to deserialize a Box which contains a big array doesn't overflow the stack anymore. This is fixed by first preallocating an empty Box and then directly deserializing into it, instead of first deserializing the value on the stack and moving it into the Box after it's deserialized.
Trying to deserialize big nested enums doesn't overflow the stack anymore. AFAIK this is due to an upstream bug in rustc (Match expressions use O(n) stack space with n branches in debug mode rust-lang/rust#34283) and one workaround which seems to work is to just wrap the body of each match with a lambda and immediately call it and return the value. I've also added a test for this issue.
Elements of partially read arrays are now properly dropped if the whole array can't be decoded.
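
The enum workaround described above can be sketched as follows. This is a minimal illustration, not the code generated by the derive macro: the `Shape` enum, the `decode_shape` and `read_u32` functions, and the byte-slice input are all hypothetical. Each match arm body is wrapped in a closure that is called immediately, so the locals of an arm belong to the closure's own frame rather than to the frame of the function containing the `match`.

```rust
#[derive(Debug, PartialEq)]
enum Shape {
	Point,
	Circle(u32),
	Rect(u32, u32),
}

type DecodeResult = Result<Shape, &'static str>;

/// Reads a little-endian `u32` starting at `offset`.
fn read_u32(bytes: &[u8], offset: usize) -> Result<u32, &'static str> {
	let chunk = bytes.get(offset..offset + 4).ok_or("not enough data")?;
	let mut buf = [0u8; 4];
	buf.copy_from_slice(chunk);
	Ok(u32::from_le_bytes(buf))
}

/// Decodes one tag byte followed by the fields of the selected variant.
fn decode_shape(input: &[u8]) -> DecodeResult {
	let (&tag, rest) = input.split_first().ok_or("empty input")?;
	match tag {
		// Every arm is a closure invoked on the spot; the arm's temporaries
		// live in the closure's frame, not in `decode_shape`'s frame.
		0 => (|| -> DecodeResult { Ok(Shape::Point) })(),
		1 => (|| -> DecodeResult { Ok(Shape::Circle(read_u32(rest, 0)?)) })(),
		2 => (|| -> DecodeResult {
			Ok(Shape::Rect(read_u32(rest, 0)?, read_u32(rest, 4)?))
		})(),
		_ => Err("unknown variant"),
	}
}
```

The sketch only shows the shape of the transformation; whether it reduces stack usage in a given build depends on the compiler behaviour described in rust-lang/rust#34283.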

I've also updated Cargo.lock and updated to criterion 0.4 while I was at it.

Fixes #419

Fixes #425

I'd prefer someone else to approve.

koute · 2023-05-08T06:59:41Z

I've resolved conflicts, bumped version from 3.5.0 to 3.6.0 (since 3.5.0 was released in the meantime) and cleaned up the README.

I've also cleaned up our feature flags semantics:

The max-encoded-len now doesn't automatically enable the derive macro reexports nor pull in the parity-scale-codec-derive dependency.
Enabling the parity-scale-codec-derive without enabling the derive feature doesn't reexport the macros anymore; derive is now necessary.

These two issues are why @ggwpez made #429; now with these changes that PR should not be necessary anymore. (: (The issue was that the derive macros were accidentally imported twice, not that the derive macro and the trait were imported with the same name - that compiles just fine, so importing the derive macros and the traits under different names is not necessary.)

These are technically breaking changes, but max-encoded-len is explicitly documented as unstable, and I'd argue the derive macro reexports working without the derive feature enabled was a bug (everyone should have been enabling the derive feature), so I didn't bump the major version.

Should be good to go.

ggwpez

These two issues are why @ggwpez made #429; now with these changes that PR should not be necessary anymore. (: (The issue was that the derive macros were accidentally imported twice, not that the derive macro and the trait were imported with the same name - that compiles just fine, so importing the derive macros and the traits under different names is not necessary.)

Could you point to the line of how this is fixed now?
AFAIK it was importing scale::Decode (macro + trait) and scale_derive::Decode (macro).
So yes, the macro was imported twice, but without the derive feature the macro was not available for import from scale.

derive/src/decode.rs

koute · 2023-05-08T15:03:09Z

Could you point to the line of how this is fixed now?

The use parity_scale_codec_derive::{Encode, Decode}; was feature gated on whether the derive feature is enabled.
The reexport of Encode and Decode was feature gated on whether the parity-scale-codec-derive feature is enabled.
Enabling the max-encoded-len feature also implicitly enabled parity-scale-codec-derive.

So basically the use parity_scale_codec_derive::{Encode, Decode}; was enabled because the derive feature was off, which was taken to mean that the derive reexports were off too. But the reexports weren't actually controlled by the derive feature; they were controlled by the parity-scale-codec-derive feature, which was transitively enabled by the max-encoded-len feature. So we ended up with the derive macro imported twice: once explicitly with use and once through the parity-scale-codec reexport (where it overlaps with the traits, so importing the trait also automatically imports the macro).

kpp · 2023-06-13T16:17:57Z

src/codec.rs

+			let ptr: *mut u8 = ptr.cast();
+			let bytesize = core::mem::size_of::<T>().checked_mul(N).expect("array lengths are sane");
+
+			// TODO: This is potentially slow; it'd be better if `Input` supported


I would love to see specialization here for primitive types using [0; N] on the stack:

use num::{Zero, zero};

fn x<T: Zero + Copy, const N: usize>() -> [T; N] {
	let z = zero::<T>();
	[z; N]
}

So you don't have to init MaybeUninit<[u8; N]> with zeroes manually.

Hm, sorry, I'm not sure I understand what you're suggesting here. What exactly would this change? (: They'd have to be zeroed anyway as long as Input accepts a &mut [u8] instead of a &mut [MaybeUninit<u8>].

They'd have to be zeroed anyway

Yes, but they will be zeroed by compiler and there will be no need for an unsafe code to memzero them.

The previous implementation used that trick for i8/u8, see let mut array: [u8; N] = [0; N];. Still this comment is not a blocker, you can ignore it/implement in another PR.

Okay, I see what you mean now!

Hmm... in which case it'd probably be better to just make it possible to read into the array without initializing it. (:
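
The two options discussed in this thread can be put side by side in a small sketch. It is only an illustration under stated assumptions: the free function `read_exact` stands in for an `Input`-style reader that takes a `&mut [u8]`, and `read_array` and `read_into_uninit` are hypothetical names. The first variant lets the compiler zero a stack array with `[0; N]`; the second zeroes an uninitialized destination by hand so that it is sound to hand it to the reader as a `&mut [u8]`.

```rust
use std::mem::MaybeUninit;

/// Stand-in for a reader that can only fill an initialized `&mut [u8]`.
fn read_exact(input: &mut &[u8], into: &mut [u8]) -> Result<(), &'static str> {
	if input.len() < into.len() {
		return Err("not enough data");
	}
	let bytes: &[u8] = *input;
	let (head, tail) = bytes.split_at(into.len());
	into.copy_from_slice(head);
	*input = tail;
	Ok(())
}

/// Safe variant: the compiler zeroes the array, which lives on the stack.
fn read_array<const N: usize>(input: &mut &[u8]) -> Result<[u8; N], &'static str> {
	let mut array = [0u8; N];
	read_exact(input, &mut array)?;
	Ok(array)
}

/// In-place variant: the destination may live on the heap, but it has to be
/// zeroed manually before a `&mut [u8]` pointing at it may be created.
fn read_into_uninit<const N: usize>(
	input: &mut &[u8],
	dst: &mut MaybeUninit<[u8; N]>,
) -> Result<(), &'static str> {
	let ptr = dst.as_mut_ptr() as *mut u8;
	// SAFETY: `ptr` points at `N` writable bytes owned by `dst`. After
	// `write_bytes` all of them are initialized, so forming a `&mut [u8]`
	// over them is sound.
	unsafe {
		ptr.write_bytes(0, N);
		read_exact(input, std::slice::from_raw_parts_mut(ptr, N))
	}
}
```

Both variants pay for the zeroing; a reader that accepted `&mut [MaybeUninit<u8>]` would be needed to avoid it, which is what the last reply suggests.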

kpp · 2023-06-13T16:36:52Z

src/codec.rs

+					//         and we only drop at most `count` elements here,
+					//         so all of the elements we drop here are valid.
+					unsafe {
+						item.assume_init_drop();


How about https://doc.rust-lang.org/stable/src/core/array/mod.rs.html#900?

That's not stable, but I'll add a TODO comment.
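
The partial-drop behaviour under review here can be sketched with a drop guard. This is a simplified illustration, not the code from src/codec.rs: it builds the array on the stack and returns it by value, whereas the PR decodes into a caller-provided destination. The names `PartialArrayGuard`, `decode_array`, `Tracked`, and `dropped_after_failure` are hypothetical.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Drops the first `count` slots if decoding is abandoned part-way through.
struct PartialArrayGuard<'a, T, const N: usize> {
	slots: &'a mut [MaybeUninit<T>; N],
	count: usize,
}

impl<'a, T, const N: usize> Drop for PartialArrayGuard<'a, T, N> {
	fn drop(&mut self) {
		for item in &mut self.slots[..self.count] {
			// SAFETY: exactly the first `count` slots have been written.
			unsafe { item.assume_init_drop() };
		}
	}
}

/// Decodes `N` elements one by one; `decode_one` receives the element index.
fn decode_array<T, E, F, const N: usize>(mut decode_one: F) -> Result<[T; N], E>
where
	F: FnMut(usize) -> Result<T, E>,
{
	let mut slots: [MaybeUninit<T>; N] = std::array::from_fn(|_| MaybeUninit::uninit());
	let mut guard = PartialArrayGuard { slots: &mut slots, count: 0 };
	for index in 0..N {
		// On error the guard is dropped here and cleans up what was decoded.
		let value = decode_one(index)?;
		guard.slots[index].write(value);
		guard.count += 1;
	}
	// Every slot is initialized, so the guard must not drop anything.
	std::mem::forget(guard);
	// SAFETY: all `N` slots were written in the loop above.
	Ok(slots.map(|slot| unsafe { slot.assume_init() }))
}

/// Element type that counts how many times it has been dropped.
struct Tracked(Rc<Cell<usize>>);

impl Drop for Tracked {
	fn drop(&mut self) {
		self.0.set(self.0.get() + 1);
	}
}

/// Fails at index 2 of 4 and reports how many elements were dropped.
fn dropped_after_failure() -> usize {
	let drops = Rc::new(Cell::new(0));
	let result: Result<[Tracked; 4], &str> = decode_array(|index| {
		if index < 2 { Ok(Tracked(drops.clone())) } else { Err("out of input") }
	});
	assert!(result.is_err());
	drops.get()
}
```

Without the guard, the two elements decoded before the failure would be leaked, because dropping a `MaybeUninit<T>` never runs `T`'s destructor.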

bkchr

Nice implementation, I like :D

Left some minor comments on things that should be improved. In particular, the story around DecodeContext and decode_into is not really well explained. Currently this isn't callable by someone outside of the crate, which is fine, but there should be a note explaining why.

It would also be nice to create some issue for the assume_init stuff and the best would be to directly link to the Rust issues.

Cargo.toml

bkchr · 2023-06-14T08:10:57Z

src/codec.rs

+		// TODO: This is inefficient; use `Rc::new_uninit` once that's stable.
+		Box::<T>::decode(input).map(|output| output.into())


Could we not just put the code for the Box implementation into some macro and reuse it here? 🙈

Could we not just put the code for the Box implementation into some macro and reuse it here? 🙈

That'd be tricky because without ::new_uninit we have no good way to preallocate the uninitialized memory for the Rc/Arc.

In case of a Box<T> the only thing that the memory holds is a T, so we can get its layout and manually allocate the memory.

In case of Rc/Arc there are also the refcounts allocated in the same chunk of memory, and AFAIK there's no good way to get a Layout that could be used to manually preallocate that memory and pass it to Rc::from_raw. One way we could do it would be to just hardcode it (Rc<T> is basically equivalent to Box<(usize, usize, T)>), and then verify in a build script that it's correct (in case it suddenly changes, since it's an internal implementation detail), and if it is then do it the fast way, and if it isn't then emit a compile-time warning and do it the slow way. So for all practical purposes that'd work and be safe, but obviously it is very hacky and I don't really like it. And I'm not convinced it's worth it. (As opposed to just waiting until the new_uninit methods are stabilized.)
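
The difference between the two cases can be sketched as follows. This is a minimal illustration assuming a plain byte-slice input; `new_uninit_box`, `decode_boxed_array`, and `decode_rc_array` are hypothetical names and not the crate's API. For a `Box<T>` the allocation holds only a `T`, so `Layout::new` is enough to preallocate it and write the bytes straight to the heap. For an `Rc<T>` the sketch falls back to decoding a `Box` first and converting, which costs a second allocation and a copy.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::mem::MaybeUninit;
use std::ptr::NonNull;
use std::rc::Rc;

/// Allocates heap memory for a `T` without ever building a `T` on the stack.
fn new_uninit_box<T>() -> Box<MaybeUninit<T>> {
	let layout = Layout::new::<MaybeUninit<T>>();
	let ptr = if layout.size() == 0 {
		// Zero-sized types need no allocation; a dangling aligned pointer is valid.
		NonNull::<MaybeUninit<T>>::dangling().as_ptr()
	} else {
		// SAFETY: the layout has a non-zero size.
		let raw = unsafe { alloc(layout) };
		if raw.is_null() {
			handle_alloc_error(layout);
		}
		raw.cast::<MaybeUninit<T>>()
	};
	// SAFETY: `ptr` is dangling for a ZST or came from the global allocator
	// with the layout of `MaybeUninit<T>`, which needs no initialization.
	unsafe { Box::from_raw(ptr) }
}

/// Copies the first `N` input bytes directly into a preallocated `Box`.
fn decode_boxed_array<const N: usize>(input: &[u8]) -> Result<Box<[u8; N]>, &'static str> {
	if input.len() < N {
		return Err("not enough data");
	}
	let mut boxed = new_uninit_box::<[u8; N]>();
	// SAFETY: the source has at least `N` bytes, the destination has room for
	// exactly `N` bytes, and the two regions cannot overlap.
	unsafe {
		std::ptr::copy_nonoverlapping(input.as_ptr(), boxed.as_mut_ptr() as *mut u8, N);
	}
	// SAFETY: all `N` bytes are initialized, and `MaybeUninit<T>` has the same
	// layout as `T`, so the pointer cast is valid.
	Ok(unsafe { Box::from_raw(Box::into_raw(boxed) as *mut [u8; N]) })
}

/// Slow path for `Rc`: decode into a `Box`, then move into a new allocation.
fn decode_rc_array<const N: usize>(input: &[u8]) -> Result<Rc<[u8; N]>, &'static str> {
	decode_boxed_array::<N>(input).map(|boxed| boxed.into())
}
```

The `Rc` path never puts the array on the stack either, but it allocates twice; that is the inefficiency the TODO in the diff refers to.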

bkchr · 2023-06-14T08:14:44Z

src/codec.rs

+	///
+	/// This is enforced by requiring the implementation to return a [`DecodeFinished`]
+	/// which can only be created by calling [`DecodeContext::assert_decoding_finished`] which is `unsafe`.
+	fn decode_into<I: Input>(ctx: DecodeContext, input: &mut I, dst: &mut MaybeUninit<Self>) -> Result<DecodeFinished, Error> {


Is this really some public API? DecodeContext cannot be constructed outside of this crate. If this should not be used by someone external, we should put a #[doc(hidden)] above it.

Ahh, you may want people to be able to implement it, but not call it themselves?

Ahh, you may want people to be able to implement it, but not call it themselves?

It wasn't entirely a deliberate "let's block it", but with no immediate use case in mind I just didn't make it publicly constructible.

Anyhow, it's a moot point now though as I nuked the DecodeContext type anyway. :P
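
The proof-token idea quoted in the doc comment above can be sketched like this. It is an illustration under assumptions, not the crate's actual definitions: the `DecodeInto` trait, the byte-slice input, and the string errors are hypothetical, and placing the `unsafe` constructor on `DecodeFinished` itself is an assumption about how things look once `DecodeContext` is gone. The point is that `decode_into` can only return `Ok` by producing a token, and the token can only be produced through an `unsafe` call in which the implementer promises that `dst` is fully initialized.

```rust
use std::mem::MaybeUninit;

mod decode_finished {
	use std::marker::PhantomData;

	/// Zero-sized proof that a destination has been fully initialized.
	/// The private field keeps code outside this module from constructing it.
	pub struct DecodeFinished(PhantomData<()>);

	impl DecodeFinished {
		/// # Safety
		///
		/// The caller must have fully initialized the destination it was given.
		pub unsafe fn assert_decoding_finished() -> Self {
			DecodeFinished(PhantomData)
		}
	}
}
use decode_finished::DecodeFinished;

trait DecodeInto: Sized {
	/// Decodes directly into `dst`; returning `Ok` requires a proof token.
	fn decode_into(
		input: &mut &[u8],
		dst: &mut MaybeUninit<Self>,
	) -> Result<DecodeFinished, &'static str>;

	/// By-value decoding built on top of `decode_into`.
	fn decode(input: &mut &[u8]) -> Result<Self, &'static str> {
		let mut dst = MaybeUninit::uninit();
		let _finished: DecodeFinished = Self::decode_into(input, &mut dst)?;
		// SAFETY: a `DecodeFinished` was returned, so `dst` is initialized.
		Ok(unsafe { dst.assume_init() })
	}
}

impl DecodeInto for u16 {
	fn decode_into(
		input: &mut &[u8],
		dst: &mut MaybeUninit<Self>,
	) -> Result<DecodeFinished, &'static str> {
		if input.len() < 2 {
			return Err("not enough data");
		}
		let bytes: &[u8] = *input;
		let (head, tail) = bytes.split_at(2);
		dst.write(u16::from_le_bytes([head[0], head[1]]));
		*input = tail;
		// SAFETY: `dst` was written just above.
		Ok(unsafe { DecodeFinished::assert_decoding_finished() })
	}
}
```

An implementation that forgets to initialize `dst` has no safe way to return `Ok`, which moves the obligation onto a visible `unsafe` block in the implementer's code.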

koute added 4 commits April 20, 2023 12:01

Rework decoding of Boxes, Rcs, Arcs and arrays

3de9943

Update changelog

3ec39d5

Bump version to 3.5.0

7c448c4

Fix typo

0f4aa11

koute requested review from andresilva, kpp, melekes, michalkucharczyk, skunert, bkchr, dmitry-markin, davxy, ggwpez and altonen April 20, 2023 03:25

koute mentioned this pull request Apr 20, 2023

Stack overflow issue in scalecodec while decoding inputs #425

Closed

koute added 6 commits April 20, 2023 12:37

Reduce the size of an array in tests to make miri not run forever

68defc9

Make the TypeInfo match exhaustive

7473b11

Update dependencies

604a1d7

Update benchmarks to criterion 0.4

99f3cdf

Update BitVec benchmarks too

9a95237

Fix the benchmarks for real this time

3cc3cf1

gilescope requested a review from jsdw April 20, 2023 08:36

gilescope reviewed Apr 20, 2023

Use Layout::new instead of Layout::from_size_align

80b7933

Co-authored-by: Squirrel <giles@parity.io>

ggwpez reviewed Apr 20, 2023

michalkucharczyk reviewed Apr 21, 2023

michalkucharczyk previously approved these changes Apr 21, 2023

michalkucharczyk self-requested a review April 21, 2023 11:10

Move blanket decoding of wrapper types into a default impl

8abacbc

koute added 4 commits May 8, 2023 15:23

Make max-encoded-len feature not force-enable `parity-scale-codec-derive`

bc4bda8

Reexport derive macros not when parity-scale-codec-derive is enabled but when `derive` is

c2732a3

Update CHANGELOG again

645d6a6

Fix parity-scale-codec-derive tests compilation

0958c9e

ggwpez reviewed May 8, 2023

koute added 2 commits May 16, 2023 11:33

Merge remote-tracking branch 'origin/master' into master_fix_stack_overflow

4c1db17

Update serde

b47acfd

gilescope mentioned this pull request May 17, 2023

Make AssetId and MultiLocation non-Copy; make Instruction 92% smaller paritytech/polkadot#7236

Open

burdges mentioned this pull request May 18, 2023

CanonicalSerde for Rc and Arc arkworks-rs/algebra#649

Closed

yrong mentioned this pull request Jun 6, 2023

Increase test coverage & Fix for goerli setup Snowfork/snowbridge#853

Merged

davxy approved these changes Jun 9, 2023

skunert approved these changes Jun 9, 2023

michalkucharczyk approved these changes Jun 9, 2023

kpp approved these changes Jun 13, 2023

bkchr approved these changes Jun 14, 2023

koute and others added 11 commits June 15, 2023 17:33

Update comment in src/codec.rs

e44bfc6

Co-authored-by: Bastian Köcher <git@kchr.de>

Update comment in src/codec.rs

55e97ee

Co-authored-by: Roman <r.proskuryakoff@gmail.com>

Merge remote-tracking branch 'origin/master' into master_fix_stack_overflow

73f09b2

Move the version bounds for parity-scale-codec-derive

edc4e78

Remove DecodeContext

1de33ec

Move the array byte size calculations to compile-time

f50f0e3

Add a TODO about Box::assume_init

75c8549

Move DecodeFinished into its own module

4d4fb7c

Add a TODO about MaybeUninit::slice_assume_init_mut

1822891

Update the changelog

afff672

std -> core so that no_std compiles

1dd636a

koute merged commit 0688c04 into paritytech:master Jun 15, 2023

koute mentioned this pull request Jun 28, 2023

Fix stack overflow when decoding big array newtypes wrapped in a Box #462

Merged
