Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revive: Limit the amount of static memory a contract can use #5726

Merged
merged 15 commits into from
Sep 18, 2024
Merged

Conversation

athei
Copy link
Member

@athei athei commented Sep 16, 2024

This will make sure that when uploading new code that the declared static memory fits within a defined limit. We apply different limits to code and data. Reason is that code will consume much more memory per byte once decoded during lazy execution.

This PR:

  1. Remove the MaxCodeLen from the Config to we maintain tight control over it.
  2. Defines a single STATIC_MEMORY_BYTES knob that limits the maximum decoded size.
  3. Enforces them only on upload but not on execution so we can raise them later.
  4. Adapt the worst case calculation in integrity_check.
  5. Bumps the max stack depth from 5 to 10 as this will still fit within our memory envelope.
  6. The memory limit per contract is now a cool 1MB that can be spent on data or code.
  7. Bump PolkaVM for good measure
  8. The blob is limited to 256kb which is just a sanity check to not even try parsing very big inputs.

@athei athei requested a review from koute September 16, 2024 13:30
@athei athei added the R0-silent Changes should not be mentioned in any release notes label Sep 16, 2024
@athei athei requested review from pgherveou and xermicus September 16, 2024 17:22
substrate/frame/revive/src/limits.rs Outdated Show resolved Hide resolved
substrate/frame/revive/src/limits.rs Outdated Show resolved Hide resolved
@athei athei requested a review from xermicus September 17, 2024 15:54
Copy link
Member

@xermicus xermicus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pub fn enforce<T: Config>(blob: Vec<u8>) -> Result<CodeVec, DispatchError> {
fn round_page(n: u64) -> u64 {
debug_assert!(
PAGE_SIZE != 0 && (PAGE_SIZE & (PAGE_SIZE - 1)) == 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you have the next_multiple_of method on primitives you can use instead of this helper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yes. This function panics on overflow, though. Hence, I forced the inputs to be u32 and do the rounding in u64. When we switch to 64bit we probably just error out when the section sizes don't fit into u32.

// plus the overhead of instructions in memory which is derived from the code
// size itself and the number of instruction
let memory_size = (blob.len() as u64)
.saturating_add(round_page(program.ro_data_size() as u64))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. For all of the u32 -> u64 casts you could use u64::from, which I think is nicer as it will give you an error if this cast ever becomes truncating (in PolkaVM I have a clippy lint set up for this).
  2. Alternatively, instead of the saturating_* spam you could use core::num::Saturating wrapper plus a local trait to make it nicer, but this might be a complete overkill.
trait Cast {
    fn cast(self) -> Saturating<u64>;
}
impl Cast for u32 {
    fn cast(self) -> Saturating<u64> { return Saturating(u64::from(self)) }
}
impl Cast for usize {
    fn cast(self) -> Saturating<u64> { return Saturating(self as u64) }
}

let memory_size = blob.len().cast()
    + program.ro_data_size().cast()
    - program.ro_data.len().cast()
    + ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes to the first point. I replaced by into where possible. The code then breaks when we upgrade to 64bit and we need to re-evaluate.
  2. Looking at the saturating_* spam gives me a piece of mind. Seeing raw arithmetic is kinda triggering :)

@@ -56,7 +57,7 @@ use sp_runtime::DispatchError;
#[codec(mel_bound())]
#[scale_info(skip_type_params(T))]
pub struct WasmBlob<T: Config> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the WasmBlob be renamed? (: (Not necessarily in this PR of course.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes there is a lot of renaming to do. I am pushing this of because this will be a nasty PR.

@@ -58,3 +61,85 @@ pub const STORAGE_KEY_BYTES: u32 = 128;
///
/// The buffer will always be disabled for on-chain execution.
pub const DEBUG_BUFFER_BYTES: u32 = 2 * 1024 * 1024;

/// The page size in which PolkaVM should allocate memory chunks.
pub const PAGE_SIZE: u32 = 4 * 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just use VM_MIN_PAGE_SIZE from polkavm-common?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That constant's not for public consumption. (: (In general everything is 'polkavm-common' is private)

Besides, what matters is the page size that's configured, not what the minimum supported is, so this way is how it is actually intended to be used.

.saturating_sub(MAX_STACK_SIZE)
.saturating_div(17 * 4);
.saturating_sub(STATIC_MEMORY_BYTES)
.saturating_div(4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.saturating_div(4);
.saturating_div(EXTRA_OVERHEAD_PER_CODE_BYTE);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to account for memory allocator overhead (see the docs above). But it's a good point: All those numbers should have names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by constants

@athei athei enabled auto-merge September 18, 2024 14:25
@athei athei added this pull request to the merge queue Sep 18, 2024
Merged via the queue into master with commit 310ef5c Sep 18, 2024
206 of 209 checks passed
@athei athei deleted the at/limit_mem branch September 18, 2024 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
R0-silent Changes should not be mentioned in any release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants