Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Seal: Scratch away the scratch buffer (Part 1) #5513

Closed
Robbepop opened this issue Apr 3, 2020 · 6 comments · Fixed by #6573
Closed

Seal: Scratch away the scratch buffer (Part 1) #5513

Robbepop opened this issue Apr 3, 2020 · 6 comments · Fixed by #6573
Assignees
Labels
I7-refactor Code needs refactoring.

Comments

@Robbepop
Copy link
Contributor

Robbepop commented Apr 3, 2020

Background

This issue introduces an incremental redesign of parts of the interface between the contracts pallet and WebAssembly smart contracts.

A similar discussion has already been started some time ago but in the end was not targeted enough and not chunked up into actionable work items. This issue is the beginning of a potential series of issues that chunk up the massive work needed to get the interface into an efficient and nice shape.

Motivation

The current problem with certain parts of the interface is two fold.

  1. Since we are operating between boundaries of the WebAssembly sandbox and the native host runtime calls between those are generally regarded as expensive. This leads to the design goal of an interface that reduces the number of calls between client (contract) and host (contracts pallet).
  2. The current interface has explicit notions of a so-called scratch buffer. This is a linear memory buffer on the contracts pallet side and mostly acts as a cache in order to make certain common operations less expensive by memoization. Also it literally is a scratch buffer to store intermediate compute information.
    • The problem some people are having with this approach is that the scratch buffer is primarily an implementation detail since a similar interface could be implement with multiple scratch buffer (or registers) as shown by the NEAR protocol interface or arguably without the explicit notion of a scratch buffer as represented in this issue in the following.
    • Besides this another bigger interface problem with an explicit scratch buffer is that the interface constraints itself to shared mutable state between different interface methods leading to implicit behavior that might lead to confusion and arcane bugs and behavior.
  3. While introducing the ext_ hashes we have already taken an approach of avoiding the scratch buffer in their interfaces. So going down this path for other APIs does not introduce another unknown API style.

Example

Current Interface

Let us look at an example of the current API, line out the problems with it and propose another interface that fixes those outlined problems with the same example.

We are going to take a look at a common invocation of the ext_caller API.
This extracts information about the AccountId of the caller of the currently running smart contract execution.
The procedure is as follows:

  1. Call ext_caller to make the host side put the information into the scratch buffer.
  2. Query the scratch buffer size (in bytes) via ext_scratch_size.
  3. Finally read the bytes from the scratch buffer into contract memory via ext_scratch_read.

In code this looks similar to this:

let mut buffer = <Vec<u8>>::new();
ext_caller();
let req_len = ext_scratch_size();
buffer.resize(req_len, 0x00);
ext_scratch_read(buffer.as_ptr(), req_len);

This incorporate 3 mandatory calls between contract and pallet.
Also it has the huge disadvantage of the shared mutable state on the pallet side which is hidden from the contract and would cause trouble if there was a call to another ext_ function somewhere in between for example.

Proposed Interface

This issue proposes to change the API to no longer require 3 calls but instead a single call between contract and pallet. The API for ext_caller looks like the following:

fn ext_caller(output_ptr: u32, output_len: u32, output_req_len: u32) -> u32;

Where (output_ptr, output_len) represents a contract-side buffer where the result of ext_caller is to be stored if output_len suffices the requirements and output_req_len is an out-parameter to be set by the pallet that tells the contract side how many bytes have actually been used by output since output_len could in theory be equal to or greater than the actually required bytes.
Returns 0 upon success or 1 if output_len is too small to hold the result.
In this case the contract side needs to handle the error e.g. by increasing the buffer size and do the call again. This second call is obviously "bad" however, given that entities such as ink! or Solang are aware of the issues this case would potentially never really happen since the output buffer would simply be big enough for all use cases. So the API optimizes for the happy path.

Let us look how the call to this new ext_caller API would look like:

// We simply instantiate a big-enough buffer up front.
// This is how ink! already handles all calls because it allows to completely
// drop dynamic memory allocation from the contracts runtime and saves
// tons of bytes per contract.
// For example ink! in reality uses a one-size-fits all 16kB buffer.
let mut output = [0x00_u8; 100];
let mut req_len = 0;
if ext_caller(output.as_ptr(), output.len(), &mut req_len) != 0 {
     // Handle very uncommon error: for example trap or try with bigger buffer.
     // Since it is so unlikely to happen, ink! would simply bail out (trap).
}

Note that in the worst case exactly two calls to ext_caller are issued: If the first call provided an output buffer that was too small the contract still receives the required length of the output buffer via the req_len output-parameter back and can adjust its output buffer size.

Alternative Design

A slight alteration of the proposed design is to merge output_len input parameter and output_req_len output parameter:

fn ext_caller(output_ptr: u32, output_len_ptr: u32) -> u32;

This has the advantage of getting rid of a single parameter but has the downside of making the calling process a bit harder and callers have to take extra care not to accidentally mutate state of their internal data structures.

Proposal

This issue proposes to apply the above presented API change not only to ext_caller but to all property-like APIs currently existing in the contracts pallet interface for smart contracts.
This includes but is not limited to:

  • ext_caller
  • ext_address
  • ext_gas_price
  • ext_gas_left
  • ext_balance
  • ext_value_transferred
  • ext_now
  • ext_minimum_balance
  • ext_tombstone_deposit
  • ext_rent_allowance
  • ext_block_number

Future Work

In the future we are trying to apply scratch buffer avoiding interfaces to other APIs in the contracts pallet. We could do this for the majority or even all APIs. However, we have to solve the problem of potentially expensive calls such as ext_call that might explode in costs (gas costs) if called multiple times in case the given output buffer was too small.

The ultimate goal is to completely get rid of the explicit notion of a scratch buffer in the interfaces. However, due to open questions as the one stated above we opted in to make these changes incremental instead of bulk.

@seanyoung
Copy link

Totally makes sense. Very nice.

@athei
Copy link
Member

athei commented Apr 3, 2020

I am in strong favor of this proposal. The scratch buffer abstraction is neither easy to use nor does it provide performance benefits in those cases. It is the worst of both worlds.

In this spirit I am in favor of the approach with the in/out argument which saves one parameter. At this level it should not be about what is most easy to use but what is the simplest solution that maintains minimal overhead. In this sense the additional argument is redundant in that it only provides some protection against footgunning. This should be provided by a layer on top this API (ink!, solang).

@seanyoung
Copy link

When ext_instantiate succeeds, the address is returned in the scratch buffer. The address could be returned in a buffer pointed to in the ext_instantiate call.

This also reduces the code and calls needed.

@Robbepop
Copy link
Contributor Author

Robbepop commented May 12, 2020

When ext_instantiate succeeds, the address is returned in the scratch buffer. The address could be returned in a buffer pointed to in the ext_instantiate call.

This also reduces the code and calls needed.

The problem with calls like ext_instantiate is that if the buffer was too small and the contracts pallet thus wasn't able to write the result account ID into that buffer the caller has to do yet another costly instantiation call. In the other case where the instantiating contract silently ignores that the buffer was too small and finishes execution: How would the contracts pallet react to that? Would we have a dangling contract?

While I believe that we can also find solutions for costly calls such as ext_instantiate or ext_call their complicated nature and semantics makes design decisions around them for this topic a bit more complicated.

However, I think we should first get this part 1 of "scratching away the scratch buffer" done and then collect thoughts and ideas in a part 2.

@seanyoung
Copy link

You're assuming that account ID can be of a non-fixed length type? Is this already the case, or is this for future-proofing the interface?

@Robbepop
Copy link
Contributor Author

You're assuming that account ID can be of a non-fixed length type? Is this already the case, or is this for future-proofing the interface?

Yes, as with other EnvTypes the AccountId type can be specified by the custom chain to be whatever they want it to be as long as it maintains certain ground rules. But the encoding/decoding size definitely can vary.

@athei athei self-assigned this Jun 10, 2020
@athei athei added the I7-refactor Code needs refactoring. label Jun 12, 2020
@athei athei mentioned this issue Jun 15, 2020
3 tasks
@athei athei closed this as completed Jun 15, 2020
@athei athei reopened this Jun 15, 2020
@athei athei changed the title Contracts: Scratch away the scratch buffer (Part 1) SEAL: Scratch away the scratch buffer (Part 1) Jun 16, 2020
@athei athei changed the title SEAL: Scratch away the scratch buffer (Part 1) Seal: Scratch away the scratch buffer (Part 1) Jun 16, 2020
@ghost ghost closed this as completed in #6573 Jul 9, 2020
@athei athei moved this to Done in Smart Contracts Jul 27, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I7-refactor Code needs refactoring.
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

4 participants