Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow default parameters #903

Closed
fulldecent opened this issue Jun 17, 2018 · 22 comments
Closed

Allow default parameters #903

fulldecent opened this issue Jun 17, 2018 · 22 comments
Labels
VIP: Approved VIP Approved VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting

Comments

@fulldecent
Copy link
Contributor

fulldecent commented Jun 17, 2018

This is a reduced scope feature as compared to adding Function Overloading #884.

Introduction

ERC-721 includes two functions with the same name but one includes an extra parameter. The ERC specification states that the less parameter one will function equivalently as if the more parameter one were called with a specific value.

Specification

Allow default parameters at function declaration.

Here is a very relevant example. Includes NatSpec documentation as per #898.

Sorry, not sure if my bytes declaration is correct.

## 
# @notice Transfers the ownership of an NFT from one address to another address
# @dev Throws unless `msg.sender` is the current owner, an authorized
#  operator, or the approved address for this NFT. Throws if `_from` is
#  not the current owner. Throws if `_to` is the zero address. Throws if
#  `_tokenId` is not a valid NFT. When transfer is complete, this function
#  checks if `_to` is a smart contract (code size > 0). If so, it calls
#  `onERC721Received` on `_to` and throws if the return value is not
#  `bytes4(keccak256("onERC721Received(address,uint256,bytes)"))`.
# @param _from The current owner of the NFT
# @param _to The new owner
# @param _tokenId The NFT to transfer
# @param data Additional data with no specified format, sent in call to `_to`
@public
@payable
def safeTransferFrom(from: address, to: address, token_id: uint256, data: bytes = ""):
    ...

The Vyper compiler will process this as two separate function selectors. Actually (1 + number of default parameters) function selectors. This follows the Python convention of allowing parameters to be omitted if a default parameter is specified. This differs from Python in that Python function arguments can be specified out of order.

The Vyper compiler can optimize byte code more efficiently than Solidity currently does because Vyper will have more knowledge about the relationship between safeTransferFrom(a,b,c,d) and safeTransferFrom(a,b,c).

Literature review

Vyper has expressed opinions on function overloading in general:

  • This can cause lots of confusion on which function is called at any given time.
  • It makes the code much harder to search through as you have to keep track on which call refers to which function.
  • Thus it's easier to write missleading code (foo("hello") logs "hello" but foo("hello", "world") steals your funds).

Strictly speaking, a default parameter is a function overload. From the EWM there are two function descriptors, but from Vyper, there is one function. I believe the opinions above apply to allowing arbitrary function overloads, whereas this issue is only a very specific kind of overload.

With this proposed feature, we centralize implementation for both code paths in one function. This solves the issue of tracking which call refers to which function, they are the same function. Also this implementation is much safer than Solidity. In Solidity, as this can cause confusion. The foo() logs but foo(param) steals funds example above does not apply because BOTH are the same function.

Discussion

Honestly I don't feel like this is a compromise. I believe this actively promotes Vyper's goal to improve human-readability of code and prevent misleading code.

Pinging people from function overloading discussion: @maurelian @fubuloubu @jacqueswww @ben-kaufman

@fubuloubu
Copy link
Member

def overloaded_function(a: uint256, b: bytes32=""):
    ...

Creates sha3('overloaded_function(uint256)') which calls sha3('overloaded_function(uint256,bytes32)') with "" as a second arg. Avoids having to define overloads and the possibility of an overloaded function having different functionality.

I like it! Very pythonic solution to the problem!

@fubuloubu
Copy link
Member

Can also specify non-zero default args

@jacqueswww
Copy link
Contributor

Yes, at first glance I like it too. :)

This is not my final opinion, let me mull this over a bit some more (weekend brain).

@jacqueswww jacqueswww mentioned this issue Jun 18, 2018
8 tasks
@jacqueswww jacqueswww added VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting VIP: Approved VIP Approved labels Jun 18, 2018
@jacqueswww
Copy link
Contributor

Marked as Approved.

@pipermerriam
Copy link
Contributor

The Vyper compiler will process this as two separate function selectors. Actually (1 + number of default parameters) function selectors.

I think this is not accurate if we assume keyword arguments are in play. It will be (n * (n + 1)) / 2 additional function selectors. For example,

def overloaded_fn(a: uint256=3, b: uint256=4):
    ...

This would be equivalent to the following three functions f(n) => (2 * (2 + 1)) / 2 => 3

def overloaded_fn(1):  # a=1, b=4
def overloaded_fn(b=1):  # a=3, b=1
def overloaded_fn():  # a=3, b=4
    ...

@jacqueswww
Copy link
Contributor

jacqueswww commented Jun 20, 2018

Yes that is correct, my idea would also be not to allow more than 2 default function parameters (the one I picked), As anything more than 2 would just produce too many combinations. Will think if this is even necessary.

@fubuloubu
Copy link
Member

Restrict to a maximum of 3 default arguments? (6 generated function signatures)

@pipermerriam
Copy link
Contributor

My suggestion which I think was generally agreed upon in chat

  • No upper limit
  • Issue a warning for argument counts >=N (default 7?)
  • clearly document that high numbers of default arguments will cause an exponential size increase in compiled bytecode.

@maurelian
Copy link
Contributor

Out of curiosity, is there any sense of the cost in terms of runtime bytecode for each new selector? Is it only the code required to dispatch and JUMP to the corresponding code segment, or will the bytecode for the function body also need to be duplicated?

@fubuloubu
Copy link
Member

The way we were discussing it, it would be a dispatch statement. So basically, load the stack with the call args and/or defaults, then jump to the code.

The problem is the exponential growth (7 default args produces 28 different calling loaders). At some point it becomes untenable.

@fulldecent
Copy link
Contributor Author

7 default args is 2^7 = 128 loaders if you allow arbitrary order. If you only allow the right-most arguments to be omitted then that is 7+1 = 8 loaders. Both solutions are probably fine because it is the programmer's damn fault if they are designing an external function that way!

Bytecode for the function body does not need to be duplicated.

Also if you are only allowing right-most arguments to be omitted then this allows you to implement in an iterative fashion:

callFunction(1,2,3, defaultA, defaultB) encodes like this:

...
PUSH1
PUSH2
PUSH3
JUMP: callFunction_with_2_defaults:

callFunction_with_2_defaults:
PUSH defaultA
JUMP callFunction_with_1_default:

callFunction_with_1_default:
PUSH defaultB
JUMP callFunction:

callFunction:
Now you have 1,2,3,A,B on stack, ready to go
...

@fubuloubu
Copy link
Member

fubuloubu commented Jun 21, 2018

(7 * (7 + 1)) / 2 = 28 is the right number, because order is important but you don't have to necessarily require right to left ordering of default value settings (e.g. we don't need to disallow myFunction(a=1, b, c=3) if we don't want to)

Regardless, jumping to each signature and loading the corresponding default is a good implemetation suggestion if we require right to left ordering of defaults. I would suggest this be the final implemetation when we have figured out the whole internal call thing.

EDIT: I think you're right, N + 1 is the real answer to the growth of the signature stack. So it's linear, not exponential. We probably don't even need a special case for this, I don't think anyone would legitimately use more than 7 default args

@jacqueswww
Copy link
Contributor

Starting on this.

@jacqueswww
Copy link
Contributor

jacqueswww commented Aug 3, 2018

Observation I have made so far.

Because of the way that method_id gets generated we can not allow two default arguments of the same type, as the sha3 method signature would be the same. I am handling this case, but just something to note.

image

@fulldecent
Copy link
Contributor Author

fulldecent commented Aug 7, 2018

Sure you can! Anonymous default parameters can only be the rightmost parameters. So

fun(int a, int b=5, int c=6) is unambiguous when you call fun(7, 2).

@fubuloubu
Copy link
Member

so, by that logic:
fun(7, 2) (analogous to fun(7, b=2)) calls keccak('fun(int,int)') and fun(7, c=2) calls keccak('fun(int,int)')?

@fulldecent
Copy link
Contributor Author

fulldecent commented Aug 7, 2018

fun(7,2) calls keccak('fun(int256,int256)') and fun(7) calls keccak('fun(int256)').

Three external entry points total.

@fubuloubu
Copy link
Member

fubuloubu commented Aug 7, 2018

So fun(7, c=2) is disallowed? If you want to call with arg c, you would have to fully specify e.g. fun(7, 5, 2)

EDIT: This is where I got the fall-through idea from @jacqueswww

aka this is the reason why the number of function selectors is n + 1 and not (n * (n + 1)) / 2 (where n is the number of args that have defaults)

@fulldecent
Copy link
Contributor Author

Ah. For internal it does not matter. There is no function selector for internal calls.

@jacqueswww
Copy link
Contributor

jacqueswww commented Aug 9, 2018

@fulldecent there still needs to be some unique identifier to jump to though ;) (and until #901 gets implemented still needs the method_id)

So I have implemented the necessary changes, and it's looking good, at this stage it still uses the fun(7, c=3) style, because based on the comments when I started this was the agreed upon approach ie. (2^n function signature variations).

If everyone one is OK with moving to positional variations (n+1) only. I will make the changes.
My vote goes to positional args only, because it make the number of possible variations much less (:+1: for yes, :-1: for no).

Basically:

fun(a: int128, b: int128=1, c: int128=2) will only have the signatures:

fun(int128)  # a = ?, b = 1, c = 2
fun(int128, int128)  # a = ?, b = ?, c = 2
fun(int128, int128, int128)  # a = ?, b = ?, c = ?

Exposed on the ABI.

@fulldecent
Copy link
Contributor Author

The unique identifier is the JUMPDST instruction byte offset!


But yes, I fully agree with your proposal here.

@jacqueswww
Copy link
Contributor

Yes but in the IR, you'd still generate a unique label/symbol that's all I meant ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VIP: Approved VIP Approved VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting
Projects
None yet
Development

No branches or pull requests

5 participants