Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Review: Smart contract VM 01/30/19 #908

Merged
merged 117 commits into from
Feb 7, 2019
Merged

Conversation

kantai
Copy link
Member

@kantai kantai commented Jan 30, 2019

This is the first code review submission for the smart contract VM:

This code implements a basic LISP language, with support for ints, bools, tuples, and lists, for now. Recursion in this language is illegal. There is limited run-time type enforcement -- lists must all be single typed, tuples are strongly typed when used with data-map functions, native functions that expect ints, bools, etc. check their argument types.

At the moment, data-map functions like fetch-entry!, etc., all operate within a global context, set by the eval_all function. You can see test programs in the tests/ directory.

kantai and others added 30 commits January 10, 2019 10:31
…t. use a global context, which will work for our limited lisp dialect
… serializers for Secp256k1PublicKey. Also, update new location of BitcoinNetworkType
… trait to instantiate history rows from individual burn chain operations; implement helpers to insert and select rows of burn chain operations and convert them to and from database rows; implement basic DB tests
…o describe the order in which their column fields should be SELECT'ed
Ok(())
}

fn insert_entry(&mut self, key: Value, value: Value) -> InterpreterResult {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider putting an upper bound on how many bytes an inserted entry can be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather enforce it at the type level, as in, we should have a maximum byte limit for types, and the database should allow any entry that is a legal type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed re: making Value have a maximum byte size.

binary_comparison(args, &|x, y| x < y)
}

pub fn native_add(args: &[Value]) -> InterpreterResult {
Copy link
Member

@jcnelson jcnelson Feb 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if args has length 0 or 1? Also, will args ever have length > 2? Same question for native_sub, native_mul, and native_div.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(+ 1 2 3) evaluates to 6.
(+) returns 0

For native_sub (-) errors, (- 1) returns the negation of it's single argument, (- 1 2 3) return -4.

Multiply and divide behave similarly ((*) returns 1, (/) is an error)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it 👍

}
}).collect();

let names = coerced_atoms?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the ? operator on its own line? Just curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type inference didn't want to work otherwise -- if I move the ? operator up and eliminate the Result<> wrapper:

 --> src/functions/define.rs:31:46
   |
31 |             arguments: arg_names.iter().map(|x| (*x).clone()).collect(),
   |                                              ^ consider giving this closure parameter a type
   |
   = note: type must be known at this point

"Illegal operation: attempted to re-define a value type.".to_string())),
NamedParameter(ref _value) => Err(Error::InvalidArguments(
"Illegal operation: attempted to re-define a named parameter.".to_string())),
List(ref function_signature) => handle_define_function(&function_signature, &elements[2])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the user allowed to define functions that have reserved names?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, it will allow it, though they'll never get called, because lookup_function tries the reserved names first, but it should be an error on define

Err(Error::InvalidArguments("(define ...) requires 2 arguments".to_string()))
} else {
match elements[1] {
Atom(ref variable) => handle_define_variable(variable, &elements[2], env),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the user allowed to define atoms that collide with reserved names?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above -- we need to enforce name legality at define and let -- #914

use super::super::representations::SymbolicExpression::{AtomValue};
use super::super::{Context,Environment,eval,apply,lookup_function};

pub fn list_cons(args: &[Value]) -> InterpreterResult {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an upper bound on how long a list can be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, no, but I think we should let the "maximum value size" enforce this (as in, if we have a maximum value size of 1000 bytes, then you cannot have a list of longer than 1000/16 integers, e.g.)

fn lookup_variable(name: &str, context: &Context, env: &Environment) -> InterpreterResult {
// first off, are we talking about a constant?
if name.starts_with(char::is_numeric) {
match i128::from_str_radix(name, 10) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support alternative radixes? Like, say, base-16?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe -- I was planning on lexing hexstrings into buffers, rather than ints. We could support both. Will create an issue for it.

}
}

pub fn lookup_function<'a> (name: &str, env: &Environment)-> Result<CallableType<'a>, Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this code, it looks like the user will not be able to change the runtime behavior of a reserved function by creating a function with the same name. However, I didn't see any code that prevents the user from doing so?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right -- reserved functions are used first, before a user defined function would be. However, we would want to generate a runtime error at the point of the (define) (and also, when the static analyzer is implemented, a static error). I started an issue #914.

*/
pub fn eval_all(expressions: &[SymbolicExpression],
contract_db: Option<Box<database::ContractDatabase>>) -> InterpreterResult {
let db_instance = match contract_db {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels weird to me. I think the caller should always supply a contract DB.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll change that -- I'll move "blank db instantiation" to the execute function, the intent of which is just running a program in a single, transient smart contract context. That's not just for our testing purposes, but as a simple path for a developer wanting to test a script.

result.push(value);
}
},
_ => {
Copy link
Member

@jcnelson jcnelson Feb 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to accept non-printable characters with this match arm? Or should we just reject those as parse errors? I feel like our smart contract VM shouldn't accept non-printable characters (and should probably not use unicode) in order to ensure that the source code is unambiguous and doesn't have any homoglyph attacks or nefarious VT100 control codes embedded within it (which could manifest if you cat'ed the smart contract code to stdout in a terminal, for example).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, definitely should cause parse errors. The next code review will have a lexer that enforces that. It'll be a much more traditional munch-lexer.

Value::Tuple(_a) => Err(Error::InvalidArguments("Cannot construct list of tuple types".to_string())),
_ => {
let mut base_type = TypeSignature::type_of(x);
base_type.dimension += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a checked_add? What happens if the dimension overflows?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good catch, thanks

Ok(TypeSignature::new(atom_type, dimension))
}

pub fn parse_type_str(x: &str) -> Result<TypeSignature, Error> {
Copy link
Member

@jcnelson jcnelson Feb 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long can this string get? Asking because if we're going to split it, we might end up doing O(poly(n)) work for n occurrences of -. We might want to first count up the number of - occurrences first, and if it's greater than 4, error out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't split only ever do O(n) work ?

Copy link
Member

@jcnelson jcnelson Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the allocator implementation, it could do O(n) malloc()s, which each could take O(log m) time (for m blocks allocated). Not sure what the actual implementation does, but I'm of the school of thought that we should prepare for the worst -- especially since the attacker controls the input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, yes, avoiding non-constant object instantions is a good idea. Will just switch to using splitn here.

@jcnelson
Copy link
Member

jcnelson commented Feb 6, 2019

This looks really cool! This looks good to me to merge to develop, with one small change: could you put the code and tests under src/core/vm/, instead of blockstack-vm? This way it will be compiled into the blockstack-core binary. Then, I can add a CLI option to blockstack-core to load and run a script using state in an on-disk database (so we can get comfortable testing things out).

@kantai
Copy link
Member Author

kantai commented Feb 6, 2019

Yep, sounds good to me @jcnelson

@codecov-io
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (develop@85500cc). Click here to learn what that means.
The diff coverage is 61.51%.

Impacted file tree graph

@@            Coverage Diff             @@
##             develop     #908   +/-   ##
==========================================
  Coverage           ?   64.78%           
==========================================
  Files              ?       44           
  Lines              ?     5637           
  Branches           ?        0           
==========================================
  Hits               ?     3652           
  Misses             ?     1985           
  Partials           ?        0
Impacted Files Coverage Δ
src/util/macros.rs 67% <ø> (ø)
src/chainstate/burn/mod.rs 0% <0%> (ø)
src/burnchains/bitcoin/network.rs 0% <0%> (ø)
src/burnchains/bitcoin/indexer.rs 0% <0%> (ø)
src/core/mod.rs 0% <0%> (ø)
src/burnchains/bitcoin/spv.rs 0% <0%> (ø)
src/util/mod.rs 0% <0%> (ø)
src/main.rs 2.63% <0%> (ø)
src/burnchains/bitcoin/rpc.rs 0% <0%> (ø)
src/chainstate/burn/operations/mod.rs 0% <0%> (ø)
... and 33 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 85500cc...f5a019c. Read the comment docs.

@kantai
Copy link
Member Author

kantai commented Feb 7, 2019

Going forward, can we commit to staying on the stable rust compilers? It's hard for me to keep up with nightly, and I don't think it's a good idea to rely on unstable features in general.

I ripped checked_pow's implementation over from rust's code (https://github.com/milesand/rust/blob/master/src/libcore/num/mod.rs#L857), so we won't need to use nightly anymore. That's actually prepared for 1.34 (not 1.33), which means it's two releases away, which seems like too long for us to be using nightly/beta. Will mark an issue for us to remember to remove the code once 1.34 releases.

@kantai
Copy link
Member Author

kantai commented Feb 7, 2019

Cool -- I'm going to merge this. I think this is also going to merge PR #910.

@kantai kantai merged commit a74635a into develop Feb 7, 2019
@kantai kantai deleted the review/smart-contract-013019 branch January 27, 2021 23:12
@blockstack-devops
Copy link
Contributor

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Nov 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants