fix: update cost estimator with contract compilation cost computations #4231
Conversation
olonho
commented
Apr 16, 2021
- Implemented contract precompilation cost computation framework
- Use contract precompilation measurements in cost computations
The above costs are with Wasmer1 and precompilation; now recomputing the baseline with Wasmer0 and no precompilation on the same data/revision.
Full difference between the two (first - before Wasmer 1.0, second - after):
The significant differences (>15%) are the following:
// File 341191, code 279965, data 56627.
let data = include_bytes!("../test-contract/res/lockup_contract.wasm");
let contract = ContractCode::new(data.to_vec(), None);
xs.push(data.len() as u64);
Rename `data` to `bytes` or `file` to avoid confusion with the "data section" mentioned in the comment?
// We multiply `b` by 5/4 to accommodate the fact that test contracts are typically 80% code,
// so in the worst case the code section could grow to 100% and our costs still estimate it properly.
// (a, b * Ratio::new(5i128, 4i128))
I think the above math formula is correct; however, it doesn't make sense to do a least-squares fit on a varying proportion of code section vs. file size.
Assume you test with
x1 = file size of file1.wasm, which has 90% code section, y1 = cost
x2 = file size of file2.wasm, which has 60% code section, y2 = cost
...
Then what does y = mx + b mean? It would give erroneous estimations on both the 90%-code and the 60%-code contract, and the estimate doesn't give you an upper bound. So I suggest doing two least-squares fits (sketched below):
x1 = code size of file1.wasm, y1 = cost1
x2 = code size of file2.wasm, y2 = cost2
...
You get cost = m1 * code_size + b1.
Then do least squares on xi = data size of file_i.wasm, yi = cost_i, and get cost = m2 * data_size + b2.
Then cost = max(m1,m2)*file_size + max(b1,b2) should give an upper bound on the cost, no matter what portion of the file is code or data.
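A minimal sketch of that combination, in plain f64 least squares with made-up measurements (not the Ratio-based code in this PR):

```rust
// Sketch of the two-fit upper bound: fit cost against code size and data size
// separately, then combine slopes and intercepts with `max` so the bound
// holds no matter how a contract splits between code and data.
// Hypothetical numbers, plain f64 math for illustration only.

/// Ordinary least squares for y = m * x + b; returns (m, b).
fn fit(xs: &[f64], ys: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let sum_x: f64 = xs.iter().sum();
    let sum_y: f64 = ys.iter().sum();
    let sum_xy: f64 = xs.iter().zip(ys).map(|(x, y)| x * y).sum();
    let sum_xx: f64 = xs.iter().map(|x| x * x).sum();
    let m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x);
    let b = (sum_y - m * sum_x) / n;
    (m, b)
}

fn main() {
    // Hypothetical samples: (code size, data size, measured compilation cost).
    let samples = [
        (280_000.0, 56_000.0, 90.0e9),
        (204_000.0, 50_000.0, 70.0e9),
        (150_000.0, 30_000.0, 52.0e9),
        (100_000.0, 40_000.0, 40.0e9),
    ];
    let code: Vec<f64> = samples.iter().map(|s| s.0).collect();
    let data: Vec<f64> = samples.iter().map(|s| s.1).collect();
    let cost: Vec<f64> = samples.iter().map(|s| s.2).collect();

    let (m1, b1) = fit(&code, &cost); // cost ~ m1 * code_size + b1
    let (m2, b2) = fit(&data, &cost); // cost ~ m2 * data_size + b2

    // Upper bound for an arbitrary contract of `file_size` bytes.
    let file_size = 341_191.0;
    let bound = m1.max(m2) * file_size + b1.max(b2);
    println!("upper bound cost: {:.0}", bound);
}
```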
Computed costs in pure instructions for Wasmer0 and Wasmer1:
Cost in gas will thus be (*1M / 8).
It is pretty high, as it accounts for both compilation and storage.
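If I read the `*1M / 8` shorthand right (my reading, not a number taken from the estimator), the conversion is:

```latex
\mathrm{gas} = \mathrm{instructions} \times \frac{10^{6}}{8},
\qquad \text{e.g. } 8000 \text{ instructions} \mapsto 10^{9}\ \mathrm{gas} = 1\ \mathrm{Ggas}.
```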
When the backing cache is replaced with a no-op, the results are the following:
xs.push(data.len() as u64);
ys.push(measure_contract(vm_kind, gas_metric, &contract, cache));

// Least squares method.
Hey, I did exactly this a couple of weeks ago: https://github.com/rust-analyzer/rust-analyzer/pull/8384/files#diff-e55167e9a98e1977d285bd95def5fe1bb0cc43749ab559f854900f69b1b7cb42R312 🤣
// Compute error estimation.
let mut error = 0i128;
for i in 0..n {
    let expect = (a + b * (xs[i] as i128)).to_integer();
I think this is not quite correct -- we just forget about the fractional part, but it obviously affects the error. What is the reason for doing the calculations in integers? I think linear regression fundamentally operates with fractions, so we should either do all the math in rationals, or in floats.
I can see merit in sticking to precise rationals in principle, but I think floats would work better in practice in this context? At least the error calculation should be done in floats.
Thinking more about this, rationals do make sense here; the error calculation is the only place where we lose precision, so it's probably better to change just that.
Error vector is computed in integers.
    let diff = expect - (ys[i] as i128);
    error = error + diff * diff;
}
println!("Error {}", (error as f64).sqrt() / (n as f64));
Hm, I think this should be `(error as f64 / n as f64).sqrt()`. RMSE is the root of the MSE.
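For concreteness, a sketch of the suggested fix (`error` being the accumulated sum of squared residuals, as in the snippet above):

```rust
/// Root-mean-square error: take the mean of the squared residuals first,
/// then the square root -- not sqrt(sum) / n.
fn rmse(sum_squared_residuals: i128, n: usize) -> f64 {
    (sum_squared_residuals as f64 / n as f64).sqrt()
}
```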
let n = xs.len();
let n128 = n as i128;

let mut sum_prod = 0 as i128; // Sum of x * y.
for i in 0..n {
    sum_prod = sum_prod + (xs[i] as i128) * (ys[i] as i128);
}

let mut sum_x = 0 as i128; // Sum of x.
for i in 0..n {
    sum_x = sum_x + (xs[i] as i128);
}

let mut sum_y = 0 as i128; // Sum of y.
for i in 0..n {
    sum_y = sum_y + (ys[i] as i128);
}

let mut sum_x_square = 0 as i128; // Sum of x^2.
for i in 0..n {
    sum_x_square = sum_x_square + (xs[i] as i128) * (xs[i] as i128);
}

let b = Ratio::new(n128 * sum_prod - sum_x * sum_y, n128 * sum_x_square - sum_x * sum_x);
let a = Ratio::new(sum_y * b.denom() - b.numer() * sum_x, n128 * b.denom());
I use a differently stated formula in my version, and I pen&paper confirmed that they are equivalent.
Yeah, this one just has the denom multiplied into both.
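For reference, the closed form this code implements, together with the more common statement of the intercept it is equivalent to:

```latex
b = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i x_i^{2} - \left(\sum_i x_i\right)^{2}},
\qquad
a = \frac{\sum_i y_i - b\sum_i x_i}{n} = \bar{y} - b\,\bar{x}.
```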
let contract = ContractCode::new(data.to_vec(), None);
xs.push(data.len() as u64);
ys.push(measure_contract(vm_kind, gas_metric, &contract, cache));
Nit: I'd split out `fn least_squares(xs: &[], ys: &[]) -> (Ratio, Ratio)` to keep the pure math independent of the wasm stuff. This will also allow us to write a unit test that, if `least_squares` is fed a linear function, it exactly recovers the coefficients with zero error. The math looks right, but it's fiddly; a test would be a nice assurance.
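Something along these lines, as a sketch only, assuming the split-out helper ends up as `least_squares(xs: &[u64], ys: &[u64]) -> (Ratio<i128>, Ratio<i128>)` returning `(intercept, slope)` like the current `(a, b)`:

```rust
#[cfg(test)]
mod tests {
    use super::least_squares; // assumed name/signature of the split-out helper
    use num_rational::Ratio;

    #[test]
    fn least_squares_recovers_exact_linear_function() {
        // Points lying exactly on y = 3x + 7: rational arithmetic should
        // recover the coefficients exactly, with zero error.
        let xs: Vec<u64> = vec![1, 2, 5, 10, 100, 1000];
        let ys: Vec<u64> = xs.iter().map(|x| 3 * x + 7).collect();
        let (a, b) = least_squares(&xs, &ys);
        assert_eq!(a, Ratio::from_integer(7i128)); // intercept
        assert_eq!(b, Ratio::from_integer(3i128)); // slope
    }
}
```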
ys.push(measure_contract(vm_kind, gas_metric, &contract, cache));

// File 257516, code 203545, data 50419.
let data = include_bytes!("../test-contract/res/staking_pool.wasm");
Is it the right approach to use "real world" contracts for estimation? My gut feeling is that it probably is not.
We should be estimating the worst case, so we should prepare a synthetic contract that exercises this case. We should use real contracts to double-check that the synthetic worst case is at least not better than real-world code.
However, just using real-world contracts feels like saying "look, we don't know what the worst case is, but we assume that it's within a stone's throw of the typical case, so let's just measure that".
I think the max(m1,m2) approach will give you an upper bound even for a contract with an unknown code portion; the proof is:
let cost = max(m1,m2)*file_size + max(b1,b2)
then cost >= m1*file_size + b1 >= m1*code_size + b1 = cost estimate based on code size
and cost >= m2*file_size + b2 >= m2*data_size + b2 = cost estimate based on data size
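Restated with the assumptions spelled out (code_size, data_size <= file_size, and non-negative slopes m1, m2):

```latex
\mathrm{cost} = \max(m_1, m_2)\,\mathrm{file\_size} + \max(b_1, b_2)
\;\ge\; m_1\,\mathrm{file\_size} + b_1 \;\ge\; m_1\,\mathrm{code\_size} + b_1, \\
\mathrm{cost} \;\ge\; m_2\,\mathrm{file\_size} + b_2 \;\ge\; m_2\,\mathrm{data\_size} + b_2.
```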
Looks like the significantly increased compile base cost of wasmer1 is a problem.
Oh, we were running non-optimized builds; re-tested with optimized versions (empty cache):
Which translates to
With storage cache:
which transforms to
@olonho does the new function call per byte cost include deserialization cost? It seems too cheap.
Updated tests with validation logic ensuring that contracts can be properly upper-bounded in cost by values obtained from measurements on other contracts. Also ran the actual cost estimation:
So the costs we suggest are:
@ailisp @matklad @olonho do these numbers mean that Wasmer 1 is not faster than Wasmer 0?
Is my understanding correct that this PR does not actually change the cost?
@nearmax Yes, but it's important to understand what this cost is about. In a sense, a comparison between wasm versions is misleading in this case. This is the cost of compiling the contract. At the moment, we don't have this cost at all. So the actual diff here is going from not charging this cost at all to starting to charge it. The compilation cost being similar for Wasmer 0 and Wasmer 1 makes sense.
Updated costs table after rebase:
@olonho function call execution cost is 100Ggas, which means we can execute 10000 no-op function calls per block. I thought reloading compiled contracts and deserialization would not allow us to do that many function calls.
The difference is now:
@olonho The reason why I have doubts about this is that according to @ailisp it would take us a couple of milliseconds to load and deserialize the compiled contracts, which means that we cannot fit 10k noop calls in one block.
@olonho also #4231 (comment) looks suspicious because the current function call cost is
which is an order of magnitude larger than the one you posted (wasmer0_noprecomp).
OK, as it's rather important, I will create dedicated code similar to what we have for contract precompilation costs and report back with the results.
Implemented (ddb4dbd) a very explicit test allowing us to measure time and icount metrics on many small distinct contracts. Results are as follows:
I.e. when precompiled, the invocation cost for a very small contract is 0.097 Tgas if counted as time on my machine, and 0.037 Tgas, i.e. about 3x smaller, when counted as instructions. When measuring 10K contracts, the following IO statistics apply:
I.e. we read about 8MB of data.
What's the reason for measuring only small contracts? In practice most of the deployed contracts are pretty large.
Fully makes sense; we can use larger contracts as well, it was just easier to write the test with smaller contracts.
Also counted instruction counts and wall clock time with an independent tool, Linux perf:
Our current cost model assumes a CPU capable of executing 8 instructions per nanosecond; experiments with perf show that on high-end modern CPUs (Intel 4.2GHz) it's about 10.2 instructions per nanosecond.
On lower-end CPUs we could probably recalibrate our gas multiplier to assume 2-3 instructions per nanosecond by decreasing the divisor in
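To make the effect of that divisor concrete (my reading of the calibration, assuming 1 ns of CPU time corresponds to 10^6 gas, as in the `*1M / 8` conversion above):

```latex
\mathrm{gas\ per\ instruction} = \frac{10^{6}}{\mathrm{instructions\ per\ ns}}:
\qquad \frac{10^{6}}{8} = 1.25\times 10^{5}
\;\longrightarrow\; \frac{10^{6}}{2} = 5\times 10^{5},
```
i.e. roughly 4x more gas charged per instruction.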
The cloud instances we and validators use to run mainnet nodes are usually low end (Google Cloud and AWS give you N vCPUs, so they optimize for N by mostly using 5-10 year old CPUs :) ). Did you run the instructions/nanosecond check on our cloud instances?
Per the very end of the table in https://en.wikipedia.org/wiki/Instructions_per_second one can see
Nope; actually just running perf stat on some test, i.e. one of the contract call tests above, should be sufficient. Could you please help with that?
@@ -2434,6 +2434,7 @@ impl<'a> VMLogic<'a> {
    }
}

// TODO: remove, as those costs are incorrectly computed, and we shall account it on deployment.
can we create an issue (or link to an existing one) for this?
docker> cargo run --release --package neard --features protocol_feature_evm,nightly_protocol_features --bin neard -- --home /tmp/data init --chain-id= --test-seed=alice.near --account-id=test.near --fast
docker> cargo run --release --package genesis-populate --bin genesis-populate -- --additional-accounts-num=200000 --home /tmp/data
docker> cargo run -j2 --release --package neard --features protocol_feature_evm --bin neard -- --home /tmp/data init --chain-id= --test-seed=alice.near --account-id=test.near --fast
docker> cargo run -j2 --release --package genesis-populate --features protocol_feature_evm --bin genesis-populate -- --additional-accounts-num=200000 --home /tmp/data
Why do we need `protocol_feature_evm`?
We do estimate EVM costs, and it was there before.
LGTM, any specific reason behind the 5/4 constant?
In the contracts above, code takes 80% of the Wasm size, so we multiply by 1/0.8 = 5/4 to cover cases where 100% of the contract is code.