
Block gas limit calibration #1273

Merged
merged 9 commits into from
Aug 4, 2023
Conversation

@xgreenx (Collaborator) commented Aug 2, 2023

  • Added a new benchmark, block_target_gas, that runs different simple scripts per opcode and measures the block execution time. It lets us track how accurate our gas costs are: if each benchmark takes roughly the same amount of time, the costs are accurate. The current costs are not accurate, because noop takes ~1s for 100M gas, while meq takes ~6s and logd ~23s... Later we can test more opcodes; the end goal is to optimize opcodes and calibrate benchmarks until they have the same execution time.
  • Modified the formula for calculating the dependent cost. Instead of the predicted value, we use the value from the worst-case scenario. We also need to account for the cost of most opcodes rising logarithmically: there is a point after which each new element is relatively cheap. If the cost includes the cases with a small number of elements as a base cost, we can make the cost per element much cheaper. So the initial benchmarks with 10, 1000, and 10000 elements can be folded into the base cost to decrease the cost of each new element. The predicted value from linear regression is used to find this point.

@xgreenx xgreenx force-pushed the feature/gas-calibration branch from 7117efc to a11a780 Compare August 3, 2023 11:54
@xgreenx xgreenx self-assigned this Aug 3, 2023
@xgreenx xgreenx requested a review from a team August 3, 2023 22:11
@xgreenx xgreenx marked this pull request as ready for review August 3, 2023 22:12
@xgreenx xgreenx requested a review from Voxelot August 3, 2023 22:23
@@ -596,7 +596,108 @@ fn slope(x_y: Vec<(u64, u64)>) -> f64 {
.map(|(x, y)| (*x as f64 - avg_x) * (*y as f64 - avg_y))
.sum();
let sq_x: f64 = x_y.iter().map(|(x, _)| (*x as f64 - avg_x).powi(2)).sum();
-    sum_x_y / sq_x
+    sq_x / sum_x_y
Member

I don't think this is correct for the least squares method of linear regression, it should be sum_x_y / sq_x:
[image: least-squares slope formula]
https://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line

Collaborator Author

@xgreenx xgreenx Aug 4, 2023

Yes, but we need the value for y, not x. And it is the same behavior as before (after slope we divided: 1 / slope(x_y)).

Member

@Voxelot Voxelot Aug 4, 2023

Or is it inverted because our dep_per_unit is also an inverted slope? Might be worth putting a comment explaining why this deviates from the standard formula.

edit: ok yeah I see this is just a simplification of the original step here
[image: simplification of the original slope step]

Collaborator Author

Added a comment=)

// y = B * x
//   =>
// 1 / B = x / y
Member

[nit] could add a transformation to both sides of the equation like (rhs)^-1 = (lhs)^-1

let last = x_y.last().unwrap();

let near_linear = linear_regression * NEAR_LINEAR;
let expected_type = if (linear_regression - first.amount()).abs() < near_linear
Member

A more sound approach for testing the goodness-of-fit using linear regression would be the $R^2$ metric, or coefficient of determination. Where 1 represents a perfect fit, and lower values indicate a poor fit of the linear model.

[image: coefficient of determination ($R^2$) formula]

Collaborator Author

We could use it here, but we don't have cases of linear behavior, so the super-complex formula is overkill. Because each opcode includes the base execution time, the curve looks logarithmic in almost all cases.

Plus, we would still need to tune the threshold for R. The code for logarithmic behavior should cover almost all cases, and the min strategy means we can only overcharge the user. If later, while benchmarking against the target block time, we see that some opcodes are really overcharged, we can try to improve the algorithm. But right now, it looks like our benchmarks produce undercharged costs.

}
})
.map(|p| p.amount())
.min_by(|a, b| a.partial_cmp(b).unwrap())
Member

Wouldn't this potentially leave a range of inputs somewhere between the base and the min(p.amount()) where we undercharge for execution time? would it be safer to just take the slope off the base, even if it overcharges?

Member

also nit - isn't this a more complicated expression of simply using .min() here?

Collaborator Author

Floats don't support min, so you need to use partial_cmp =(

Remember, a lower value of amount means we charge the user more, because of args / dep_per_unit. So we will only overcharge the user, not undercharge.

base = *p;
true
} else {
false
Member

@Voxelot Voxelot Aug 4, 2023

Should we return some kind of error if we never find a base using the slope comparison? Otherwise it will just default to the first item, which could be a very small base fee and be very inadequate for charging users.

For example, if no base is found and the first point is x = 10, and we then use the lowest amount from the high end of the points, like x = 1,000,000, this could severely undercharge.

Collaborator Author

It shouldn't be possible. Our function is linear if the initial value is around the predicted value. So the initial price is already fair enough in this case.

@xgreenx xgreenx merged commit 2ec0094 into master Aug 4, 2023
@xgreenx xgreenx deleted the feature/gas-calibration branch August 4, 2023 20:50