Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce gas weights to be specified alongside parameters #8264

Closed
Tracked by #8032
jakmeier opened this issue Dec 21, 2022 · 8 comments
Closed
Tracked by #8032

Introduce gas weights to be specified alongside parameters #8264

jakmeier opened this issue Dec 21, 2022 · 8 comments
Assignees
Labels
A-params-estimator Area: runtime params estimator A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) T-contract-runtime Team: issues relevant to the contract runtime team

Comments

@jakmeier
Copy link
Contributor

jakmeier commented Dec 21, 2022

Status Quo

For parameter weights, we need to somehow define what these are. The simplest solution seems to assume by default all weights are 1 and we only explicitly define weights that are not 1.

How we use parameters is described briefly here: https://github.com/near/nearcore/tree/master/core/primitives/res

A more detailed explanation can be found here: https://near.github.io/nearcore/architecture/gas/parameter_definition.html

Parameters are currently defined using (almost) the format described in option 1 here: #6710

It's a simple custom format. On the parsing side, there are a few quirks to consider, however.

  • Parsing is done to a list of String -> JSON mappings, stored in ParameterTable.
  • Version diffs are stored in ParameterTableDiff. It allows adding new parameters or even removing old parameters. But because we still need to support the JSON format on a RPC calls, and the JSON is defined by a Rust struct, we actually need to define the same set of parameters for all versions. This is done either using serde default values, or by defining the parameter in the base txt file.
  • There is a logical connection for *_send_sir, *_send_not_sir and *_execution, such as in the example below:
action_delete_key_send_sir: 94_946_625_000
action_delete_key_send_not_sir: 94_946_625_000
action_delete_key_execution: 94_946_625_000

The parsing code relies on this specific naming, for example here:

fn fee_json(&self, key: FeeParameter) -> serde_json::Value {
json!( {
"send_sir": self.get(format!("{key}_send_sir").parse().unwrap()),
"send_not_sir": self.get(format!("{key}_send_not_sir").parse().unwrap()),
"execution": self.get(format!("{key}_execution").parse().unwrap()),
})
}
}

(What is this SIR anyway?)

TBD

Syntax of parameter definition

How should we define parameter weights? Conceptually, these are just a set of new parameters.

  • One option is to double the number of parameters and just slap a _weight suffix to all existing parameters. That expanding the hack we have done for send_sir / send_not_sir / execution. I don't like it but it's an option.
  • An alternative would be to add more information on the same line. For example
action_delete_key_send_sir: 94_946_625_000
action_delete_key_send_not_sir: 94_946_625_000
action_delete_key_execution: 94_946_625_000         (weight=1.5)

The example above would add a weight to the execution cost of deleting a key.
This is also hacky and it would make the parsing logic quite a bit more complicated. But it wouldn't blow up the list of all parameters too much. And it might be scripting friendly, as a single line is enough to know he weight adjusted parameter value.

Semantics of parameter definitions

There are open questions around the semantics that we need to define. Here is a list that should be extended as more questions come up.

  • Should we allow defining a weight for a parameter that is not a gas cost?
  • What data type is the weight? (1.1 in a f64 cannot be represented fully accurate, so either we should define if we want f32, f64, or something else like Ratio.)
  • Similar to the last point, we must define the multiplication used. (Gas is defined as u64.)
@jakmeier jakmeier added A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) A-params-estimator Area: runtime params estimator T-contract-runtime Team: issues relevant to the contract runtime team labels Dec 21, 2022
@aborg-dev
Copy link
Contributor

Thanks for detailed description, Jakob!

I'll start with the following implementation:

  • Extend the config file to allow specifying weight on the same line as the parameter [1]
  • Use Ratio type to represent weights
  • Multiply ratio and u64 in ratios and round up the resulting gas cost when converting to u64
  • I'm not sure about the part about weights for parameters other than the gas cost, but my first hunch is not doing this until we need it, please let me know if there is already a known use-case here

[1] I'll be interested to re-evaluate the decision to use custom text format (instead of json that was used before) for representing the config, but that could wait until the time we need other extensions in the config that require larger changes to offset the cost of another migration

@jakmeier
Copy link
Contributor Author

jakmeier commented Jan 3, 2023

Sounds good!

[1] I'll be interested to re-evaluate the decision to use custom text format (instead of json that was used before) for representing the config, but that could wait until the time we need other extensions in the config that require larger changes to offset the cost of another migration

Yes we can discuss this. To give you some context, the main reason at the time was that diffs were really hard to read. And the secondary reason was that iteration over parameters was impossible. The flat custom format seemed like a useful step to make at the time. But maybe we could achieve the same goals with a more standard format like JSON. That is, I think the main problem was how things were arranged inside the JSON, and how it was coupled with our code, not the fact that it was JSON.

@aborg-dev
Copy link
Contributor

I've looked into migrating to a more extensible and standard format for representing the parameters file, specifically into using YAML and TOML. Overall both formats are good from human readability perspective and would make it easy to add the weight parameter. YAML is much closer to the current parameters.txt representation, so that's what I'm investigating deeper right now.

Here is an example change for migrating parameters.txt to YAML: 68ce635
Note, that the actual storage of parameters in ParameterTable does not change at all. I think there is an opportunity to get rid of serde_json::Value there altogether in favor of a more specialized enum (see struct example below) but that could be done incrementally.

Things that we gain after migrating to YAML:

  • Extensibility to represent weights and some other structured parameters:
burnt_gas_reward: { numerator: 3, denominator: 10 }

action_receipt_creation_send_sir: { base: 108_059_500_000, weigth: { numerator: 15, denominator: 10 } }

# Or even this.
action_receipt_creation: {
  send_sir: { base: 108_059_500_000, weigth: { numerator: 15, denominator: 10 } }
  send_not_sir: 108_059_500_000
  execution: 108_059_500_000
}

Arguably, we have this with current custom format as well, but in the case of YAML we won't need to write text parsing, only the struct that we expect this to map to:

struct Ratio {
  numerator: u64,
  denominator: u64,
}

enum ParameterValue {
  Number(u64),
  Ratio(Ratio),
  WeightedNumber {
    base: u64,
    weight: Ratio,
  },
  String(String),
}
  • Diff configs also become standard YAML:
# Define value.
action_deploy_contract_per_byte_execution: { new: 64_572_944 }

# Replace value.
action_deploy_contract_per_byte_execution: { old: 6_812_999, new: 64_572_944 }

# Delete value.
action_deploy_contract_per_byte_execution: { old: 6_812_999 }

# Update weight.
action_receipt_creation_send_sir: {
  old: { weigth: 1 }
  new: { weigth: 10 }
}

The downsides of using YAML are:

  • Does not support numeric literals with separators - we'll need to use a function similar to parse_parameter_txt_value for post-processing the parsed YAML. Right now these numeric literals will be parsed as strings
  • Rust serde_yaml crate does not support 128-bit integers - this could change in the future and will simplify our life, but for now we need to resort to storing them as strings similarly to current implementation. This also means that 128-bit integers need to be escaped as strings in the config file as otherwise YAML parser complains about not being able to parse an integer

@jakmeier , would love to hear your opinion on whether this refactoring makes sense and I'm not missing any other important requirements

@jakmeier
Copy link
Contributor Author

jakmeier commented Jan 6, 2023

Thanks for the detailed write-up!

I like the proposal, for several reasons. Having yaml instead of a custom format seems better to me if we want to introduce some structure. And since we (probably) want to have parameter weights soon, preparing this is good timing right now. Oh and I actually like the strings without "" better anyway, so that's a positive change, too.

I also like the enum ParameterValue type that specifies which values we allow in a nice custom format, such as delimiters. But it does mean it will be non-standard yaml. So I'm a bit worried about how scripting on the file would be done. Right now it is simple enough that any scripting language (I'm a huge fan of awk for such things) works. Writing it in yaml, I'm not sure but probably there is an equivalent to what jq is in the json world. But the custom formats will not be supported by such a tool anyway.

However, I don't think anyone is scripting on parameters anyway. (Aside from me occasionally in the past, but I can just use the Rust parsing logic you are about to introduce.) So I'm in favor of moving forward with your proposal and refactor it to use YAML.

Last comment, have you looked at RON as format? It's not very common but it fits quite well into a Rust code base IMO, and it allows comments and numbers like 1_000 out of the box. Could be interesting to at least consider RON instead of YAML.

@aborg-dev
Copy link
Contributor

Cool! I have sent a PR with refactoring to YAML.

I believe our config will still be a standard YAML, I was not planning to introduce any specialized text-parsing logic, only re-interpreting already parsed yaml::Value type into a more restricted type defined by us. So all the standard YAML tools should work, e.g. yq tool works:

yq 'with_entries(select(.key | test("^wasm_")))' core/primitives/res/runtime_configs/parameters.yaml

Also, thanks for pointing out RON, I gave it a try [1] and it indeed simplifies the integers parsing
at the cost of more noisy text format (map keys in quotes, comma separators). The problem with
128-bit integers still holds. I still think YAML is a better choice here, but we can re-evaluate
this decision if we need to store more complex structures.

[1] ef39e9d

@jakmeier
Copy link
Contributor Author

Yeah keeping it parseable by standard tools makes sense. I never used yq but good to learn about it!

I assume if I want to to a product-sum of some sort with yq and the new format, I will need to get rid of the _ somehow. That's what I meant by non-standard format. But it should not be too much of a problem. I already had this step in previous awk script, which was quite easy as the old format was "awk compatible" to split into fields and removing _ in the second field is also easy. But I am sure with the new format I can achieve the same thing with sufficiently complex regular expressions and/or standard text processing unix tools. :-)

near-bulldozer bot pushed a commit that referenced this issue Jan 11, 2023
This is a refactoring PR to enable #8264. For full motivation see #8264 (comment)

This PR only changes the disk format and parsing logic but doesn't change the in-memory representation, so it should not lead to any visible behaviour changes.

The next step in this refactoring would be to investigate getting rid of `serde_json::Value` in favor of `serde_yaml::Value` or even better a specialized struct with much more restricted types to catch invalid config errors earlier.
nikurt pushed a commit to nikurt/nearcore that referenced this issue Jan 15, 2023
This is a refactoring PR to enable near#8264. For full motivation see near#8264 (comment)

This PR only changes the disk format and parsing logic but doesn't change the in-memory representation, so it should not lead to any visible behaviour changes.

The next step in this refactoring would be to investigate getting rid of `serde_json::Value` in favor of `serde_yaml::Value` or even better a specialized struct with much more restricted types to catch invalid config errors earlier.
near-bulldozer bot pushed a commit that referenced this issue Jan 16, 2023
This PR simplifies parameter parsing code by removing JSON usage and instead directly storing parameter values as YAML. Functionally, the code should behave the same for the users of `ParameterTable`.

This is a part of refactoring for #8264.

I'm still pondering using a more specialized enum type to store parameter values, but not yet sure how the implementation will look like, so delaying this until I start implementing parameter weights in config.
@aborg-dev
Copy link
Contributor

aborg-dev commented Jan 18, 2023

I've recently investigated migrating the ParameterTable from being a Map<Parameter, yaml::Value> into a struct

struct ParameterTable {
  first_parameter_name: FirstParameterType,
  second_parameter_name: SecondParameterType,
  third_parameter_name: ThirdParameterType,
  ...
}

The idea was that this will make parsing more robust as we will check that parameter values have correct types during parsing and also will simplify our work with u128 which is not supported by yaml::Value but could be parsed by YAML into an u128 field.

Unfortunately, this approach requires large amount of Macro-programming to enable the following features:

  • Defining the ParameterTableDiff structure based on fields in ParameterTable
  • Defining apply_diff function that needs to iterate over all fields
  • Defining the Parameter enum and the method to lookup a field by the enum value (this is not strictly necessary, but would be nice to have to simplify the migration)

Ultimately, my conclusion is that this is possible, but would make extending and debugging this code more difficult and so is not worth it.

near-bulldozer bot pushed a commit that referenced this issue Jan 19, 2023
This is a part of #8264.

This PR introduces structured values for Rationals in the parameter config in the form
```yaml
burnt_gas_reward: {
  numerator: 3,
  denominator: 10,
}
```

To support this in the code, we now interpret parameter values as one of
```rust
enum ParameterValue {
    Null,
    U64(u64),
    Rational { numerator: isize, denominator: isize },
    String(String),
}
```

In the future the plan is to extend this to support Fee structure and Weighted parameters.

One open Rust-related question here:
- [ ] What is a good way to deduplicate code between `get_number`, `get_string`, `get_rational` methods?

Things to finish:
- [x] Tests for the Rational type
nikurt pushed a commit to nikurt/nearcore that referenced this issue Jan 30, 2023
This is a part of near#8264.

This PR introduces structured values for Rationals in the parameter config in the form
```yaml
burnt_gas_reward: {
  numerator: 3,
  denominator: 10,
}
```

To support this in the code, we now interpret parameter values as one of
```rust
enum ParameterValue {
    Null,
    U64(u64),
    Rational { numerator: isize, denominator: isize },
    String(String),
}
```

In the future the plan is to extend this to support Fee structure and Weighted parameters.

One open Rust-related question here:
- [ ] What is a good way to deduplicate code between `get_number`, `get_string`, `get_rational` methods?

Things to finish:
- [x] Tests for the Rational type
near-bulldozer bot pushed a commit that referenced this issue Mar 21, 2023
**UPDATE:**
This PR has been updated to match the description of Compute Costs NEP: near/NEPs#455

Concretely, now we specify compute costs directly without weights indirection.

--------- 

This PR addresses #8264, in particular allowing to add parameter weights for Storage Costs.

Some open questions and potential improvements:
- [X] Do we need to surface weights in the `views.rs` (RPC interface of the node?)
- [X] Ideally we would avoid specifying the old base value in the diff config if we only want to change the weight. For now we need to do:
```yaml
wasm_storage_read_base: {
    old: { base: 50_000_000_000, weight: 3 },
    new: { base: 50_000_000_000, weight: 7 }
}
```
@aborg-dev
Copy link
Contributor

This issue is resolved with #8426, so closing it given that we have a higher-level tracking issue #8032.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-params-estimator Area: runtime params estimator A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) T-contract-runtime Team: issues relevant to the contract runtime team
Projects
None yet
Development

No branches or pull requests

2 participants