Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separation of Constant, Preset and Configuration variables #2390

Merged
merged 33 commits into from
May 21, 2021
Merged

Conversation

protolambda
Copy link
Contributor

@protolambda protolambda commented May 6, 2021

Feedback welcome. The idea is to make customization more consistent and reliable between clients, without any breaking changes.

Issues

Currently, as someone who configures many local tests, testing and testnets, I face the following issues:

  • The configuration file format is not documented well enough
  • Not all of the clients can read all different config inputs. Differences exist and are not documented.
  • Some configuration is unreliable. There is no special effort in fuzzing different weird or completely unreasonable combinations of configuration.
  • The configuration mixes up chain configuration (state sizes, validator/committee configuration) and network configuration (genesis, fork versions, fork boundaries, time parameters).

And as an implementer of the Spec, I face the following issues:

  • Some configuration literally never changes. For example, signing domains should be constants (signing also includes the fork digest, so chains are already unique). Dynamic configuration for some of these things is inconvenient.
  • Some configuration is dangerous if configured inconsistently. There are implementation assumptions between the relation of many of the constants. For mainnet and minimal configurations these are well known, but e.g. setting a target-committee size to 1 would likely not go well.
  • Some configuration affects the size and shape of the SSZ types. Due to optimizing things, or using stricter typing for compile-time guarantees, this configuration can only be applied during compile-time.

Current client behavior

  • Lighthouse:
    • Supports both mainnet and minimal in the same build
    • Fails to load other configurations which are not compatible for compile-time parts of the config, and relies on CONFIG_NAME (recently removed for other reasons) to detect the base config
  • Nimbus:
    • Had (has?) an undocumented JSON configuration feature to override the more dynamic parts of configuration during runtime. (I couldn't get this to work recently)
    • Supports "arbitrary" configurations, but only as part of the build-process. This makes repeated local short-lived testnets very difficult.
  • Prysm:
    • Has build variants for mainnet and minimal SSZ values
    • Supports "arbitrary" configuration loading, but may fail if compile-time SSZ code does not match
  • Teku:
    • Supports relatively dynamic configuration. Java is does not have the same constraints of compile-time variables
    • Like others, does not document which config var can be changed to what extent

New proposal

Without changing the meaning of any of the configuration variables, we organize them in 3 types of variables:

Constants

We retain all existing constants, and freeze some additional configuration:

  • All DOMAIN_... variables: signing is already unique thanks to the fork-digest part. Various external tooling/smart-contracts also need stability here.
  • Withdrawal prefix bytes: not necessary to modify between different chains

Preset

A preset is defined as a subset of the previous configuration, bundled together statically, intended for different operation modes: mainnet: for full mainnet-like operation, minimal: faster/simplified local testing, etc.

A preset:

  • Reduces the compile-time build targets to 2: mainnet and minimal
  • Preserves consistency between core chain configuration variables
  • Enables comile-time optimization and security improvements
  • Makes the runtime configuration much more lightweight: just change PRESET_BASE as user. The implementation can then simply check if this preset is allowed.
  • Contains all variables that change any of the BeaconState or BeaconBlock types, but still need to be configurable for testing reasons.

Configuration

A configuration is a set of runtime-configurable variables that enable users to set up custom local/test networks.

A configuration specifies a preset with PRESET_BASE. This can be either mainnet, minimal, or optionally refer to a preset provided by the client implementation.

For new testing of forks, we want to start defining this configuration per testcase. Testing the fork-boundaries is really important, and this enables us to do that, without re-compiling the client for every test case.

The configuration contains:

  • Genesis parameters
  • Fork versioning and fork boundaries for different network instances
  • Special boundaries like the Merge transition POW difficulty
  • The slot time, to speed up or stress local testing
  • Eth1 related configuration: follow distance, chain/network ID, deposit contract address
  • Churn limits, withdrawal, and exit times: for testnets that like to onboard validators quicker, and forcefully throw out inactive validators faster.
  • Maybe in the future: P2P Network related parameters (TODO: TBD! Can help with some stress testing of clients, necessary when changing the slot time)

TODO

  • Update specs:
    • Phase0
    • Altair
    • Merge
    • Sharding
    • DAS
    • Custody game
  • Update configs
    • Document PRESET_BASE
    • Update config files to separate preset and configuration
  • Update pyspec
    • Update the way configuration is loaded Pyspec build outputs for each preset
    • Update pytest to handle new config module
    • Update test generators to select configs/presets properly.
    • Update the way configuration is added to the test-vector outputs
      • yield config data from within config decorator
      • document config format for tests
    • Test new improved config loading when using eth2spec as python package
    • Double check test generator output

Later TODOs

  • Apply to more altair tests
  • Backport the configs of live testnets: Pyrmont and Prater

Pyspec changes

  • The spec builder reads the preset YAML files in addition to the markdown, to build mainnet.py and minimal.py instead of a spec.py that needed reloading.
  • The config variables can stay globals in the markdown (for now at least), the spec builder translates FOO_BAR to config.FOO_BAR
  • Config variables are not used in types, only referenced in functions. Use constants or presets for typing.
  • The test decorators can swap out the config value of a spec module temporarily, to change the config efficiently.
  • with_config_overrides and spec_configured_state_test can be used to configure individual spec tests
  • No module reloading necessary anymore

@mcdee
Copy link
Contributor

mcdee commented May 6, 2021

Just a thought, but is minimal so different from mainnet that it has significant value? Could minimal be discarded entirely without significant impact on testing?

The main reductions in minimal compared to mainnet seem to be bulk and periodicity. Bulk relates to the number of validators and related items (committees etc.) and it doesn't seem that mainnet parameters are unreasonable for testing. Periodicity relates to the frequency of changes (slots per epoch, epochs per voting period etc.) and could be more important for transition testing, but are tests in these areas common enough to warrant the complexity of a second config?

Note that I haven't done much testing that has required switching to minimal, so it may be that there are very good reasons for having it. Just thought that as the issue is being looked at it's worth asking the question.

@protolambda
Copy link
Contributor Author

@mcdee

is minimal so different from mainnet that it has significant value? Could minimal be discarded entirely without significant impact on testing?

For spec-testing, yes it is of significant value. But outside of spec-testing, not really. This PR tries to address that: we enable deeper customization during compile-time for testing purposes, but remove the clutter and client-inconsistencies from the standard config. The problem with the mainnet configuration is that it increases the state size by a lot, and slows down things by a lot. Bulk and Periodicity indeed. Running days worth of mainnet-state-transitions in CI takes ages, especially in the unoptimized python spec version. The minimal configuration allows us to run through every spec test, on every change to the specs, catching a lot of issues before they hit implementers.

it doesn't seem that mainnet parameters are unreasonable for testing

Slight disagree here. We do have some mainnet testing (it's a simple --config=mainnet in pytest), and do output test-vectors for it, but there have been exceptions where computation or size are too prohibitive to produce a testcase with mainnet configuration. Having the option of a more minimal standardized variant is very useful. And clients can quickly run the minimal config version of tests, before they run the mainnet version.

Periodicity relates to the frequency of changes (slots per epoch, epochs per voting period etc.) and could be more important for transition testing, but are tests in these areas common enough to warrant the complexity of a second config?

They may not be common, but I think they are important enough to warrant it. And 1 compile-time variant with different constants is not unreasonable complexity to me. This PR attempts to make the separation with runtime configuration clear, to reduce any complexity leak there.

@protolambda
Copy link
Contributor Author

Don't think we should include this in the next release yet. I am happy with the new config approach, but the pyspec part should improve more. The goal is to decorate test cases with fork versions / fork boundaries, and although it enables that, I'm not happy with the spec reloading part.

@protolambda
Copy link
Contributor Author

Rebased onto latest dev.

The current plan is to update the pyspec builder to output a mainnet.py and minimal.py for each phase, with presets loaded at compile-time, like clients. And then put the configuration variables either in a config dict (and reference with e.g. config.ALTAIR_FORK_VERSION in spec) or translate those runtime config references. This way we:

  • Don't need python module reloading anymore (yay speed)
  • Can easily swap fork versions and such per test case.
  • Minimal spec / test changes

And, after the spec clarifies the (relatively small) subset of runtime configuration, we can specify runtime-overrides per test case, to properly test different config values, and ease the fork scheduling problems (the stub values for altair fork version should be overridden in block processing tests, so Altair is actually active)

Comment on lines 24 to 32
# Altair
ALTAIR_FORK_VERSION: 0x01000000
ALTAIR_FORK_EPOCH: 18446744073709551615
# Merge
MERGE_FORK_VERSION: 0x02000000
MERGE_FORK_EPOCH: 18446744073709551615
# Sharding
SHARDING_FORK_VERSION: 0x03000000
SHARDING_FORK_EPOCH: 18446744073709551615
Copy link
Contributor

@hwwhww hwwhww May 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up #2323 (review)
To distinguish between the "configs for test vectors" context and "configs for the network" context, it would be nice to make *_FORK_EPOCH live in a separate file.

What do you think about having a mainnet_fork.yaml which is only for *_FORK_EPOCH?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wait till further refactoring. I'm updating the test cases etc. to support "overrides" first: just write which piece of runtime-configuration is different, per test case. We can make fork testing much easier this way. And I honestly like the single config file better: if it's not too large, and runtime-only, it's just a lot more ergonomic for testnet configuration work.

@protolambda protolambda marked this pull request as ready for review May 18, 2021 19:04
Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Really ripples deep in there...

Generally okay with the approach. Need to make sure it's easy for client teams to digest wht actually changed here maybe with a cleaned up header comment. The diff isn't particularly useful to them.

I had a few questions about config vs preset on some vars. Check it out. Happy to debate

configs/mainnet_config.yaml Outdated Show resolved Hide resolved
# 2**8 (= 256) epochs ~27 hours
MIN_VALIDATOR_WITHDRAWABILITY_DELAY: 256
# 2**8 (= 256) epochs ~27 hours
SHARD_COMMITTEE_PERIOD: 256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit surprised SHARD_COMMITTEE_PERIOD made it in the network config values rather than in a preset. Can you explain your reasoning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used in phase0 for the validator exit queue, so if a testnet wants to allow quick validator exits, it needs to modify this. I agree that for sharding itself it doesn't make a whole lot of sense to not put it in the preset. Maybe we should split the two config variables? I.e. define VALIDATOR_EXIT_DELAY, and update phase0 without changing the value?

configs/mainnet_preset/altair.yaml Outdated Show resolved Hide resolved
configs/minimal_preset/phase0.yaml Outdated Show resolved Hide resolved
| `DOMAIN_CONTRIBUTION_AND_PROOF` | `DomainType('0x09000000')` |
| Name | Value | Description |
| - | - | - |
| `INACTIVITY_SCORE_BIAS` | `uint64(4)` | score points per inactive epoch |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain your choice of making these values configurable?

They seem somewhat similar to the penalty values above that are in presets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make the phase0 variants of inactivity vars configurable, but I imagine we'll have "duplicate" config vars then, and they won't matter for testnets that start post-altair.

configs/minimal_config.yaml Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
tests/core/pyspec/eth2spec/utils/hash_function.py Outdated Show resolved Hide resolved
tests/core/pyspec/eth2spec/test/context.py Show resolved Hide resolved
Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix markdown header

presets/README.md Outdated Show resolved Hide resolved
Co-authored-by: Danny Ryan <dannyjryan@gmail.com>
Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts, not strongly:

  • 👍 on more structural pyspec
  • 👍 on making test case variable configurable by decorator. Nice work!
  • [❓] A little concerned about maintainability.
    • Variables might be moved between config and preset later when we need.
    • configs/mainnet.yaml will contain the configuration cross forks, including the ones in active R&D. IIRC, the client devs don’t like unstable config file that includes variables for the future forks.
    • 👎 on some variables are spec.VARIABLE_NAME and some are spec.config.VARIABLE_NAME in tests. But maybe I will get used to it.
  • [❓ ] The specs were written for the client devs mostly, but it also serves general readers to understand Eth2 protocol. This change exposes more implementation details by separating config and preset.

I'd defer to client devs to make the call. 😄

configs/README.md Outdated Show resolved Hide resolved
tests/core/pyspec/eth2spec/utils/hash_function.py Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
@protolambda
Copy link
Contributor Author

protolambda commented May 19, 2021

Diff of test generation with dev commit, base of this PR: https://gist.github.com/protolambda/48349d01e6101e6f433bcae703caddc4

  • Altair/Merge finality tests were missing on dev
  • Fork tests were missing on dev due to error (recently fixed in Add Altair fork tests to test generators #2427 )
  • The attester slashing test success_with_effective_balance_disparity is different for some reason, investigating that one.
    • update: that one test uses RNG without explicit seeding, resulting in different outputs each time. I'll make the seeding explicit, so we don't run into different test outputs each time.

@protolambda protolambda changed the title Separation of Constant, Preset and Configuration variables (Work In Progress) Separation of Constant, Preset and Configuration variables May 19, 2021
Copy link
Member

@ralexstokes ralexstokes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a lot in here but generally supportive of the change!

ultimately defer to various client teams on how to handle this but do like differentiating the more/less statics parts of the "config"

Copy link
Contributor

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favour of clearly defining the config file semantics, and I think this PR does a good job of that.

I skimmed through Lighthouse's compile-time config (EthSpec) and runtime config (ChainSpec) to see how it matched up and didn't spot any major issues. There are just a few consts/preset-level values in the ChainSpec, which I've documented below.

I don't think we have the appetite for a large refactor at the moment, so if this PR is adopted we'll likely continue mapping standard configs onto Lighthouse's internal representation, and then slowly migrate towards using the standard internally.

Other minor points:

  • There's no central definition of constants, which could be useful to have
  • I agree with @hwwhww that having clarity about what forks are being parsed from a config file is desirable. There aren't any Altair/merge/sharding params in the config right now, would the intention be to mix them in with the phase0 params, or have them in separate files like the presets?
  • Backwards-compatibility for the /eth/v1/config/spec HTTP API is also something to consider. Right now Lighthouse & Teku interoperate OK using the current config. If this PR is adopted I think the best chance of achieving backwards compatibility will be to require the API endpoint to returning the union of the preset & config?
ChainSpec differences

Consts in ChainSpec:

  • genesis_slot
  • far_future_epoch
  • base_rewards_per_epoch
  • deposit_contract_tree_depth
  • min_per_epoch_churn_limit
  • churn_limit_quotient
  • bls_withdrawal_prefix_byte
  • all the domains

Preset params in ChainSpec:

  • max_committees_per_slot
  • target_committee_size
  • shuffle_round_count
  • hysteresis_quotient
  • hysteresis_downward_multiplier
  • hysteresis_upward_multiplier
  • proportional_slashing_multiplier
  • min_deposit_amount
  • max_effective_balance
  • ejection_balance
  • effective_balance_increment
  • min_attestation_inclusion_delay
  • min_seed_lookahead
  • max_seed_lookahead
  • min_epochs_to_inactivity_penalty
  • base_reward_factor
  • whistleblower_reward_quotient
  • proposer_reward_quotient
  • inactivity_penalty_quotient
  • min_slashing_penalty_quotient
  • safe_slots_to_update_justified

@ajsutton
Copy link
Contributor

Definitely makes sense to me - being able to setup custom testnets more easily is definitely a win.

This doesn't really make any difference for Teku as all the config options from the spec are completely dynamic for us anyway (ie minimal and mainnet are both fully supported in all builds, and we actually have a couple of other variants we use to make it possible for acceptance tests to finalise quickly). So the change for us ultimately just boils down to loading the values from more files and supporting the "PRESET_BASE" field to specify which preset file to mix in.

@michaelsproul makes a good point about /eth/v1/config/spec - for compatibility we should flatten the options and then I think it winds up producing the same output. It's probably worth including the additional PRESET_BASE value in the output as then you can create the new config file out of the response by just ignoring anything that's now in PRESET. We may want to consider a /eth/v2/config/spec endpoint which specifically separates the preset and config values.

@wemeetagain
Copy link
Contributor

While the preset vs config distinction could be formulated at the implementation-level, rather than spec-level, I think including this in the spec gives more clarity about how much certain params are expected to change and will help with more easily (and consistently) setting up testnets across clients.

I think this will be a big win for Lodestar, which currently has teku-level of dynamic behavior. We can likely simplify a lot of our code and improve devex once our ssz types are known ahead of time and no longer depend on run-time config.

Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

putting it in alpha.6. great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants