Test Definition Format: (Fork Choice Immediate Message Driven GHOST) #420
What is RPJ? |
So the task is to come up with some language-agnostic format for describing tests. Given the example in the first link:
We understand that the first values are the test configuration:
message Configuration {
int64 validator_count = 1;
int64 epoch_length = 2;
int64 shard_count = 3;
int64 minimum_committee_size = 4;
}
And then we have a list of validator actions that should occur to test the fork choice.
Finally, we have the chain head definition. Open question:
message ValidationVoteCondition {
repeated int64 validators = 1;
string object_id = 2;
}
message Object {
string id = 1;
string parent_id = 2;
bool is_genesis = 3;
}
message ForkChoiceTest {
Configuration config = 1;
repeated Object objects = 2;
repeated ValidationVoteCondition vote_conditions = 3;
string expected_head_object_id = 4;
}
Full
# Test configuration
config {
validator_count: 100
epoch_length: 8
shard_count: 32
minimum_committee_size: 8
}
# Object definitions
objects {
id: "*"
is_genesis: true
}
objects {
id: "A"
parent_id: "*"
}
objects {
id: "B"
parent_id: "A"
}
objects {
id: "C"
parent_id: "A"
}
objects {
id: "D"
parent_id: "C"
}
objects {
id: "E"
parent_id: "D"
}
# Validator vote condition definitions
# Slot 0
vote_conditions {
validators: 0
validators: 1
validators: 2
validators: 3
validators: 4
validators: 5
object_id: "A"
}
# Slot 1
vote_conditions {
validators: 0
validators: 1
validators: 2
validators: 3
validators: 4
validators: 5
object_id: "B"
}
# Slot 2
vote_conditions {
validators: 2
validators: 3
validators: 4
validators: 5
validators: 6
validators: 7
object_id: "B"
}
# Slot 3
vote_conditions {
validators: 2
validators: 3
validators: 4
validators: 5
validators: 6
validators: 7
validators: 6
validators: 7
object_id: "B"
}
# Slot 4
vote_conditions {
validators: 0
validators: 1
validators: 2
validators: 3
validators: 4
object_id: "D"
}
# Slot 5 is empty so not defined
# Slot 6 is split into two conditions
# Slot 6, condition 0
vote_conditions {
validators: 0
validators: 1
validators: 2
validators: 3
validators: 4
object_id: "E"
}
# Slot 6, condition 1
vote_conditions {
validators: 5
validators: 6
validators: 7
object_id: "B"
}
# Final expected head object
expected_head_object_id: "E"
The above could also be represented in JSON format with the same data. I have chosen protobufs due to their type-safe nature. The key advantage here over a parsable text format is that protobuf is already understood by most languages, so a parser does not need to be implemented for each language. Otherwise, there might be a scenario where a parser is incorrect and gives false positives on fork choice rule tests. |
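To illustrate the remark that the same data could be carried as JSON, here is a minimal Python sketch. The field names are taken from the protobuf messages above; the dict layout and the `validate` helper are assumptions made for illustration, not part of any proposed format.

```python
import json

# Hypothetical JSON-ready mirror of the ForkChoiceTest example above.
# Field names follow the protobuf definitions; the layout is an assumption.
fork_choice_test = {
    "config": {
        "validator_count": 100,
        "epoch_length": 8,
        "shard_count": 32,
        "minimum_committee_size": 8,
    },
    "objects": [
        {"id": "*", "is_genesis": True},
        {"id": "A", "parent_id": "*"},
        {"id": "B", "parent_id": "A"},
        {"id": "C", "parent_id": "A"},
        {"id": "D", "parent_id": "C"},
        {"id": "E", "parent_id": "D"},
    ],
    # One entry per vote condition, in the order given in the example.
    "vote_conditions": [
        {"validators": [0, 1, 2, 3, 4, 5], "object_id": "A"},
        {"validators": [0, 1, 2, 3, 4, 5], "object_id": "B"},
        {"validators": [2, 3, 4, 5, 6, 7], "object_id": "B"},
        {"validators": [2, 3, 4, 5, 6, 7, 6, 7], "object_id": "B"},
        {"validators": [0, 1, 2, 3, 4], "object_id": "D"},
        {"validators": [0, 1, 2, 3, 4], "object_id": "E"},
        {"validators": [5, 6, 7], "object_id": "B"},
    ],
    "expected_head_object_id": "E",
}


def validate(test):
    """Return a list of problems found in a test definition (empty if none)."""
    problems = []
    ids = {obj["id"] for obj in test["objects"]}
    genesis = [obj["id"] for obj in test["objects"] if obj.get("is_genesis")]
    if len(genesis) != 1:
        problems.append(f"expected exactly one genesis object, found {len(genesis)}")
    for obj in test["objects"]:
        if not obj.get("is_genesis") and obj.get("parent_id") not in ids:
            problems.append(f"object {obj['id']!r} has an unknown parent")
    for cond in test["vote_conditions"]:
        if cond["object_id"] not in ids:
            problems.append(f"vote targets unknown object {cond['object_id']!r}")
    if test["expected_head_object_id"] not in ids:
        problems.append("expected head is not a defined object")
    return problems


# Round-trip through JSON to show that no information is lost.
encoded = json.dumps(fork_choice_test, indent=2)
decoded = json.loads(encoded)
```

JSON carries no schema of its own, so a small structural check such as `validate` stands in here for part of what the protobuf types would otherwise enforce.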
This is great. Thanks a lot for the comprehensive thoughts! @prestonvanloon
I see that slot ordering can be used for certain shorthands. For example, [2] means the validators defined in slot 2, and [=6] enforces that this particular definition is explicitly at slot 6.
|
A slot is referring to a segment of time? Why? My other question is why reference another slot rather than denormalize the data? These slot references or concatenations add additional complexity. I don’t see any time related functionality here. Rather we are establishing the chain state-of-the-universe for the fork choice rule to decide the head. |
@prestonvanloon Agree on the slot references: they needlessly complicate things, and I do not really see the advantage of having them. Why not just explicitly state what the data is? |
Slots are actually important to this construction. The current shuffling of validators ( I haven't reviewed your notes in detail yet. On the run currently! Wanted to drop that piece of info in, and I'll come back with more notes later. |
Maybe we could define what a slot actually is and how they are related, if at all, and why are they related? If slots are related to shuffling, are we assured that the shuffling is deterministic? |
Because of the epoch-less Casper consensus rules, a validator attesting to a block at slot N casts a Casper vote for the block at slot N and for the blocks in the previous CYCLE_LENGTH-1 slots (voting on a total of CYCLE_LENGTH slots). If some of the slots covered by an attestation have no block, those slots are simply skipped. Note that it is not the previous CYCLE_LENGTH blocks that receive votes, but slots. This notion of an attestation casting votes for ancestor blocks is key in assessing both the fork choice and the rules around justification/finalization. For the purposes of these tests, an RNG for shuffling can be assumed, so the shuffling is deterministic. The exact shuffling is unimportant; what matters is that only a subset of validators can attest at each slot. Say there are 100 validators and 10 slots: there are then 10 validators per slot with no overlap. So you would say, build a block B at slot 0 and have validators 0-9 attest. If you did the same at slot 1, the validators 0-9 attesting at slot 1 would be entirely different from the validators at slot 0. This again is important when assessing justification/finalization, as you count total votes on blocks from individual validators. I'm open to alternative schemes to write these tests out in, but I am pretty certain we cannot remove the notion of slots (and empty slots, for that matter). If we move toward a scheme similar to the one you have proposed, I would lean toward JSON. |
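The two ideas in this comment (a fixed subset of validators per slot, and an attestation voting over a window of slots rather than blocks) can be sketched in Python. This is only an illustration under stated assumptions: committees are contiguous index ranges with no shuffling, `CYCLE_LENGTH` is set to 10 to match the 100-validator example, and the function names are hypothetical.

```python
CYCLE_LENGTH = 10      # assumed value, matching the "10 slots" example
VALIDATOR_COUNT = 100  # matching the "100 validators" example


def committee_for_slot(slot, validator_count=VALIDATOR_COUNT, cycle_length=CYCLE_LENGTH):
    """Global validator indices allowed to attest at `slot`.

    Assumption: no shuffling, committees are contiguous and non-overlapping.
    """
    size = validator_count // cycle_length
    start = (slot % cycle_length) * size
    return list(range(start, start + size))


def voted_blocks(attested_slot, block_at_slot, cycle_length=CYCLE_LENGTH):
    """Blocks that receive a vote from an attestation made at `attested_slot`.

    `block_at_slot` maps slot -> block id along the attested chain.
    The window covers slots, not blocks, so empty slots are skipped.
    """
    first = max(0, attested_slot - cycle_length + 1)
    return [block_at_slot[s] for s in range(first, attested_slot + 1) if s in block_at_slot]


# Slot 2 is empty: with a window of 3 slots, an attestation at slot 3
# votes for the blocks at slots 1 and 3 only.
chain = {0: "A", 1: "B", 3: "C"}
window_votes = voted_blocks(3, chain, cycle_length=3)
```

Under these assumptions, local positions 0-9 at slot 0 and at slot 1 map to different global validators, which is why the votes can be summed per validator.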
Note: ChihCheng has moved forward with the string parser for now, but we can replace the parser and tests with a different format and still keep the guts the same when we decide on a more favorable format here. |
Link to @ChihChengLiang's PR: ethereum/beacon_chain#80
Why do we need shuffling here if this is testing the fork choice rule? My interpretation of this test is to build a scenario for the fork choice rule to select the proper head given the information provided to the method.
To answer my earlier question: What is a slot and how are they related? @djrtwo answered offline as follows:
I don't understand why we need slots that are applied in a sequential order, given that the test is to determine the final head at the end of the scenario. Consider the above scenario of block hierarchy and some vote information. Example:
*Note: Expected behavior is just for understanding and is not directly tested in this framework. Key information for this method is whether or not the validator was randomly selected during the epoch/slot where they should vote, what their vote was, and what their index is. |
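As a rough illustration of "select the proper head given a block hierarchy and some vote information", here is a simplified latest-message GHOST walk in Python, run over the objects and vote conditions from the protobuf example earlier in the thread. It is a sketch, not the specified fork choice rule. It assumes that a validator index identifies the same validator across vote conditions, that later conditions override earlier ones, and that ties are broken by block id. The name `ghost_head` is hypothetical.

```python
from collections import defaultdict


def ghost_head(parents, latest_votes):
    """Walk from genesis, always following the child with the heaviest subtree.

    parents: block id -> parent id (the genesis block maps to None).
    latest_votes: validator index -> block id of that validator's latest vote.
    """
    children = defaultdict(list)
    for block, parent in parents.items():
        if parent is not None:
            children[parent].append(block)

    # weight[b] = number of latest votes for b or any descendant of b
    weight = defaultdict(int)
    for target in latest_votes.values():
        block = target
        while block is not None:
            weight[block] += 1
            block = parents[block]

    head = next(b for b, p in parents.items() if p is None)
    while children[head]:
        # Tie-break by block id: an arbitrary choice made for this sketch.
        head = max(children[head], key=lambda b: (weight[b], b))
    return head


parents = {"*": None, "A": "*", "B": "A", "C": "A", "D": "C", "E": "D"}
vote_conditions = [
    ([0, 1, 2, 3, 4, 5], "A"),
    ([0, 1, 2, 3, 4, 5], "B"),
    ([2, 3, 4, 5, 6, 7], "B"),
    ([2, 3, 4, 5, 6, 7, 6, 7], "B"),
    ([0, 1, 2, 3, 4], "D"),
    ([0, 1, 2, 3, 4], "E"),
    ([5, 6, 7], "B"),
]

# Later conditions override earlier ones for the same validator index.
latest_votes = {}
for validators, object_id in vote_conditions:
    for v in validators:
        latest_votes[v] = object_id

head = ghost_head(parents, latest_votes)
```

With these assumptions, validators 0-4 end on "E" and validators 5-7 end on "B", so the walk prefers the C/D/E branch at "A" and agrees with the example's expected head. Note that this sketch ignores slots entirely, which is the simplification the surrounding comments are debating.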
We don't really need to be shuffling anything, but we do need the concept of only a subset of validators per slot (~ The fork choice rule is not simply based on the blocks that are built, but also on the votes the block has received. Votes come from attestations. An attestation of a block at slot N, applies votes to that block and all its ancestors that are in the previous Happy to hop on a call to explain. I think we might be having something lost in our text communication. |
The reason I went with a human-writable language is that it really does make life easier for a human to easily be able to crank out a test in 2 minutes without having to fiddle around with tools. I'd personally take JSON over protobuf text format, as most existing ethereum tests are in JSON; I suppose we could get the same gains as we had from the human-writable language by creating a translator? It would not be too difficult, basically the same code that clients would have had to implement to parse the language anyway. |
I agree with JSON. My suggestion for protobuf is to provide the type safety and schema for non-type-safe languages. However, protobuf text format is... not great. It's actually quite bizarre and hard to understand initially. This suggestion might be overkill or an imposition on others. Back to JSON: I'd imagine that it would be possible to parse the originally proposed language into JSON, and then we only need one implementation of the parser. I'm interested to hear if any other team has an opinion in tomorrow's call, and maybe we can clarify the specific goals of the testing framework. @djrtwo If you have time, could we schedule a short call to hash out some of my questions? So the test data need to represent:
Is that right? Otherwise, how are we determining which of the 100 validators are assigned to a given slot? |
Following yesterday's call, reposting my comments from the beacon chain repo
In summary, a human-readable/writable format is good, but it needs comments (which JSON doesn't support). I do prefer using a widespread format rather than a custom one we would need to write parsers for. I feel like TOML or YAML can fit all those concerns.
Comparison of TOML and YAML
From https://gist.github.com/oconnor663/9aeb4ed56394cb013a20, with YAML fixed
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00
[database]
server = "192.168.1.1"
ports = [ 8001, 8001, 8002 ]
connection_max = 5000
enabled = true
[servers]
[servers.alpha]
ip = "10.0.0.1"
dc = "eqdc10"
[servers.beta]
ip = "10.0.0.2"
dc = "eqdc10"
[clients]
data = [ ["gamma", "delta"], [1, 2] ]
hosts = [
"alpha",
"omega"
]
title: YAML Example
owner:
  name: Tom Preston-Werner
  dob: 1979-05-27T07:32:00-08:00
database:
  server: 192.168.1.1
  ports: [ 8001, 8001, 8002 ]
  connection_max: 5000
  enabled: true
servers:
  alpha:
    ip: 10.0.0.1
    dc: eqdc10
  beta:
    ip: 10.0.0.2
    dc: eqdc10
clients:
  data: [ [gamma, delta], [1, 2] ]
  hosts:
    - alpha
    - omega
And adding this TOML quirk, from the strictyaml documentation:
TOML
[[fruit]]
name = "apple"
[fruit.physical]
color = "red"
shape = "round"
vs YAML
fruit:
  name: apple
  physical:
    color: red
    shape: round
(StrictYAML is a type-safe restricted subset of YAML, for example strings must be quoted.) |
@mratsim YAML is a good compromise. +1. |
To shed some light on how this test language would be applied, let me paste here the chain test example from the deprecated hybrid Casper.
Note that in the example, the test language is agnostic to the client, database, or server. The tests focus on the chain data and usually test the consistency of the fork choice rule by [S]aving block hashes, [R]everting to saved hashes, building a fork, and finally checking the desired [H]ead at the end of the string. With the test string, it is easy and quick to write a complicated testing scenario in a one-liner. |
My primary issue with the string format is that it is difficult to read or understand. It seems reasonable to me that we could come up with some mechanism to easily create test scenarios while making it easy to read and understand. This could be writing a string language that can generate the test code in strictYAML or a GUI if someone was so inclined to do so. |
Agreed on YAML being a good option. @ChihChengLiang We can build and support a simple Python tool that converts the string representation into YAML output. I'll write up a new proposal taking into account @prestonvanloon's and @mratsim's notes/questions. |
Proposed YAML format: https://notes.ethereum.org/s/r11GVSBuQ Leave any comments here. We can discuss in depth at our Thursday meeting. |
The test format for Casper fork choice rule is up for discussion here
As a team, we should decide if we like the format and come up with a comprehensive design doc. It's also important we study what other client teams are doing for synchrony. (Ex: ethereum/beacon_chain#58)
This issue is associated with #255
More on fork choice rule here: https://ethresear.ch/t/immediate-message-driven-ghost-as-ffg-fork-choice-rule/2561