Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization improvements #248

Closed
wants to merge 17 commits into from
Closed
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
## Pending

- Serialization refactor [#247]
- CommitSig enum implemented [#246][#247]

## [0.13.0] (2020-04-20)

Dependencies:
Expand Down
12 changes: 2 additions & 10 deletions tendermint/src/abci/proof.rs
Original file line number Diff line number Diff line change
Expand Up @@ -19,18 +19,10 @@ pub struct ProofOp {
#[serde(alias = "type")]
pub field_type: String,
/// Key of the ProofOp
#[serde(
default,
serialize_with = "serializers::serialize_base64",
deserialize_with = "serializers::parse_base64"
)]
#[serde(default, with = "serializers::bytes::base64string")]
pub key: Vec<u8>,
/// Actual data
#[serde(
default,
serialize_with = "serializers::serialize_base64",
deserialize_with = "serializers::parse_base64"
)]
#[serde(default, with = "serializers::bytes::base64string")]
pub data: Vec<u8>,
}

Expand Down
5 changes: 1 addition & 4 deletions tendermint/src/block.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,10 +49,7 @@ where
#[derive(Deserialize)]
struct TmpCommit {
pub height: Height,
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub round: u64,
#[serde(deserialize_with = "serializers::parse_non_empty_block_id")]
pub block_id: Option<Id>,
Expand Down
5 changes: 1 addition & 4 deletions tendermint/src/block/commit.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,7 @@ pub struct Commit {
pub height: Height,

/// Round
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub round: u64,

/// Block ID
Expand Down
214 changes: 156 additions & 58 deletions tendermint/src/block/commit_sig.rs
Original file line number Diff line number Diff line change
@@ -1,78 +1,176 @@
//! CommitSig within Commit

use crate::serializers;
use crate::{account, Signature, Time};
use serde::{de::Error as _, Deserialize, Deserializer, Serialize, Serializer};
use serde::de::{self, Error, MapAccess, Visitor};
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

/// BlockIDFlag is used to indicate the validator has voted either for nil, a particular BlockID or was absent.
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum BlockIDFlag {
/// BlockIDFlagAbsent - no vote was received from a validator.
BlockIDFlagAbsent = 1,
/// BlockIDFlagCommit - voted for the Commit.BlockID.
BlockIDFlagCommit = 2,
/// BlockIDFlagNil - voted for nil.
BlockIDFlagNil = 3,
/// CommitSig represents a signature of a validator.
/// It's a part of the Commit and can be used to reconstruct the vote set given the validator set.
#[derive(Clone, Debug, PartialEq)]
pub enum CommitSig {
/// no vote was received from a validator.
BlockIDFlagAbsent {
/// Validator address
validator_address: account::Id,
},
/// voted for the Commit.BlockID.
BlockIDFlagCommit {
/// Validator address
validator_address: account::Id,
/// Timestamp of vote
timestamp: Time,
/// Signature of vote
signature: Signature,
},
/// voted for nil.
BlockIDFlagNil {
/// Validator address
validator_address: account::Id,
/// Timestamp of vote
timestamp: Time,
/// Signature of vote
signature: Signature,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Representing this as an enum is a nice idea 👍

}

impl BlockIDFlag {
/// Deserialize this type from a byte
pub fn from_u8(byte: u8) -> Option<BlockIDFlag> {
match byte {
1 => Some(BlockIDFlag::BlockIDFlagAbsent),
2 => Some(BlockIDFlag::BlockIDFlagCommit),
3 => Some(BlockIDFlag::BlockIDFlagNil),
_ => None,
/// CommitSig implementation
impl CommitSig {
/// Helper: Extract validator address, since it's always present (according to ADR-025)
Copy link
Member

@liamsi liamsi May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should clarify this first: #246 (comment)

I can very well be that adr isn't representing the full decision anymore, or, that it is slightly confusing nil and absent votes. Also: #196 (comment)

cc @melekes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created tracking issue #260 about this.

pub fn validator_address(&self) -> &account::Id {
match &self {
CommitSig::BlockIDFlagAbsent { validator_address } => validator_address,
CommitSig::BlockIDFlagCommit {
validator_address, ..
} => validator_address,
CommitSig::BlockIDFlagNil {
validator_address, ..
} => validator_address,
}
}

/// Serialize this type as a byte
pub fn to_u8(self) -> u8 {
self as u8
}

/// Serialize this type as a 32-bit unsigned integer
pub fn to_u32(self) -> u32 {
self as u32
}
}

impl Serialize for BlockIDFlag {
fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
self.to_u8().serialize(serializer)
impl Serialize for CommitSig {
fn serialize<S>(
&self,
_serializer: S,
) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where
S: Serializer,
{
unimplemented!("CommitSig serialization is not implemented yet")
}
}

impl<'de> Deserialize<'de> for BlockIDFlag {
fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
let byte = u8::deserialize(deserializer)?;
BlockIDFlag::from_u8(byte)
.ok_or_else(|| D::Error::custom(format!("invalid block ID flag: {}", byte)))
}
}
impl<'de> Deserialize<'de> for CommitSig {
fn deserialize<D>(deserializer: D) -> Result<Self, <D as Deserializer<'de>>::Error>
where
D: Deserializer<'de>,
{
struct CommitSigVisitor;

/// CommitSig represents a signature of a validator.
/// It's a part of the Commit and can be used to reconstruct the vote set given the validator set.
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq)]
pub struct CommitSig {
/// Block ID FLag
pub block_id_flag: BlockIDFlag,
impl<'de> Visitor<'de> for CommitSigVisitor {
type Value = CommitSig;

/// Validator address
#[serde(deserialize_with = "serializers::parse_non_empty_id")]
pub validator_address: Option<account::Id>,
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("CommitSig")
}

/// Timestamp
pub timestamp: Time,
/// Deserialize map Visitor implementation
// Implements decision: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-025-commit.md#decision
fn visit_map<A>(self, map: A) -> Result<Self::Value, <A as MapAccess<'de>>::Error>
where
A: MapAccess<'de>,
{
// Instead of manually deserializing the whole struct (cumbersome), we use annotations
fn option_signature<'de, D>(deserializer: D) -> Result<Option<Signature>, D::Error>
where
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(|x: Option<_>| x.unwrap_or(None))
}
#[derive(Deserialize)]
struct CommitSigSpec {
block_id_flag: u8,
validator_address: account::Id,
#[serde(default)]
timestamp: Option<Time>,
#[serde(default, deserialize_with = "option_signature")]
signature: Option<Signature>,
}

/// Signature
#[serde(deserialize_with = "serializers::parse_non_empty_signature")]
pub signature: Option<Signature>,
}
// `MapAccessDeserializer` is a wrapper that turns a `MapAccess`
// into a `Deserializer`, allowing it to be used as the input to T's
// `Deserialize` implementation. T then deserializes itself using
// the entries from the map visitor.
let incoming =
CommitSigSpec::deserialize(de::value::MapAccessDeserializer::new(map))?;

impl CommitSig {
/// Checks if a validator's vote is absent
pub fn is_absent(&self) -> bool {
self.block_id_flag == BlockIDFlag::BlockIDFlagAbsent
// Validate CommitSig (strict)
match incoming.block_id_flag {
// BlockIDFlagAbsent
1 => {
if incoming.timestamp.is_some() {
return Err(A::Error::custom(format!(
"timestamp is present for BlockIDFlagAbsent CommitSig {}",
incoming.timestamp.unwrap()
)));
}
if incoming.signature.is_some() {
return Err(A::Error::custom(format!(
"signature is present for BlockIDFlagAbsent CommitSig {:?}",
incoming.signature.unwrap()
)));
}
Ok(CommitSig::BlockIDFlagAbsent {
validator_address: incoming.validator_address,
})
}
// BlockIDFlagCommit
2 => {
if incoming.timestamp.is_none() {
Err(A::Error::custom(
"timestamp is null for BlockIDFlagCommit CommitSig",
))
} else if incoming.signature.is_none() {
Err(A::Error::custom(
"signature is null for BlockIDFlagCommit CommitSig",
))
} else {
Ok(CommitSig::BlockIDFlagCommit {
validator_address: incoming.validator_address,
timestamp: incoming.timestamp.unwrap(),
signature: incoming.signature.unwrap(),
})
}
}
// BlockIDFlagNil
3 => {
if incoming.timestamp.is_none() {
Err(A::Error::custom(
"timestamp is null for BlockIDFlagNil CommitSig",
))
} else if incoming.signature.is_none() {
Err(A::Error::custom(
"signature is null for BlockIDFlagNil CommitSig",
))
} else {
Ok(CommitSig::BlockIDFlagNil {
validator_address: incoming.validator_address,
timestamp: incoming.timestamp.unwrap(),
signature: incoming.signature.unwrap(),
})
}
}
// Error: unknown CommitSig type
e => Err(A::Error::custom(format!(
"unknown BlockIdFlag value in CommitSig {}",
e
))),
}
}
}

deserializer.deserialize_map(CommitSigVisitor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the enum idea is cool, it seems to come with the cost of a rather complicated Deserialize impl for commitSig.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and that's the tradeoff @brapse was going to go for in an earlier discussion. (Feel free to engage Sean.) The idea is that serialization is not actually the core product here and during the usual development it's just annoying to do something some particular way just because serialization says so.
This approach puts the logic of making sure that the incoming data is deserialized properly (with some basic validation) and the outgoing data is serialized properly in the serialization "layer" and the internal workings of the application can become completely independent from it. For example it can make use of the additional features of Rust better.
Just to highlight it even more: if/when the Tendermint data model changes because the Go + Rust teams decide on some other implementation, the changes can be implemented in this layer without affecting the application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit there is one not-completely-Rust-native-best-practice feature in the code and it annoys me slightly. I deserialize the incoming CommitSig into an internal CommitSigSpec structure that is defined in ADR-025. I could have gone hard-core and implement complete deserialization while parsing the incoming text. But that would be even more convoluted and harder to change down the road. I feel this is a good compromise that allows us to easily update the deserialization code, if a new ADR comes by.

Copy link
Member

@liamsi liamsi May 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and that's the tradeoff @brapse was going to go for in an earlier discussion. (Feel free to engage Sean.) The idea is that serialization is not actually the core product here and during the usual development it's just annoying to do something some particular way just because serialization says so.

That is absolutely reasonable. Serialization isn't the core product and we shouldn't build our core types around it. All in favour of decoupling serialization concerns and core types (I like this approach for this: #197 (comment) by @tarcieri ).

What I was referring to was that the seemingly beautiful change made the part where the serialization actually happens difficult to read and quite complex. A simpler way might be to have a serialization type and a From / Into for that and the actual type. This is very similar to what you are already doing with the CommitSigSpec type but simpler to parse for a human :-)

Also, it would be cool if the CommitSigSpec would use an enum instead of an u8 for the block_id_flag. Then one wouldn't need to annotate the match arms with meaning (e.g. 1 => {...} with // BlockIDFlagAbsent).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and done: Today's commit moved things into serde's "from" annotation. So now, the code looks a bit cleaner, although it's not less code:
In the serializers, a new RawCommitSig type was introduced that deserializes from JSON and it looks very much like the definition in the ADR (including the enum for BlockIDFlags). Deserialization automatically invokes basic type-checking of the struct (like an u8 has to be an u8).
The Rust-native CommitSig type converts from this RawCommitSig type through it's "From" trait which holds the validation for the struct. This logic is more about "if this variable is set, than that other variable also has to have a value".

So the deserializer does type-check and the From trait does validation. My only grief is that type-checking happens in the serializers library while validation happens in the struct's library (in this case at CommitSig). Interestingly, this is similar to how it used to be, so I'm guessing the developers will be fine with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the code looks cleaner that is already worth it 👍 And I think it is OK for the core type to defines when it's valid (s.t. you can only construct a valid type).

}
}
15 changes: 3 additions & 12 deletions tendermint/src/block/header.rs
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,7 @@ pub struct Header {
pub consensus_hash: Hash,

/// State after txs from the previous block
#[serde(
serialize_with = "serializers::serialize_hex",
deserialize_with = "serializers::parse_hex"
)]
#[serde(with = "serializers::bytes::hexstring")]
pub app_hash: Vec<u8>,

/// Root hash of all results from the txs from the previous block
Expand All @@ -70,17 +67,11 @@ pub struct Header {
#[derive(Serialize, Deserialize, Clone, PartialEq, Debug)]
pub struct Version {
/// Block version
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub block: u64,

/// App version
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub app: u64,
}

Expand Down
5 changes: 1 addition & 4 deletions tendermint/src/block/parts.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,7 @@ use {
#[derive(Serialize, Deserialize, Clone, Debug, Hash, Eq, PartialEq, PartialOrd, Ord)]
pub struct Header {
/// Number of parts in this block
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub total: u64,

/// Hash of the parts set header,
Expand Down
10 changes: 2 additions & 8 deletions tendermint/src/block/size.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,16 +9,10 @@ use {
#[derive(Serialize, Deserialize, Clone, Debug, Eq, PartialEq)]
pub struct Size {
/// Maximum number of bytes in a block
#[serde(
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(with = "serializers::primitives::string")]
pub max_bytes: u64,

/// Maximum amount of gas which can be spent on a block
#[serde(
serialize_with = "serializers::serialize_i64",
deserialize_with = "serializers::parse_i64"
)]
#[serde(with = "serializers::primitives::string")]
pub max_gas: i64,
}
24 changes: 4 additions & 20 deletions tendermint/src/channel.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,35 +15,19 @@ pub struct Channel {
pub id: Id,

/// Capacity of the send queue
#[serde(
rename = "SendQueueCapacity",
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(rename = "SendQueueCapacity", with = "serializers::primitives::string")]
pub send_queue_capacity: u64,

/// Size of the send queue
#[serde(
rename = "SendQueueSize",
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(rename = "SendQueueSize", with = "serializers::primitives::string")]
pub send_queue_size: u64,

/// Priority value
#[serde(
rename = "Priority",
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(rename = "Priority", with = "serializers::primitives::string")]
pub priority: u64,

/// Amount of data recently sent
#[serde(
rename = "RecentlySent",
serialize_with = "serializers::serialize_u64",
deserialize_with = "serializers::parse_u64"
)]
#[serde(rename = "RecentlySent", with = "serializers::primitives::string")]
pub recently_sent: u64,
}

Expand Down
Loading