Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VerifyingKey Serialisation #661

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

nalinbhardwaj
Copy link
Contributor

@nalinbhardwaj nalinbhardwaj commented Sep 16, 2022

Closes #449 by storing selectors (the input to compress_selectors) as part of the serialised vkey. On reads, reruns the selector compression on the cs object. This also partly completes #643 (by creating the "post' version as the default, afaict no place needs serialisation for the pre version in the current codebase).

Tested using the circuits from https://github.com/nalinbhardwaj/zordle and https://github.com/icemelon/halo2-examples/tree/master/src/fibonacci. This is also extremely useful for faster wasm verifiers!

@nalinbhardwaj nalinbhardwaj force-pushed the vkey-serial branch 2 times, most recently from 550b680 to 7f696f7 Compare October 9, 2022 22:22
jonathanpwang pushed a commit to axiom-crypto/halo2 that referenced this pull request Oct 12, 2022
zcash#661 that allows for vkey
serialization/deserialization and fixes the previous selector
optimization issue
CPerezz pushed a commit to privacy-scaling-explorations/halo2 that referenced this pull request Dec 2, 2022
* VerifyingKey Serialization: merge Nalin's PR
zcash#661 that allows for vkey
serialization/deserialization and fixes the previous selector
optimization issue

* feat: add serialization & deserialization of pkey and vkey using serde
derive

* fix: add `num_fixed_commitments` to vkey serialization so correct number
of fixed columns, post selector compression, are read

* fix: serialize/deserialize pkey directly to/from file

* Update: implement suggestions for PR

* add `plonk::Error::Serde` to pass through serialization errors

* update: remove serialization from evaluation as we now recalculate directly

* fix: clippy warnings

* feat: implement `ProvingKey` serialization without using external crates
`serde` or `bincode`

* examples: add `serialization` example to test `ProvingKey` read and
write on "simple-example"

* feat: added `to/from_bytes` for `ProvingKey` and `VerifyingKey`

* add `pack`/`unpack` helper functions between `&[bool]` and `u8`
* made serialization example use smaller example of standard plonk for
  less code bloat

Co-authored-by: Nalin Bhardwaj <nalinbhardwaj@nibnalin.me>
jonathanpwang added a commit to axiom-crypto/halo2 that referenced this pull request Feb 10, 2023
* VerifyingKey Serialization: merge Nalin's PR
zcash#661 that allows for vkey
serialization/deserialization and fixes the previous selector
optimization issue

* fix: add `num_fixed_commitments` to vkey serialization so correct number
of fixed columns, post selector compression, are read

* fix: serialize/deserialize pkey directly to/from file

* bn256: make `Fq2`, `Fq6`, `Fq12` public

* fq12: fix comments to clarify computation of FQ12 coefficients and use
of Montgomery form

* update: remove serialization from evaluation as we now recalculate directly

* feat: add trait `Hash` to `ff::Field`

* feat: `region.assign_advice` now returns reference `&Assigned<F>` via
`AssignedCell`

* big change: in `prover::WitnessCollection` now use `Arc` to manage smart
pointer to assigned advice value.

* otherwise cannot pass a reference out of `assign_advice` because the
  lifetime of immutable borrow ends up forcing mutable borrow of
constraint system to live too long

* chore: restructure workspace to match halo2-ce monorepo

* Squashed 'primitives/poseidon/' content from commit 5d29df01

git-subtree-dir: primitives/poseidon
git-subtree-split: 5d29df01a95e3df6334080d28e983407f56b5da3

* primitives/poseidon: initial add from PSE repo

* feat: add conversion functions to `ff:PrimeField` to go between `u64`
limbs or `BigInt`

* feat: add `is_zero_vartime` derived implementation for `Fr`

* change `assign_advice` API to take `Value<Assigned<F>>` instead of
  `Value<F>`

* Fix `secp256k1` compressed serialization

* update: remove `region` from `Cell` since we only use one region

* update: remove `?` from `WitnessCollection::assign_advice` due to
performance issues
* See comments for (more) unsafe code version that gets another 3-4%
  performance boost.

* feat: add back `copy_advice` function for `AssignedCell<&Assigned<F>,
F>`

* feat: implement functions `row_offset` and `column` directly for
`AssignedCell` (previously for `Cell`)

* Add serde test for curves

* feat: add `CurveAffineExt::into_coordinates` for raw unchecked affine coordinates of curve point

* update: modify `assign_fixed` slightly for performance

* feat: implement `ProvingKey` serialization without using external crates
`serde` or `bincode`

* examples: add `serialization` example to test `ProvingKey` read and
write on "simple-example"

* curves: switch to using rust nightly "bigint_helper_methods" for finite
field implementations

* further optimize finite field implementations by following gnark

* also improve bigint conversion functions to limbs

* feat: implement `From<$field>` for `[u64; 4]` so field elements can be
converted to their little endian representation with `u64` limbs

* feat: add `From<[u64;4]>` and `To<[u64;4]>` to field implementations

* feat: add default implementation of `CurveAffineExt` for `CurveAffine`

* feat: added `to/from_bytes` for `ProvingKey` and `VerifyingKey`

* add `pack`/`unpack` helper functions between `&[bool]` and `u8`
* made serialization example use smaller example of standard plonk for
  less code bloat

* fix: get multi-phase constraint system to work with our version of
`assign_advice`

* exposed inner `Phase(u8)` to crate

* feat: change `region.assign_advice` to allow any `Into<Assigned<F>>` but
still always return `&Assigned<F>`

* feat: change `layouter.assign_region` to take `FnOnce` closure instead
of `FnMut` now that there is no "get shape" mode

* feat: derive `Serialize`, `Deserialize` for common fields and curves

* chore: derive `PartialEq, Hash` for `bn256::{Fq,Fr}` and
`secp256k1::{Fp,Fq}`
* I see no longer to use `ct_eq` direct implementation of `PartialEq`
* deriving `Hash` for all fields because it may be useful and doesn't
  hurt

* feat: add trait `SerdeObject` for serialization of objects into raw
bytes

* implement `SerdeObject` for all macro-derived prime fields (raw bytes
  means staying in Montgomery form)
* implement `SerdeObject` for `Fq2` directly
* implement `SerdeObject` via macro for curves `bn254` and `secp256k1`
* add tests `test_serialization` for fields and curves

* chore: remote direct `PartialEq` implementation in bn254/assembly

* feat: read `VerifyingKey` and `ProvingKey` does not require `params` as
long as we serialize `params.k()`

* fix: add impl `SerdeObject` to assembly macro

* feat: derive `Hash` for `Fq2, G1Affine, G2Affine` for future use

* fix: change `assert` to `debug_assert` in `read_raw_unchecked` functions

* feat: add `next_phase` and `get_challenge` functions to `Region` so that
a circuit can move onto the next phase during a single call of
`synthesize`
* currently only supports `create_proof` on 1 circuit at a time;
  otherwise it is not compatible with the original API which does
synthesize for all circuits in a single phase before moving onto the
next

* update: change `is_less_than` to use `borrowing_sub` because it is
faster
see
privacy-scaling-explorations/halo2curves#10 (comment)
for context

* optimize: revert to old way of subtracting field elements without
branching

* see jonathanpwang/halo2curves@ea9af0d

* feat: parallelize (cpu) shplonk prover

* shplonk: improve `construct_intermediate_sets` using `BTreeSet` and
`BTreeMap` more aggressively

* shplonk: change `construct_intermediate_sets` to use `FxHashSet` instead
of `BTreeSet`

* shplonk: add `Send` and `Sync` to `Query` trait for more parallelization

* chore: in derive `field`, default `mul` is `const` again

* - Implements `PartialOrd` for `Value<F>`
- Adds a `transpose` method to turn `Value<Result<_>>` into
  `Result<Value<_>>`
- `Expression::identifier()` remove string memory reallocation

* chore: switch to halo2curves 0.3.1 tag

* Fix MockProver `assert_verify` panic errors (privacy-scaling-explorations#118)

* fix: Support dynamic lookups in `MockProver::assert_verify`

Since lookups can only be `Fixed` in Halo2-upstream, we need to add
custom suport for the error rendering of dynamic lookups which doesn't
come by default when we rebase to upstream.

This means that now we have to print not only `AdviceQuery` results to
render the `Expression` that is being looked up. But also support
`Instance`, `Advice`, `Challenge` or any other expression types that are
avaliable.

This addresses the rendering issue, renaming also the `table_columns`
variable for `lookup_columns` as the columns do not have the type
`TableColumn` by default as opposite to what happens upstream.

* fix: Don't error and emit empty String for Empty queries

* feat: Add `assert_sarisfied_par` fn to `MockProver`

* fix: Address clippy errors

Resolves: privacy-scaling-explorations#116

* Remove partial ordering for value

* Remove transpose

* Parallelize SHPLONK multi-open prover (privacy-scaling-explorations#114)

* feat: parallelize (cpu) shplonk prover

* shplonk: improve `construct_intermediate_sets` using `BTreeSet` and
`BTreeMap` more aggressively

* shplonk: add `Send` and `Sync` to `Query` trait for more parallelization

* fix: ensure the order of the collection of rotation sets is independent
of the values of the opening points

Co-authored-by: Jonathan Wang <jonathanpwang@users.noreply.github.com>

* fix: FailureLocation::find empty-region handling (privacy-scaling-explorations#121)

After working on fixing
privacy-scaling-explorations/zkevm-circuits#1024, a bug was found in the
verification fn of the MockProver which implies that while finding a
FailureLocation, if a Region doesn't contain any rows.

This is fixed by introducing a 2-line solution suggested by @lispc.

Resolves: privacy-scaling-explorations#117

* feat: add enum `SerdeFormat` for user to select
serialization/deserialization format of curve and field elements

* fix: revert use of raw pointer in `MockProver` and switch to using `Arc`

* feat: add docs

---------

Co-authored-by: kilic <kiliconu@itu.edu.tr>
Co-authored-by: NoCtrlZ <phantomofrotten@gmail.com>
Co-authored-by: Brechtpd <Brechtp.Devos@gmail.com>
Co-authored-by: David Nevado <david@davidnevado.xyz>
Co-authored-by: han0110 <tinghan0110@gmail.com>
Co-authored-by: Nalin Bhardwaj <nalinbhardwaj@nibnalin.me>
Co-authored-by: Jonathan Wang <jonathanpwang@users.noreply.github.com>
Co-authored-by: adria0 <nowhere@>
Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com>
Co-authored-by: adria0.eth <5526331+adria0@users.noreply.github.com>
@parazyd
Copy link
Contributor

parazyd commented Mar 3, 2023

Are there any plans to merge this?

parazyd added a commit to parazyd/halo2 that referenced this pull request Mar 4, 2023
@therealyingtong therealyingtong added the F-nightly Issue or PR about a nightly feature label Mar 7, 2023
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
}

impl<C: CurveAffine> VerifyingKey<C> {
/// Writes a verifying key to a buffer.
pub fn write<W: io::Write>(&self, writer: &mut W) -> io::Result<()> {
Copy link
Contributor

@daira daira Apr 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write an 0x01 version byte first. On read, error if this byte is not 0x01. [Done.]

(Why 0x01 rather than 0x00? Because forks that have already merged some similar patch, such as privacy-scaling-explorations#103 , will have written the number of fixed commitments as big-endian, and so they will have written the first byte as 0x00 for any number of fixed commitments < $2^{24}$, i.e. all practical circuits. This allows descendants of those forks to be able to distinguish the two formats, and we will correctly reject the incompatible format early.)

@@ -47,14 +48,82 @@ pub struct VerifyingKey<C: CurveAffine> {
cs_degree: usize,
/// The representative of this `VerifyingKey` in transcripts.
transcript_repr: C::Scalar,
selectors: Vec<Vec<bool>>,
Copy link
Contributor

@daira daira Apr 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial intuition was that this should be the compressed selectors, because including the uncompressed selectors makes the representation dependent on the selector compression algorithm when it needn't be.

On the other hand, because the compressed selector columns use full field elements, their representation ends up being much larger, despite the "compression". So the approach used by this PR is also a reasonable option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that I said the same thing in #449 (comment) . I'm nothing if not consistent, even though I'd completely forgotten making that comment!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. In retrospect serialization should definitely use the compressed selectors. Ideally VerifyingKey::read should not need to know the type of ConcreteCircuit or even be supplied with params (the latter is only used to generate domain, cs and get k, n I believe).

Right now I have to create a ConcreteCircuit ahead of time and then trick Rust to do some type inference to call read.

halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@daira daira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least the change to little-endian for the length of fixed commitments is blocking. [Done.]

nalinbhardwaj and others added 2 commits May 1, 2023 10:48
Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
of fixed columns, post selector compression, are read

Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
daira
daira previously approved these changes May 1, 2023
Copy link
Contributor

@daira daira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed; rebased to fix conflicts and add my changes in the last commit. This will need tests.

* write an initial 0x01 version byte, and check it on read;
* use little-endian byte order for lengths;
* add length fields before the selectors and the permutation commitments,
  and check their consistency with the expected lengths on read;
* do not write an extra byte for a selector bit vector if the number of
  bits is a multiple of 8;
* ensure that original I/O errors are preserved.

Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
* add `pack`/`unpack` helper functions between `&[bool]` and `u8`.

Extracted from privacy-scaling-explorations@2c7f288

Co-authored-by: Carlos Pérez <37264926+CPerezz@users.noreply.github.com>
Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
Copy link

@jonathanpwang jonathanpwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I leave it up to you whether you want to make the change to compressed selectors before merging.

@@ -47,14 +48,82 @@ pub struct VerifyingKey<C: CurveAffine> {
cs_degree: usize,
/// The representative of this `VerifyingKey` in transcripts.
transcript_repr: C::Scalar,
selectors: Vec<Vec<bool>>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. In retrospect serialization should definitely use the compressed selectors. Ideally VerifyingKey::read should not need to know the type of ConcreteCircuit or even be supplied with params (the latter is only used to generate domain, cs and get k, n I believe).

Right now I have to create a ConcreteCircuit ahead of time and then trick Rust to do some type inference to call read.

@str4d
Copy link
Contributor

str4d commented May 2, 2023

We discussed this PR in Halo 2 Office Hours today, and decided that we will not merge it as-is. The core issue is that the serialization format being defined here is tightly coupled to the current implementation, and will be very difficult to retain support for after the planned optimisation / FloorPlanner refactorings occur.

For example, during keygen we extend the set of fixed column commitments to include commitments to the post-selector-combining selector columns:

let (cs, selector_polys) = cs.compress_selectors(assembly.selectors);
fixed.extend(
selector_polys
.into_iter()
.map(|poly| domain.lagrange_from_vec(poly)),
);

Then in VerifyingKey::read we read the post-selector-combining fixed commitments, read the pre-selector-combining selector definitions, re-compress the selector columns, and throw away the resulting selector polynomials (without checking that the effect of the compression on the constraint system's gates was correct or consistent with the parsed fixed commitments):

let fixed_commitments: Vec<_> = (0..num_fixed_columns)
.map(|_| C::read(reader))
.collect::<io::Result<_>>()?;

let (cs, _) = cs.compress_selectors(selectors.clone());

So as-is, the implicit serialization format is cutting half-way through the selector-combining optimisation, instead of picking a side. This specific issue could for instance be averted by not serializing the fixed commitments for the selectors, and always re-deriving them afterwards. Alternatively we could re-derive them and check they match, but then we're doing the same computational work during parsing with no benefit to encoding size. Solving this specific issue is a blocker on this PR (and it would be good to check for any other similar specific issues).

But if we did make that fix, we are then in a situation where the implicit serialization format is still cutting part-way through the logic that we want the FloorPlanner to be responsible for. Maintaining compatibility with this format would place strong constraints on how we designed the new FloorPlanner APIs, and probably require special-casing the selector-combining optimisation near-indefinitely (rather than it becoming an implementation detail of specific floor planners).

The most productive route forward would I think be to work on creating a post-optimisation VerifyingKey struct (the second half of #643), and then defining an encoding format for it. This can be done without blocking on the FloorPlanner refactoring, as it's about defining the type that would exist on the output side of the FloorPlanner rather than anything within its boundaries. It would require defining canonical encodings for polynomials as well as the polynomial ASTs we use to represent gates; we probably want to look at https://crates.io/crates/postcard for actually producing the encoding format.

@jonathanpwang
Copy link

I'm curious - how do you feel about just using #[derive(Serialize, Deserialize)]? That would avoid all of this. I have yet to benchmark but I would guess the bincode serialization/deserialization will be faster than what's currently implemented. This would involve the PrimeField to also implement Serialize, Deserialize.

@str4d
Copy link
Contributor

str4d commented May 3, 2023

I'm curious - how do you feel about just using #[derive(Serialize, Deserialize)]? That would avoid all of this.

That would not solve the above problem, and in fact would make the implicit encoding even more tightly coupled to the current implementation.

Once the types issue is addressed however, and a VerifyingKey type exists that contains precisely what should be encoded in precisely the internal form required for circuit correctness, then that could be an approach taken (using postcard as I suggested above).

@str4d
Copy link
Contributor

str4d commented May 23, 2023

The most productive route forward would I think be to work on creating a post-optimisation VerifyingKey struct (the second half of #643), and then defining an encoding format for it.

I'm going to try working on this some time next week.

@str4d
Copy link
Contributor

str4d commented Jun 6, 2023

The most productive route forward would I think be to work on creating a post-optimisation VerifyingKey struct (the second half of #643), and then defining an encoding format for it.

I'm going to try working on this some time next week.

We did the first phase of this in Halo 2 Office Hours today. The resulting exploratory changes are in #781, along with the conclusions we drew from them about how the real refactor should work.

parazyd added a commit to parazyd/halo2 that referenced this pull request Jun 29, 2023
Signed-off-by: parazyd <parazyd@dyne.org>
@parazyd
Copy link
Contributor

parazyd commented Jul 15, 2023

@str4d For retaining compatibility and more comfortable API usage for users which do not use serde, why not define the encoding and decoding format explicitly in the halo2 crate?

It seems like it would give you finer-grained control over the format, and it is also low overhead for people that use different serialization methods (i.e. they only have to wrap the defined halo2 encoding format in whatever trait their serialization library needs).

@hanzhichao
Copy link

any update about the provingkey and verifyingkey serialization?

ThrasherLT added a commit to ThrasherLT/halo2 that referenced this pull request Aug 20, 2024
Here's the original PR by nalinbhardwaj:
zcash#661

It was decided not to merge this PR, but at the moment this is the only way to serialize Halo2 VerifyingKey, since PSE's fork of Halo2 only supports serializing keys derived for KZG based proofs which were incompatible with the Poseidon gadget.

This branch should not be merged and is simple used as a temporary solution until halo2 implements proper serialization.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F-nightly Issue or PR about a nightly feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define a stable serialization format for VerifyingKey
8 participants