[oprf][shuffle] OPRF Shuffle using a 2-round 4-message shuffle protocol #816

cryo28 · 2023-10-25T17:15:37Z

Implementation of the protocol (not sharded)
New query type
Record structs and impls for the new query type's rows. These are probably not needed or should be changed; I had to implement them without knowing what they should actually be, in order to add the new query type.

1. New query type
2. Implementation of the protocol (not sharded)

martinthomson · 2023-10-26T01:31:37Z

src/protocol/oprf/shuffle/mod.rs

@@ -0,0 +1,403 @@
+use std::ops::{Add, AddAssign};


I think that Daniel pointed this out, but this has nothing to do with the OPRF. It should replace our existing shuffle.

I agree that this should be the shuffle that we use in both approaches. Alex and I discussed this. Alex seemed to prefer that we still put it in an OPRF-related module even though it is a basic protocol (this is also the case for operations on Boolean shares). The main reason is that it gives us a clearer separation between old and new components in case we want to migrate away from the old IPA.

I actually want to add to this that sorting also requires an unshuffle operation that undoes a shuffle. In the new IPA approach, we don't need that. This makes everything simpler, since we do not need to store the permutation explicitly in memory and then apply it (we can just run a seeded shuffle of a vector). Nevertheless, if we want the new shuffle to be compatible with the old IPA, more specifically radix sort, we would need to implement a less efficient version of the shuffle.

I don't think shuffle implementations are at the state where we can switch between them. The new one requires special input format (moving from struct { secret_share1, secret_share2, ... } to secret_share { struct1, struct2 }) while the old one requires them to be reshareable only. It would be ideal if shuffle is generalized over some sort of a trait, but I don't think it is atm.

So we may want to keep OPRF-specific things inside a different module

This isn't OPRF-specific, it's just a different shuffle. It's OK if we have two shuffles for now, but this shouldn't have to be bundled in with the OPRF.
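The input-format difference raised in this thread (struct { secret_share1, secret_share2, ... } versus secret_share { struct1, struct2 }) can be sketched roughly as below. This is only an illustration: `Replicated`, `RowOfShares`, `Row`, the field names, and `into_share_of_rows` are hypothetical and are not taken from the PR.

```rust
/// Hypothetical replicated share: a helper holds two of the three additive parts.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Replicated<T> {
    left: T,
    right: T,
}

/// "Struct of shares": every field of a row is shared on its own.
#[derive(Clone, Copy, Debug, PartialEq)]
struct RowOfShares {
    key: Replicated<u64>,
    value: Replicated<u32>,
}

/// Plain row; in the other layout the whole row is the shared value.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Row {
    key: u64,
    value: u32,
}

/// Regroups a struct of shares into a "share of structs".
fn into_share_of_rows(row: RowOfShares) -> Replicated<Row> {
    Replicated {
        left: Row { key: row.key.left, value: row.value.left },
        right: Row { key: row.key.right, value: row.value.right },
    }
}
```

A shuffle generalized over a trait, as suggested above, would presumably hide this regrouping from callers; the sketch only shows why the two layouts are not interchangeable as-is.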

danielmasny

Awesome, thanks! Looks good to me (ignoring the query parts). I haven't checked any of the shuffling parts except that you are using sharedvalue. Is there a way to see only your changes with respect to my previous review?

danielmasny · 2023-10-26T20:54:06Z

src/protocol/sort/apply_sort/mod.rs

@@ -5,12 +5,9 @@ pub use shuffle::shuffle_shares;
 use crate::{
    error::Error,
    protocol::{
-        basics::Reshare,
+        basics::{apply_permutation::apply_inv, Reshare},
        context::Context,


I don't think we need to separate apply_permutation from sort. As Alex pointed out, we can shuffle the data directly using the random seed. If we wanted to have compatibility with the old IPA, including the sorting, we would need to use apply_permutation in the new shuffle so that we can undo the shuffle.

we could potentially leave it if we want to make the new shuffle compatible with radix sort at some point

I agree - apply_permutation is much less efficient than shuffling directly - #817
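To make the trade-off in this thread concrete, here is a minimal sketch of shuffling directly from a seed versus materializing the permutation so that it can be undone. It assumes a toy SplitMix64 generator in place of whatever seeded RNG the protocol really uses, and every name in it (`SplitMix64`, `seeded_shuffle`, `permutation_from_seed`, `unshuffle`) is hypothetical rather than the PR's code.

```rust
/// Toy deterministic generator (SplitMix64). A stand-in for a real seeded,
/// cryptographically secure RNG; not suitable for production use.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Index in `0..bound` (modulo bias is ignored for brevity).
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Fisher-Yates shuffle driven directly by the seed: no permutation is stored.
fn seeded_shuffle<T>(seed: u64, items: &mut [T]) {
    let mut rng = SplitMix64(seed);
    for i in (1..items.len()).rev() {
        let j = rng.below(i + 1);
        items.swap(i, j);
    }
}

/// Materializes the permutation that the same seed produces, so that the
/// shuffle can be undone later. This costs an extra O(n) of memory.
fn permutation_from_seed(seed: u64, len: usize) -> Vec<usize> {
    let mut perm: Vec<usize> = (0..len).collect();
    seeded_shuffle(seed, &mut perm);
    perm
}

/// Undoes `seeded_shuffle`: `shuffled[i]` came from original position `perm[i]`.
fn unshuffle<T: Clone>(perm: &[usize], shuffled: &[T]) -> Vec<T> {
    let mut original = shuffled.to_vec();
    for (i, &src) in perm.iter().enumerate() {
        original[src] = shuffled[i].clone();
    }
    original
}
```

Two parties holding the same seed get the same order from `seeded_shuffle` without exchanging or storing a permutation; only the unshuffle path, which the old radix sort would need, requires the explicit `perm` vector.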

danielmasny · 2023-10-26T20:57:21Z

src/protocol/oprf/shuffle/mod.rs

@@ -0,0 +1,403 @@
+use std::ops::{Add, AddAssign};


I actually want to add to this that sorting also requires an unshuffle operation that undoes a shuffle. In the new IPA approach, we don't need that. This makes everything simpler, since we do not need to store the permutation explicitly in memory and then apply it (we can just run a seeded shuffle of a vector). Nevertheless, if we want the new shuffle to be compatible with the old IPA, more specifically radix sort, we would need to implement a less efficient version of the shuffle.

danielmasny · 2023-10-26T21:05:11Z

src/query/runner/oprf_shuffle/share.rs

+    }
+}
+
+impl SharedValue for ShuffleShare {


src/protocol/oprf/shuffle/mod.rs

danielmasny · 2023-10-26T21:16:56Z

src/query/runner/oprf_shuffle/share.rs

+
+impl ShuffleShare {
+    #[must_use]
+    pub fn from_input_row(input_row: &ShuffleInputRow, shared_with: Direction) -> Self {


I am not sure about this. Could you explain to me why we need these functions?

+1, let's just inline them

akoshelev · 2023-10-29T06:22:51Z

src/one_off_fns.rs

+/// streams.
+///
+/// <https://github.com/rust-lang/rust/issues/102211#issuecomment-1367900125>
+pub fn assert_stream_send<'a, T>(


👍 . would be even better if we put it inside lib.rs

akoshelev · 2023-10-29T06:25:42Z

src/query/executor.rs

+            gateway,
+            input,
+            move |prss, gateway, config, input| {
+                let ctx = BaseContext::new(prss, gateway);


you should construct semi honest context here - base is for internal use

akoshelev · 2023-10-29T06:29:53Z

src/protocol/sort/apply_sort/mod.rs

@@ -5,12 +5,9 @@ pub use shuffle::shuffle_shares;
 use crate::{
    error::Error,
    protocol::{
-        basics::Reshare,
+        basics::{apply_permutation::apply_inv, Reshare},
        context::Context,


I agree - apply_permutation is much less efficient than shuffling directly - #817

akoshelev · 2023-10-29T06:31:11Z

src/query/runner/oprf_shuffle/query.rs

+        Self { _config: config }
+    }
+
+    #[tracing::instrument("ipa_query", skip_all, fields(sz=%query_size))]


Suggested change

-    #[tracing::instrument("ipa_query", skip_all, fields(sz=%query_size))]
+    #[tracing::instrument("shuffle_query", skip_all, fields(sz=%query_size))]

akoshelev · 2023-10-29T06:41:20Z

src/query/runner/oprf_shuffle/query.rs

+    pub async fn execute<'a, C: Context + Send>(
+        self,
+        ctx: C,
+        query_size: QuerySize,


you need to use this parameter to truncate the inputs -

ipa/src/query/runner/ipa.rs

Line 78 in 5041bd6

pub async fn execute<'a>(
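As a rough illustration of truncating the inputs with this parameter, the sketch below keeps at most `query_size` rows. It uses a synchronous iterator for brevity, whereas the real runner works on async streams; `QuerySize` here is a hypothetical stand-in for the crate's type, and `truncate_input` is an invented helper name.

```rust
/// Hypothetical stand-in for the crate's `QuerySize`; the real type may differ.
#[derive(Clone, Copy, Debug)]
struct QuerySize(u32);

impl From<QuerySize> for usize {
    fn from(size: QuerySize) -> usize {
        size.0 as usize
    }
}

/// Keeps at most `query_size` rows and drops whatever follows.
fn truncate_input<I>(input: I, query_size: QuerySize) -> Vec<I::Item>
where
    I: IntoIterator,
{
    input
        .into_iter()
        .take(usize::from(query_size))
        .collect()
}
```

If the input is shorter than `query_size`, `take` simply yields everything that is available.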

akoshelev · 2023-10-29T07:10:17Z

src/protocol/oprf/shuffle/mod.rs

+
+// ------------------ Pseudorandom permutations functions -------------------- //
+
+fn apply_permutation<R: Rng, S>(rng: &mut R, items: &mut [S]) {


I see no reason for having this function

akoshelev · 2023-10-29T07:12:19Z

src/protocol/oprf/shuffle/mod.rs

+    l.into_iter().zip(r).map(|(a, b)| a + b)
+}
+
+fn add_single_shares_in_place<S, R>(mut items: Vec<S>, r: R) -> Vec<S>


this implies () as output of this function
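One possible shape for a truly in-place version is sketched below: it borrows the buffer mutably and returns `()`. The `AddAssign` bound and the slice parameter are assumptions made for the illustration and are not necessarily the signature that was finally merged.

```rust
use std::ops::AddAssign;

/// Adds `r` element-wise into `items`, mutating the caller's buffer.
/// Nothing is returned because the result already lives in `items`.
fn add_single_shares_in_place<S, R>(items: &mut [S], r: R)
where
    S: AddAssign,
    R: IntoIterator<Item = S>,
{
    for (item, rhs) in items.iter_mut().zip(r) {
        *item += rhs;
    }
}
```

A call then reads `add_single_shares_in_place(&mut x, z)`, and `x` stays owned by the caller.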

akoshelev · 2023-10-29T07:13:15Z

src/protocol/oprf/shuffle/mod.rs

+where
+    C: Context,
+    I: IntoIterator<Item = AdditiveShare<S>>,
+    S: SharedValue + Add<Output = S> + Message,


Suggested change

-    S: SharedValue + Add<Output = S> + Message,
+    S: SharedValue + Add<Output = S>,

akoshelev · 2023-10-29T07:15:47Z

src/protocol/oprf/shuffle/mod.rs

+    ctx: &C,
+    step: &OPRFShuffleStep,
+    direction: Direction,
+    items: Vec<S>,


this contract causes extra clones of the size N on the caller's side. It is possible to avoid by taking an iterator of S: SharedValue.
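The difference between the two contracts can be sketched as follows. `MockChannel`, `send_vec`, and `send_iter` are hypothetical names used only for this illustration, and `u64` stands in for a generic `S: SharedValue`.

```rust
/// Hypothetical sink standing in for a channel to a peer helper.
#[derive(Default)]
struct MockChannel {
    sent: Vec<u64>,
}

/// Vec-based contract: a caller that still needs `items` afterwards
/// has to clone all N elements before calling.
fn send_vec(channel: &mut MockChannel, items: Vec<u64>) {
    channel.sent.extend(items);
}

/// Iterator-based contract: the caller can lend values lazily,
/// for example with `items.iter().copied()`, without an extra allocation.
fn send_iter<I>(channel: &mut MockChannel, items: I)
where
    I: IntoIterator<Item = u64>,
{
    channel.sent.extend(items);
}
```

With `send_iter`, the caller keeps ownership of its vector and no temporary copy of size N is created on its side.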

akoshelev · 2023-10-29T07:19:57Z

src/protocol/oprf/shuffle/mod.rs

+            Direction::Left,
+            c_hat_2.clone(),
+        ),
+        receive_from_peer(


you should receive this to y_3 allocation.
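Receiving into an existing allocation could look roughly like the sketch below. The incoming data is modelled as a plain iterator instead of a network stream, and `receive_into` is a hypothetical helper, not the PR's function.

```rust
/// Fills `buf` with `incoming`, reusing the memory that `buf` already owns.
fn receive_into<S, I>(buf: &mut Vec<S>, incoming: I)
where
    I: IntoIterator<Item = S>,
{
    // `clear` drops the old elements but keeps the capacity.
    buf.clear();
    // No reallocation happens as long as the incoming data fits the capacity.
    buf.extend(incoming);
}
```

Because `Vec::clear` retains capacity, a buffer such as y_3 that is no longer needed can absorb a message of the same length without a new allocation.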

1. Moved assert_stream_send into lib.rs
2. Used SemiHonest context in oprf_shuffle/query.rs
3. Tracing instrumentation "shuffle_query" in the oprf_shuffle query runner
4. Truncate input to batch_size
5. Got rid of the apply_permutation function; replaced it with inline calls to Vec::shuffle
6. add_single_shares_in_place does not take ownership of the first argument and does not return a new Vec
7. Removed unnecessary Message trait bound on run_h2
8. Manually preallocating memory for receives in order to enable reuse of preallocated memory
9. Reuse y_3 allocation to receive c_hat_1 in run_h3
10. Reuse x_3 to receive c_hat_2 in run_h1
11. Reuse y_1 to compute y_2
12. Inline from_input_row/to_input_row functions

1. Reduced the number of allocated "tables" in H1 to 1
2. Reduced the number of allocated "tables" in H2 to 3

This was possible by passing references to items to the send_to_peer function and adding an S: Copy trait bound to it.

1. Renamed the "receive_from_peer" function to "receive_from_peer_into" to denote that it receives data into a given buffer
2. Updated the signatures of "send_to_peer" and "receive_from_peer_into" to accept the data buffer and batch_size arguments first
3. Introduced a "repurpose_allocation" function to distinguish the places where I want to preserve the data for subsequent additions from those where I want to "clear" it

* Running them under randomized executor is not helpful as there is no concurrency/parallelism there. We should write a prop test instead
* Add on references can be generic for all GF

akoshelev

I think this is very close. I pushed some polishing and added a TODO - lmk if they don't make sense to you

akoshelev · 2023-11-01T18:03:19Z

src/protocol/oprf/shuffle/mod.rs

+    add_single_shares_in_place(&mut x_3, z_23);
+    x_3.shuffle(&mut rng_perm_r);
+
+    let mut c_hat_1 = repurpose_allocation(y_1);


This may be cheap but it has a non-trivial cost of zeroing out memory. I don't see a need to do that, you can just replace elements in that vector from the iterator.
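A minimal sketch of the alternative described here: overwrite the existing elements from an iterator instead of clearing or zeroing the vector first. `overwrite_from_iter` is a hypothetical helper name, not a function from the PR.

```rust
/// Overwrites `buf` in place with values from `values`, without clearing or
/// zeroing it first. Returns the number of elements written.
fn overwrite_from_iter<S, I>(buf: &mut [S], values: I) -> usize
where
    I: IntoIterator<Item = S>,
{
    let mut written = 0;
    for (slot, value) in buf.iter_mut().zip(values) {
        *slot = value;
        written += 1;
    }
    written
}
```

Each slot is written exactly once, so the only cost is the copy of the new values; any elements past the end of the iterator keep their previous contents, which the caller has to account for.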

akoshelev · 2023-11-01T18:07:08Z

src/protocol/oprf/shuffle/mod.rs

+) -> Result<(), Error>
+where
+    C: Context,
+    S: Copy + Message,


please use SharedValue, Message is not an appropriate abstraction for it

1. Used SharedValue instead of Message on trait bounds

Resolved conflicts in: src/helpers/transport/query/mod.rs, src/net/http_serde.rs, src/query/executor.rs, src/query/runner/mod.rs

akoshelev

lgtm, I think we can address other things in a separate PR

It will be covered by oprf_ipa query

[oprf][shuffle] OPRF Shuffle using a 2-round 4-message shuffle protocol

5041bd6

1. New query type
2. Implementation of the protocol (not sharded)

cryo28 mentioned this pull request Oct 25, 2023

[RFC][oprf][shuffle] OPRF Shuffle using a 2-round 4-message shuffle protocol #809

Closed

cryo28 force-pushed the oprf-shuffle branch from c9a41b8 to 5041bd6 Compare October 25, 2023 21:31

martinthomson reviewed Oct 26, 2023

danielmasny reviewed Oct 26, 2023

akoshelev reviewed Oct 29, 2023

Artem Ignatyev and others added 7 commits October 30, 2023 11:04

[oprf][shuffle] Reused a few more allocations

ba801b8

1. Reduced the number of allocated "tables" in H1 to 1
2. Reduced the number of allocated "tables" in H2 to 3

This was possible by passing references to items to the send_to_peer function and adding an S: Copy trait bound to it.

[oprf][shuffle] Removed a few unnecessary trait bounds

f24c7c9

Take ExactSizeIterator in shuffle

353f132

Clean up clutter from shuffle tests

aea0b77

* Running them under randomized executor is not helpful as there is no concurrency/parallelism there. We should write a prop test instead
* Add on references can be generic for all GF

Add a TODO to work with mutable iterators

579e290

akoshelev reviewed Nov 1, 2023

Artem Ignatyev added 2 commits November 1, 2023 15:56

[oprf][shuffle] Code review comments

1896b7c

1. Used SharedValue instead of Message on trait bounds

Merge remote-tracking branch 'origin/main' into oprf-shuffle

b28454b

Resolved conflicts in: src/helpers/transport/query/mod.rs, src/net/http_serde.rs, src/query/executor.rs, src/query/runner/mod.rs

akoshelev approved these changes Nov 2, 2023

akoshelev added 4 commits November 7, 2023 09:54

Merge from main

b81a057

Recombine changes with OPRF

32a6a7e

Remove OPRF shuffle

88fa5e2

It will be covered by oprf_ipa query

Merge from main

5bccfcb

akoshelev merged commit 650cb4b into private-attribution:main Nov 7, 2023
4 checks passed

andyleiserson mentioned this pull request Oct 21, 2024

Implement resharding and sharded shuffle based on it. #1014

Merged
