-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[oprf][shuffle] OPRF Shuffle using a 2-round 4-message shuffle protocol #816
Conversation
cryo28
commented
Oct 25, 2023
- Implementation of the protocol (not sharded)
- New query type
- Record structs and implt for the new query type row: probably not needed or should be changed. I had to implement them without knowing what they should actually be, in order to implement a new query type.
1. New query type 2. Implementation of the protocol (not sharded)
src/protocol/oprf/shuffle/mod.rs
Outdated
@@ -0,0 +1,403 @@ | |||
use std::ops::{Add, AddAssign}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Daniel pointed this out, but this has nothing to do with the OPRF. It should replace our existing shuffle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this should be the shuffle that we use in both approaches. Alex and I discussed this. Alex seemed to prefer that we still put it in an OPRF related module even when it is a basic protocol (this is also the case for operations on Boolean shares). The main reason is that we have a more clear separation between old and new components in case we want to migrate away from the old IPA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually want to add to this that sorting also requires an unshuffle operation that undoes a shuffle. In the new IPA approach, we dont need that. This makes everything simpler since we do not need to store the permutation explicitly in memory and then apply it (we can just run a seeded shuffling of a vector). Nevertheless if we want to have compatibility of the new shuffle with the old IPA, more specifically radix sort, we would need to implement a less efficient version of the shuffle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think shuffle implementations are at the state where we can switch between them. The new one requires special input format (moving from struct { secret_share1, secret_share2, ... } to secret_share { struct1, struct2 }) while the old one requires them to be reshareable only. It would be ideal if shuffle is generalized over some sort of a trait, but I don't think it is atm.
So we may want to keep OPRF-specific things inside a different module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't OPRF-specific, it's just a different shuffle. It's OK if we have two shuffles for now, but this shouldn't have to be bundled in with the OPRF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks! Looks good to me (ignoring the query parts). I haven't checked any of the shuffling parts except that you are using sharedvalue. Is there a way to see only your changes with respect to my previous review?
@@ -5,12 +5,9 @@ pub use shuffle::shuffle_shares; | |||
use crate::{ | |||
error::Error, | |||
protocol::{ | |||
basics::Reshare, | |||
basics::{apply_permutation::apply_inv, Reshare}, | |||
context::Context, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to separate apply_permutation from sort. As Alex pointed out, we can shuffle the data directly using the random seed. If we wanted to have compatibility with the old IPA including the sorting, we would need to use apply_permutation in the new shuffle such that we can undo the shuffle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could potentially leave it if we want to make the new shuffle compatible with radix sort at some point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - apply_permutation
is much less efficient than shuffling directly - #817
src/protocol/oprf/shuffle/mod.rs
Outdated
@@ -0,0 +1,403 @@ | |||
use std::ops::{Add, AddAssign}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually want to add to this that sorting also requires an unshuffle operation that undoes a shuffle. In the new IPA approach, we dont need that. This makes everything simpler since we do not need to store the permutation explicitly in memory and then apply it (we can just run a seeded shuffling of a vector). Nevertheless if we want to have compatibility of the new shuffle with the old IPA, more specifically radix sort, we would need to implement a less efficient version of the shuffle.
} | ||
} | ||
|
||
impl SharedValue for ShuffleShare { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great!
|
||
impl ShuffleShare { | ||
#[must_use] | ||
pub fn from_input_row(input_row: &ShuffleInputRow, shared_with: Direction) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this, could you explain me why we need these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, lets just inline them
src/one_off_fns.rs
Outdated
/// streams. | ||
/// | ||
/// <https://github.com/rust-lang/rust/issues/102211#issuecomment-1367900125> | ||
pub fn assert_stream_send<'a, T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 . would be even better if we put it inside lib.rs
src/query/executor.rs
Outdated
gateway, | ||
input, | ||
move |prss, gateway, config, input| { | ||
let ctx = BaseContext::new(prss, gateway); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should construct semi honest context here - base is for internal use
@@ -5,12 +5,9 @@ pub use shuffle::shuffle_shares; | |||
use crate::{ | |||
error::Error, | |||
protocol::{ | |||
basics::Reshare, | |||
basics::{apply_permutation::apply_inv, Reshare}, | |||
context::Context, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - apply_permutation
is much less efficient than shuffling directly - #817
Self { _config: config } | ||
} | ||
|
||
#[tracing::instrument("ipa_query", skip_all, fields(sz=%query_size))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[tracing::instrument("ipa_query", skip_all, fields(sz=%query_size))] | |
#[tracing::instrument("shuffle_query", skip_all, fields(sz=%query_size))] |
pub async fn execute<'a, C: Context + Send>( | ||
self, | ||
ctx: C, | ||
query_size: QuerySize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to use this parameter to truncate the inputs -
Line 78 in 5041bd6
pub async fn execute<'a>( |
src/protocol/oprf/shuffle/mod.rs
Outdated
|
||
// ------------------ Pseudorandom permutations functions -------------------- // | ||
|
||
fn apply_permutation<R: Rng, S>(rng: &mut R, items: &mut [S]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no reason for having this function
src/protocol/oprf/shuffle/mod.rs
Outdated
l.into_iter().zip(r).map(|(a, b)| a + b) | ||
} | ||
|
||
fn add_single_shares_in_place<S, R>(mut items: Vec<S>, r: R) -> Vec<S> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this implies ()
as output of this function
src/protocol/oprf/shuffle/mod.rs
Outdated
where | ||
C: Context, | ||
I: IntoIterator<Item = AdditiveShare<S>>, | ||
S: SharedValue + Add<Output = S> + Message, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S: SharedValue + Add<Output = S> + Message, | |
S: SharedValue + Add<Output = S>, |
src/protocol/oprf/shuffle/mod.rs
Outdated
ctx: &C, | ||
step: &OPRFShuffleStep, | ||
direction: Direction, | ||
items: Vec<S>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this contract causes extra clones of the size N
on the caller's side. It is possible to avoid by taking an iterator of S: SharedValue
.
src/protocol/oprf/shuffle/mod.rs
Outdated
Direction::Left, | ||
c_hat_2.clone(), | ||
), | ||
receive_from_peer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should receive this to y_3
allocation.
1. Moved assert_stream_send into lib.rs 2. Used SemiHonest context in oprf_shuffle/query.rs 3. tracing instrumentation "shuffle_query" in oprf_shuffle query runner 4. Truncate input to batch_size 5. Got rid of apply_permutation function. Replaced it with inline calls to Vec::shuffle 6. add_single_shares_in_place does not take ownership of the first argument and does not return a new Vec 7. Removed unnecessary Message trait bound on run_h2 8. Manually preallocating memory for receives in order to enable reusing of preallocated memory 9. reuse y_3 allocation to receive c_hat_1 in run_h3 10. reuse x_3 to receive c_hat_2 in run_h1 11. reuse y_1 too add compute y_2 12. Inline from_input_row/to_inpput_row functions
1. Reduced the number of allocated "tables" in H1 to 1 2. Reduced the number of allocated "tables in H2 to 3 This was possible by passing a references to items to send_to_peer function and adding S: Copy trait bound to it
1. Renamed "receive_from_peer" function to "receive_from_peer_into" to denote that it receive data into a given buffer 2. Updated signatures of "send_to_peer" and "receive_from_peer_into" functions to accept buffer of data and batch_size arguments first 3. Introduced "repurpose_allocation" function to distinguish the place where I want to preserve the data for subsequent additions from where I want to "clear" it.
* Running them under randomized executor is not helpful as there is no concurrency/parallelism there. We should write a prop test instead * Add on references can be generic for all GF
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is very close. I pushed some polishing and added a TODO - lmk if they don't make sense to you
src/protocol/oprf/shuffle/mod.rs
Outdated
add_single_shares_in_place(&mut x_3, z_23); | ||
x_3.shuffle(&mut rng_perm_r); | ||
|
||
let mut c_hat_1 = repurpose_allocation(y_1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be cheap but it has a non-trivial cost of zeroing out memory. I don't see a need to do that, you can just replace elements in that vector from the iterator.
src/protocol/oprf/shuffle/mod.rs
Outdated
) -> Result<(), Error> | ||
where | ||
C: Context, | ||
S: Copy + Message, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use SharedValue
, Message
is not an appropriate abstraction for it
1. Used SharedValue instead of Message on trait bounds
Rsolved conflicts in: src/helpers/transport/query/mod.rs src/net/http_serde.rs src/query/executor.rs src/query/runner/mod.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, I think we can address other things in a separate PR
It will be covered by oprf_ipa query