Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hybrid conversion #1382

Merged
merged 15 commits into from
Nov 4, 2024
Merged

Hybrid conversion #1382

merged 15 commits into from
Nov 4, 2024

Conversation

tyurek
Copy link
Collaborator

@tyurek tyurek commented Oct 28, 2024

This d̶i̶f̶f̶ PR finishes the implementation of Hybrid Conversion reports and wraps them up into general Hybrid reports.
The main major change compared to OprfReports is that I wrote the encryption/decryption functions to take a HybridInfo struct as an argument, since there's a lot more info used for conversion reports now.

I'll say this is ready for review now and add some more coverage tomorrow.

Open question: In query/runner/hybrid.rs, I need to pass around HybridInfo structs to do decryption. I’m not totally sure the best way to structure that, since HybridInfo is currently written as an enum, but you would need both the Impression and Conversion types if you wanted to be able to decrypt both types of reports. I could write it to require both, but I’m not sure what makes the most sense given the overall system design. Currently I just comment out the Impression cases to test basic correctness, but that's just temporary until we figure this out.

Comment on lines 575 to 617
pub fn mk_ciphertext(&self) -> &[u8] {
match self {
EncryptedHybridGeneralReport::Impression(impression_report) => {
impression_report.mk_ciphertext()
}
EncryptedHybridGeneralReport::Conversion(conversion_report) => {
conversion_report.mk_ciphertext()
}
}
}

pub fn encap_key_btt(&self) -> &[u8] {
match self {
EncryptedHybridGeneralReport::Impression(impression_report) => {
impression_report.encap_key_btt()
}
EncryptedHybridGeneralReport::Conversion(conversion_report) => {
conversion_report.encap_key_btt()
}
}
}

pub fn btt_ciphertext(&self) -> &[u8] {
match self {
EncryptedHybridGeneralReport::Impression(impression_report) => {
impression_report.btt_ciphertext()
}
EncryptedHybridGeneralReport::Conversion(conversion_report) => {
conversion_report.btt_ciphertext()
}
}
}

pub fn key_id(&self) -> u8 {
match self {
EncryptedHybridGeneralReport::Impression(impression_report) => {
impression_report.key_id()
}
EncryptedHybridGeneralReport::Conversion(conversion_report) => {
conversion_report.key_id()
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need access to these on the EncryptedHybridGeneralReport? I think we should only need decrypt and from_bytes (which call into the enum items).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that mk_ciphertext was needed for UniqueBytes. Not sure about the other ones though (or if we might need other ones later?)


/// ## Errors
/// If decryption of the provided oprf report fails.
pub fn decrypt_from_oprf_report_bytes<P, TS>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to keep this long term. it would be great if we could get everything to work without this, but if we want to split that into another PR, that's fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think I'll just remove this once all the oprf is completely disentangled

@eriktaubeneck
Copy link
Member

  1. This spec (especially for conversions) has a lot more information contained in Info structs than the old OPRF implementation. The report struct does not have all of this information so I changed the encrypt/decrypt interface for hybrid reports to take the appropriate Info struct as an argument. Should I do it this way or add the extra fields (conversion_site_domain, timestamp, epsilon, sensitivity) to the report struct?

Most of this should live in the associated data (Info), and we'll need to be able to access it on the EncryptedConversionReport in order to do some filtering and preprocessing. It should be fine to access it through the info field, thought we could also implement getters on the EncryptedConversionReport struct if we wanted.

All these fields do not need to persist into the HybridImpressionReport or HybridConversionReport after decryption.

  1. I left some of the old EncryptedHybridReport struct code and implemented my changes in a new struct EncryptedHybridGeneralReport. All the tests in hybrid.rs work with the new code, but there is other code here: https://github.com/tyurek/ipa/blob/hybrid_conversion/ipa-core/src/query/runner/hybrid.rs#L65 that isn't compatible: It assumes that EncryptedHybridReport can be initialized from the encrypted bytes of an OPRF report. I did write a function to create a (decrypted) HybridReport from encrypted OPRF report bytes, but EncryptedHybridReport doesn't store bytes: it's an enum. So I'm not really sure what do do with this part of the code, but I figured I would get some feedback on the rest before I tear this up too badly

Our overall goal is to remove the reliance on the OPRF report. The high level flow should be

BodyStream → EncryptedHybridReport → HybridReport → IndistinguishableHybridReport

and this diff is primarily concerned with the first two conversions. It's fine to rewrite the existing tests that us OPRF reports to use the new structs that you are creating.

but EncryptedHybridReport doesn't store bytes: it's an enum. So I'm not really sure what do do with this part of the code,

The two underlying structs in that enum do store bytes, so it should be possible to implement TryFrom<Bytes> on EncryptedHybridReport and build the correct sub-struct (either the EncryptedHybridConversionReport or the EncryptedHyrbidImpressionReport.)

Hope that all helps / makes sense.

Copy link

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 85.59464% with 86 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@6d29275). Learn more about missing BASE report.
Report is 29 commits behind head on main.

Files with missing lines Patch % Lines
ipa-core/src/report/hybrid.rs 80.14% 85 Missing ⚠️
ipa-core/src/report/hybrid_info.rs 98.76% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1382   +/-   ##
=======================================
  Coverage        ?   93.56%           
=======================================
  Files           ?      223           
  Lines           ?    38260           
  Branches        ?        0           
=======================================
  Hits            ?    35797           
  Misses          ?     2463           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

config: HybridQueryParams,
key_registry: Arc<R>,
hybrid_info: HybridInfo<'a>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably missed that on the previous review, but helper origin is assumed to be known statically to helpers. This should help to avoid the lifetime proliferation and the need to keep it inside query struct

If there is a need for having it, we can assume helper origin is &'static str

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design doc for this project has the helper origin as part of both HybridConversionInfo and HybridImpressionInfo, so I assume it's needed? I can try static though

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is definitely needed, but it is assumed to be per MPC ring, i.e. known in advance. We hardcoded it in our Rust implementation, but generally I think &static str would be more appropriate

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is conversion_site_domain also static? If so, we can remove the lifetime

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion site is dynamic and provided by the submitter when they want to execute this query

ipa-core/src/query/runner/hybrid.rs Outdated Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Outdated Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Outdated Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Outdated Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Outdated Show resolved Hide resolved
ipa-core/src/report/hybrid.rs Outdated Show resolved Hide resolved
Copy link
Member

@eriktaubeneck eriktaubeneck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is looking good. I left a couple comments about using TestHybridRecord instead of TestRawDataRecord.

Is the plan to detangle OprfReport out entirely in this PR, or do that in the next PR? I'm OK with either.

@tyurek
Copy link
Collaborator Author

tyurek commented Nov 1, 2024

I'm trying to detangle exactly as much as is needed to get tests to pass, with the goal of doing the rest in the next PR

config: HybridQueryParams,
key_registry: Arc<R>,
hybrid_info: HybridInfo<'a>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion site is dynamic and provided by the submitter when they want to execute this query

pub helper_origin: &'a str,
pub converion_site_domain: &'a str,
pub helper_origin: &'static str,
pub conversion_site_domain: &'a str,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be many conversion sites in publisher queries - we never properly implemented it and I believe just hardcoded both helper origin and conversion site. I think this is fine for now, just wanted to explain it

@tyurek tyurek merged commit dc8f4bb into private-attribution:main Nov 4, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants