@LDiazN LDiazN commented Oct 7, 2025

This PR adds support for the anonymous credentials protocol to the OONI API:

  1. Adds ooniauth-py as a dependency
  2. Adds new endpoints: /sign_credential, /manifest, /submit

Note that the new measurement /submit endpoint won't replace the old one; for now it's meant to be used mostly during development.

It also updates the fastpath so that it can work with the new fields.

Migrations

This PR will require some database migrations

Postgres

Adds the following tables to Postgres:

  • ooniprobe_manifest: Describes the manifest that is reported to users when they register
  • ooniprobe_server_state: Defines the key pair (secret_key, public_parameters) that is used for authentication

You can find the new models here:

class OONIProbeServerState(Base):

And the migrations:
https://github.com/ooni/backend/blob/userauth-dep/ooniapi/common/src/common/alembic/versions/7e28b5d17a7f_add_server_state_table_for_anonymous_.py

ClickHouse

In ClickHouse we need to add the fields necessary for:

  • Checking if a measurement is verified
  • Running the verification offline on the fastpath machine at some point in the future, once we implement offline verification

As an example of the changes, you can look at the clickhouse_init.sql script in fastpath:

`is_verified` Int8,
`nym` Nullable(String),
`zkp_request` Nullable(String),
`age_range` Nullable(String),
`msm_range` Nullable(String),

These are the ALTER TABLE statements required to run the migration in production:

ALTER TABLE fastpath
ADD COLUMN is_verified Int8,
ADD COLUMN nym Nullable(String),
ADD COLUMN zkp_request Nullable(String),
ADD COLUMN age_range Nullable(String),
ADD COLUMN msm_range Nullable(String);

Feedback

This is still early work to define the API for the anonymous credentials protocol. Some things that could benefit from a bit of feedback are:

  • Naming of the endpoints
    • /sign_credential used to be named /register but it clashes with the older /register one, so I chose a different name
  • Parameters and usage
  • Lifecycle of manifest and server state: Right now we always create a new manifest and server state if none is available, to keep the API working, but we might want to be more deliberate about how we manage them

closes #1014 #1015

@LDiazN LDiazN requested a review from hellais October 7, 2025 14:11
@LDiazN LDiazN self-assigned this Oct 7, 2025
codecov bot commented Oct 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.92%. Comparing base (d943f03) to head (5deb275).
⚠️ Report is 11 commits behind head on master.

❌ Your project check has failed because the head coverage (90.92%) is below the target coverage (95.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1016      +/-   ##
==========================================
- Coverage   92.81%   90.92%   -1.90%     
==========================================
  Files          17       28      +11     
  Lines        1281     2932    +1651     
  Branches       65      247     +182     
==========================================
+ Hits         1189     2666    +1477     
- Misses         78      206     +128     
- Partials       14       60      +46     
Flag Coverage Δ
oonimeasurements 87.60% <ø> (?)
ooniprobe ?
oonirun 99.39% <ø> (?)


Member

@hellais hellais left a comment

I left some comments on small improvements and some questions.

We can discuss anything that is unclear on Slack.

measurement = ujson.loads(msm_jstr)
if sorted(measurement.keys()) == ["content", "format"]:

is_verified = g(measurement, 'is_verified', False)
Member

It would be great if at some point we renamed this function g to something a bit more understandable. It's basically getting the key and, if it's missing, falling back to False. Correct?

I believe modern versions of Python have native support for this.

I guess we can take note of it as a future refactoring work.

Contributor Author

I think we can just do that: the fastpath now runs on Python 3.12 in AWS, so this can be replaced with measurement.get('is_verified', False).

We can open another issue for that.
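
For reference, a minimal sketch of the replacement discussed above. The helper g below is a hypothetical stand-in, since its real definition is not shown in this thread; the point is only that dict.get covers the "return the value or a default" case, with one caveat worth checking before swapping it in.

```python
from typing import Any, Dict


def g(d: Dict[str, Any], key: str, default: Any = None) -> Any:
    # Hypothetical stand-in for the fastpath helper; the real one may differ.
    return d[key] if key in d else default


measurement = {"probe_cc": "IT"}       # no "is_verified" key
with_flag = {"is_verified": True}      # key present
explicit_none = {"is_verified": None}  # key present, value is None

# dict.get returns the default only when the key is absent.
assert measurement.get("is_verified", False) is False
assert with_flag.get("is_verified", False) is True

# Caveat: a key that is present but set to None is returned as None,
# not replaced by the default.
assert explicit_none.get("is_verified", False) is None
```

If the real g treats None values as missing, the two are not drop-in equivalents and the call sites would need a closer look.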

def tf(v: bool) -> str:
    return "t" if v else "f"

def s_or_n(x : Optional[Any]) -> Optional[str]:
Member

Can we give this function a more explicit name instead of these shorthands?

Even better would be if we just used an explicit lambda function inside of the age_range field, e.g.

age_range=lambda x: ujson.dumps(x) if x is not None else None

Contributor Author

I don't follow the lambda part. Where would you call the lambda? Is there a place where you can pass it as an argument to automatically parse the value?

In the meantime, I renamed s_or_n to serialize_optional to give it a more descriptive name.
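
A minimal sketch of what serialize_optional might look like, assuming it behaves like the lambda suggested in the review (serialize to JSON unless the value is None). The body is an assumption, not the code from the PR, and the standard json module is used here instead of ujson to keep the example self-contained.

```python
import json
from typing import Any, Optional


def serialize_optional(x: Optional[Any]) -> Optional[str]:
    """Return x as a JSON string, or None when x is None."""
    # Assumed behavior, mirroring the lambda in the review comment.
    return json.dumps(x) if x is not None else None


# A missing value stays None (maps to a NULL in a Nullable(String) column).
assert serialize_optional(None) is None
# Any other value is serialized to a JSON string.
assert serialize_optional([18, 25]) == "[18, 25]"
```

A named function like this can be reused for each nullable field, which a lambda assigned directly to a single field would not allow.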

+ compressed_len = len(data)
  data = zstd.decompress(data)
- ratio = len(data) / len(data)
+ ratio = compressed_len / len(data)
Member

Can we just leave this inlined? I think there is no need to have an extra line and variable.

Contributor Author

Done!
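
For context on this hunk: dividing len(data) by itself after data has been reassigned always yields 1.0, so the compressed length has to be read before decompression, whichever way the expression ends up being written. A small sketch of that ordering follows. It uses zlib from the standard library as a stand-in for zstd, and decompress_with_ratio is a hypothetical name, not a function from the PR.

```python
import zlib
from typing import Tuple


def decompress_with_ratio(blob: bytes) -> Tuple[bytes, float]:
    # Capture the compressed size BEFORE the buffer is replaced.
    compressed_len = len(blob)
    data = zlib.decompress(blob)
    # Note: an empty decompressed payload would raise ZeroDivisionError here.
    ratio = compressed_len / len(data)
    return data, ratio


payload = b'{"test_name": "web_connectivity"}' * 100
data, ratio = decompress_with_ratio(zlib.compress(payload))

assert data == payload
# Highly repetitive input compresses well, so the ratio is below 1.
assert 0 < ratio < 1
```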

  log.debug(f"Zstd compression ratio {ratio}")
- except Exception as e:
+ except Exception:
  log.info("Failed zstd decompression")
Member

Can we log e inside of log.info? Make sure you use {e} inside of log.info(f""), which should perform sanitisation.

Contributor Author

@LDiazN LDiazN Nov 25, 2025

Done. I also did the same for the other exceptions.
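
A minimal sketch of the requested pattern: keep the exception bound to e and include it in the log message. It uses zlib from the standard library in place of zstd, and try_decompress and the logger name are hypothetical, chosen only for illustration.

```python
import logging
import zlib
from typing import Optional

log = logging.getLogger("fastpath_sketch")  # hypothetical logger name


def try_decompress(blob: bytes) -> Optional[bytes]:
    try:
        return zlib.decompress(blob)
    except Exception as e:
        # Include the exception text so the failure reason is visible in logs.
        log.info(f"Failed decompression: {e}")
        return None


# Invalid input is logged and reported as None instead of raising.
assert try_decompress(b"not compressed") is None
# Valid input round-trips.
assert try_decompress(zlib.compress(b"hello")) == b"hello"
```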

# time window
now = datetime.now(timezone.utc)
h = sha512(data_bin).hexdigest()[:16]
ts = now.strftime("%Y%m%d%H%M%S.%f")
Member

Can we use the ISO 8601 format? Make sure it includes the T separator and a Z suffix to indicate the timezone.

Contributor Author

@LDiazN LDiazN Nov 25, 2025

I'm not sure if that's a good idea because ts is used to compute the measurement id:


So we would be changing the measurement id format. I'm not too confident that this change won't break anything else. What do you think?
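
To make the trade-off concrete, here is a small comparison of the two timestamp formats for the same instant. The fixed datetime is an arbitrary example value, and how ts is combined into the measurement id is not shown here, since that code is not part of this thread.

```python
from datetime import datetime, timezone

# Arbitrary fixed instant so the output is reproducible.
now = datetime(2025, 11, 25, 12, 30, 45, 123456, tzinfo=timezone.utc)

# Format currently used for ts in the diff above.
current_ts = now.strftime("%Y%m%d%H%M%S.%f")

# ISO 8601 with the T separator and a Z suffix for UTC.
iso_ts = now.isoformat().replace("+00:00", "Z")

assert current_ts == "20251125123045.123456"
assert iso_ts == "2025-11-25T12:30:45.123456Z"
```

The ISO form adds "-", ":" , "T" and "Z" characters, so any consumer that parses or pattern-matches the current ts layout would need to be checked before switching.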

f"[Try {t+1}/{N_RETRIES}] Error trying to send measurement to the fastpath ({settings.fastpath_url}). Error: {exc}"
)
sleep_time = random.uniform(0, min(3, 0.3 * 2 ** t))
await asyncio.sleep(sleep_time)
Member

Is this necessary? cc @mmaker
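
For readers following the question: the sleep_time line draws a random delay between 0 and an exponentially growing bound that is capped at 3 seconds, a pattern usually called exponential backoff with full jitter. The sketch below isolates that computation so the bounds per attempt are easy to see. The function names are hypothetical; the constants 0.3 and 3 are taken from the diff.

```python
import random


def backoff_cap(t: int, base: float = 0.3, cap: float = 3.0) -> float:
    # Upper bound for attempt t (0-based): doubles each time, capped at `cap`.
    return min(cap, base * 2 ** t)


def backoff_sleep_time(t: int) -> float:
    # Full jitter: pick uniformly between 0 and the bound, so concurrent
    # clients that fail together do not all retry at the same moment.
    return random.uniform(0, backoff_cap(t))


# Upper bounds for attempts 0..4: roughly 0.3, 0.6, 1.2, 2.4, then capped at 3.0.
caps = [backoff_cap(t) for t in range(5)]
assert caps[-1] == 3.0
assert all(0 <= backoff_sleep_time(t) <= backoff_cap(t) for t in range(5))
```

Whether the delay is needed depends on how the fastpath fails in practice; without it, retries after an error would be sent back to back.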

assert resp.status_code == status.HTTP_200_OK, f"Unexpected status code: {resp.status_code} - {url}. {resp.content}"
return resp.json()

def post(client, url, data, headers=None):
Member

Should we move these into some kind of utils.py module?

Contributor Author

@LDiazN LDiazN Nov 26, 2025

Done. I also refactored the other tests to use the utils.py module.

content_encoding: str = Header(default=None),
) -> SubmitMeasurementResponse | Dict[str, Any]:
"""
Submit measurement
Member

Can you write a better docstring?

Also, it would be good to know where we plug in the values for "valid" probe_age_ranges and probe_msm_range (can you rename these variables to that?).

Contributor Author

I renamed the variables and also added extra documentation to explain how these variables are used, but I'm not sure if this is what was expected here

Development

Successfully merging this pull request may close these issues.

Add Python bindings as dependency for ooniprobe

3 participants