Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(torsf): collect tor logs, select rendezvous method, count bytes #683

Merged
merged 11 commits into from
Feb 7, 2022

Conversation

bassosimone
Copy link
Contributor

@bassosimone bassosimone commented Feb 4, 2022

Checklist

Description

This diff contains significant improvements over the previous
implementation of the torsf experiment.

We add support for configuring different rendezvous methods after
the convo at ooni/probe#2004. In doing
that, I've tried to use a terminology that is consistent with the
names being actually used by tor developers.

In terms of what to do next, this diff basically instruments
torsf to always rendezvous using domain fronting. Yet, it's also
possible to change the rendezvous method from the command line,
when using miniooni, which allows to experiment a bit more. In the
same vein, by default we use a persistent tor datadir, but it's
also possible to use a temporary datadir using the cmdline.

Here's how a generic invocation of torsf looks like:

./miniooni -O DisablePersistentDatadir=true \
           -O RendezvousMethod=amp \
           -O DisableProgress=true \
           torsf

(The default is DisablePersistentDatadir=false and
RendezvousMethod=domain_fronting.)

With this implementation, we can start measuring whether snowflake
and tor together can boostrap, which seems the most important thing
to focus on at the beginning. Understanding why the bootstrap most
often does not converge with a temporary datadir on Android devices
remains instead an open problem for now. (I'll also update the
relevant issues or create new issues after commit this.)

We also address some methodology improvements that were proposed
in ooni/probe#1686. Namely:

  • we record the tor version;

  • we include the bootstrap percentage because of the logs;

  • we set the anomaly key correctly;

  • we measure the bytes send and received (by tor).

What remains to be done is the possibility of including Snowflake
events into the measurement, which is not possible until the new
improvements at common/event in snowflake.git are included into a
tagged version of snowflake itself. (I'll make sure to mention
this aspect to @cohosh in ooni/probe#2004.)

This diff contains significant improvements over the previous
implementation of the torsf experiment.

We add support for configuring different rendezvous methods after
the convo at ooni/probe#2004. In doing
that, I've tried to use a terminology that is consistent with the
names being actually used by tor developers.

In terms of what to do next, this diff basically instruments
torsf to always rendezvous using domain fronting. Yet, it's also
possible to change the rendezvous method from the command line,
when using miniooni, which allows to experiment a bit more. In the
same vein, by default we use a persistent tor datadir, but it's
also possible to use a temporary datadir using the cmdline.

Here's how a generic invocation of `torsf` looks like:

```bash
./miniooni -ODisablePersistentDatadir=true \
           -ORendezvousMethod=amp \
           -ODisableProgress=true torsf
```

(The default is `DisablePersistentDatadir=false` and
`RendezvousMethod=domain_fronting`.)

With this implementation, we can start measuring whether snowflake
and tor together can boostrap, which seems the most important thing
to focus on at the beginning. Understanding why the bootstrap most
often does not converge with a temporary datadir on Android devices
remains instead an open problem for now. (I'll also update the
relevant issues or create new issues after commit this.)

We also address some methodology improvements that were proposed
in ooni/probe#1686. Namely:

- we record the tor version because we include _some_ tor logs;

- we include the bootstrap percentage because of the logs;

- we set the anomaly key correctly.

What remains to be done is the possibility of including Snowflake
events into the measurement, which is not possible until the new
improvements at common/event in snowflake.git are included into a
tagged version of snowflake itself. (I'll make sure to mention
this aspect to @cohosh in ooni/probe#2004.)

It also remains to be done to measure the amount of bytes sent
and received during the bootstrap, which will also probably
be part of a follow-up diff (or even pull request).

I also expect this diff to fail unit and integration tests, at least
because of reduced coverage. This is fine because I plan to adding
missing tests or fixing them as part of a follow-up diff.

If you're reviewing this diff, I'd recommend focusing on (1) whether
we're collecting good enough data for analysis and (2) whether the
data we collect is safe to collect, or we should collect less to err
more onto the safe side.
The tutorial mentioned a default initialized structure, but now
we want to use a constructor, hence mention it.
Copy link
Contributor Author

@bassosimone bassosimone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass of self review. I've highlighted when we should improve with testing and have added comments for the reviewers.

internal/cmd/ptxclient/ptxclient.go Show resolved Hide resolved
internal/engine/experiment/torsf/torsf.go Show resolved Hide resolved
internal/engine/experiment/torsf/torsf.go Show resolved Hide resolved
internal/engine/experiment/torsf/torsf.go Show resolved Hide resolved
internal/engine/experiment/torsf/torsf.go Show resolved Hide resolved
internal/engine/session.go Show resolved Hide resolved
internal/ptx/snowflake.go Show resolved Hide resolved
internal/ptx/snowflake.go Show resolved Hide resolved
internal/tunnel/tunnel.go Outdated Show resolved Hide resolved
@bassosimone
Copy link
Contributor Author

@hellais this pull request is not complete because it's still missing to adjust tests. However, it already contains ~finished code for methodological improvements with torsf. Thus, it would be great if you could (1) take a look at the diff with particular attention on the sections that I've marked as important and (2) test from the command line using miniooni (build with go build -v ./internal/cmd/miniooni, see the initial comment for usage info) and report back on the usefulness of the data structures that I am currently collecting. Thanks a lot! 🙏

The way I initially wrote torsf in this set of patches was broken
because we could only get logs on success.

What's more, I tried to get logs on failure, which crashed torsf.

Luckily we have unit tests and we noticed.

So, rewrite the code such that we always can access logs.

While there, rework torsf unit tests to ensure they cover all the
cases and they're sorted logically into the file.

While there, start addressing @hellais suggestion that we should
get the version of tor from the control port.

After this diff, we still have some broken or missing tests, and I
will work on that later today or next week.
@bassosimone bassosimone changed the title feat(torsf): collect tor logs, select rendezvous method feat(torsf): collect tor logs, select rendezvous method, count bytes Feb 4, 2022
@bassosimone
Copy link
Contributor Author

This PR is approaching readiness state but there's still need to address a couple of comments!

bassosimone added a commit to ooni/spec that referenced this pull request Feb 7, 2022
This diff documents new options we have added to the torsf experiment
after the ooni/probe#2004 discussion.

The related probe-cli PR is: ooni/probe-cli#683
@bassosimone bassosimone marked this pull request as ready for review February 7, 2022 13:05
@hellais
Copy link
Member

hellais commented Feb 7, 2022

In doing some testing of this, I noticed that if I run the torsf test without any arguments in miniooni I get the following error:

miniooni torsf
[      0.000222] <info> Current time: 2022-02-07 14:57:03 CET
[      0.002505] <info> miniooni home directory: $HOME/.miniooni
[      0.002755] <info> Looking up OONI backends; please be patient...
[      0.733684] <info> session: using probe services: {Address:https://ps1.ooni.io Type:https Front:}
[      0.733706] <info> Looking up your location; please be patient...
[      4.567626] <info> - country: IT
[      4.567656] <info> - network: Vodafone Italia S.p.A. (AS30722)
[      4.567666] <info> - resolver's IP: 172.253.12.129
[      4.567670] <info> - resolver's network: Google LLC (AS15169)
[      4.567683] <warn> cannot create experiment builder: map[error:no such experiment: torsf]
[      4.568241] <info> sessionresolver: failure rate: primary: 4/4; fallback: 0/4
[      4.568447] <info> whole session: recv   7 Mbyte, sent   5 kbyte
cannot create experiment builder%

I guess it's fine to require specifying correct arguments as miniooni is an experimental client, but if it's not too hard we might want to still support running the test with default arguments or improve the returned error (the fact I read map[error:no such experiment: torsf], made me wonder if I was on the correct branch).

Ignore this, I was running the miniooni I had in the PATH and not the one I built from the branch. Ignore this noise.

@bassosimone
Copy link
Contributor Author

bassosimone commented Feb 7, 2022

I guess it's fine to require specifying correct arguments as miniooni is an experimental client, but if it's not too hard we might want to still support running the test with default arguments or improve the returned error (the fact I read map[error:no such experiment: torsf], made me wonder if I was on the correct branch).

I find this error very surprising. ./miniooni torsf should work just fine. What is your current branch and what is its tip?

bassosimone added a commit to ooni/spec that referenced this pull request Feb 7, 2022
This diff documents new options we have added to the torsf experiment
after the ooni/probe#2004 discussion.

The related probe-cli PR is: ooni/probe-cli#683
@hellais hellais self-requested a review February 7, 2022 16:03
Copy link
Member

@hellais hellais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bassosimone bassosimone merged commit 85664f1 into master Feb 7, 2022
@bassosimone bassosimone deleted the issue/2004 branch February 7, 2022 16:05
ainghazal pushed a commit to ainghazal/probe-cli that referenced this pull request Mar 8, 2022
…oni#683)

This diff contains significant improvements over the previous
implementation of the torsf experiment.

We add support for configuring different rendezvous methods after
the convo at ooni/probe#2004. In doing
that, I've tried to use a terminology that is consistent with the
names being actually used by tor developers.

In terms of what to do next, this diff basically instruments
torsf to always rendezvous using domain fronting. Yet, it's also
possible to change the rendezvous method from the command line,
when using miniooni, which allows to experiment a bit more. In the
same vein, by default we use a persistent tor datadir, but it's
also possible to use a temporary datadir using the cmdline.

Here's how a generic invocation of `torsf` looks like:

```bash
./miniooni -O DisablePersistentDatadir=true \
           -O RendezvousMethod=amp \
           -O DisableProgress=true \
           torsf
```

(The default is `DisablePersistentDatadir=false` and
`RendezvousMethod=domain_fronting`.)

With this implementation, we can start measuring whether snowflake
and tor together can boostrap, which seems the most important thing
to focus on at the beginning. Understanding why the bootstrap most
often does not converge with a temporary datadir on Android devices
remains instead an open problem for now. (I'll also update the
relevant issues or create new issues after commit this.)

We also address some methodology improvements that were proposed
in ooni/probe#1686. Namely:

1. we record the tor version;

2. we include the bootstrap percentage by reading the logs;

3. we set the anomaly key correctly;

4. we measure the bytes send and received (by `tor` not by `snowflake`, since
doing it for snowflake seems more complex at this stage).

What remains to be done is the possibility of including Snowflake
events into the measurement, which is not possible until the new
improvements at common/event in snowflake.git are included into a
tagged version of snowflake itself. (I'll make sure to mention
this aspect to @cohosh in ooni/probe#2004.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants