torsf: improvements over initial proof of concept #1686

bassosimone · 2021-06-18T11:39:37Z

This is an umbrella/epic issue detailing the next steps for the torsf experiment:

- record the tor version
- include the bootstrap percentage
- include snowflake logs (see torsf: use snowflake v2.1.0 event channel API #2017)
- measure bytes sent/received during the bootstrap
- set the anomaly key correctly
- exercise the codebase in more scenarios

We introduced the torsf experiment in ooni/probe-cli#387 and ooni/spec#218.


See ooni/probe#1565 Future work at ooni/probe#1686

The current implementation assumes the user has already installed tor on the current system. If tor is not present, the experiment fails. This is meant to be the first version of this experiment. We are going to add more functionality in subsequent revisions of this experiment, once we've collected more feedback. Reference issue: ooni/probe#1565. Here's the spec PR: ooni/spec#218. Here's the issue tracking future work: ooni/probe#1686

@cohosh

This diff contains significant improvements over the previous implementation of the torsf experiment.

We add support for configuring different rendezvous methods after the conversation at ooni/probe#2004. In doing that, I've tried to use terminology that is consistent with the names actually used by tor developers.

In terms of what to do next, this diff basically instruments torsf to always rendezvous using domain fronting. Yet, it's also possible to change the rendezvous method from the command line when using miniooni, which allows us to experiment a bit more. In the same vein, by default we use a persistent tor datadir, but it's also possible to use a temporary datadir from the command line.

Here's what a generic invocation of `torsf` looks like:

```bash
./miniooni -ODisablePersistentDatadir=true \
  -ORendezvousMethod=amp \
  -ODisableProgress=true torsf
```

(The default is `DisablePersistentDatadir=false` and `RendezvousMethod=domain_fronting`.)

With this implementation, we can start measuring whether snowflake and tor together can bootstrap, which seems the most important thing to focus on at the beginning.

Understanding why the bootstrap most often does not converge with a temporary datadir on Android devices remains an open problem for now. (I'll also update the relevant issues or create new issues after committing this.)

We also address some methodology improvements that were proposed in ooni/probe#1686. Namely:

- we record the tor version because we include _some_ tor logs;
- we include the bootstrap percentage because of the logs;
- we set the anomaly key correctly.

What remains to be done is including Snowflake events in the measurement, which is not possible until the new improvements at common/event in snowflake.git are included in a tagged version of snowflake itself. (I'll make sure to mention this aspect to @cohosh in ooni/probe#2004.)

We also still need to measure the amount of bytes sent and received during the bootstrap, which will probably be part of a follow-up diff (or even pull request).

I also expect this diff to fail unit and integration tests, at least because of reduced coverage. This is fine because I plan to add the missing tests or fix the failing ones as part of a follow-up diff.

If you're reviewing this diff, I'd recommend focusing on (1) whether we're collecting good enough data for analysis and (2) whether the data we collect is safe to collect, or whether we should collect less to err on the safe side.

It seems, in the grand scheme of things, this is the log we need. So we just introduced a regexp to extract it in ooni/probe-cli@bacab49. Part of ooni/probe#2004 and ooni/probe#1686
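To make the log-based approach concrete, here is a minimal sketch of pulling the tor version and the bootstrap percentage out of tor's notice-level logs. It is not the regexp from the commit referenced above: the names `bootstrapRe`, `versionRe`, and `summarize` are hypothetical, and the sample log lines in `main` are illustrative rather than captured from a real run.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// bootstrapRe matches tor's "Bootstrapped NN%" progress notices.
// (Hypothetical pattern, written for this sketch.)
var bootstrapRe = regexp.MustCompile(`Bootstrapped (\d+)%`)

// versionRe matches the startup notice in which tor announces its version.
// (Hypothetical pattern, written for this sketch.)
var versionRe = regexp.MustCompile(`Tor (\d+\.\d+\.\d+\.\d+\S*) running on`)

// summarize scans log lines and returns the highest bootstrap percentage
// seen and the last tor version announced (empty if none was found).
func summarize(lines []string) (percent int, version string) {
	for _, line := range lines {
		if m := bootstrapRe.FindStringSubmatch(line); m != nil {
			if v, err := strconv.Atoi(m[1]); err == nil && v > percent {
				percent = v
			}
		}
		if m := versionRe.FindStringSubmatch(line); m != nil {
			version = m[1]
		}
	}
	return percent, version
}

func main() {
	// Illustrative log lines, not real measurement output.
	logs := []string{
		"Feb 04 10:00:00.000 [notice] Tor 0.4.6.9 running on Linux with Libevent 2.1.12-stable.",
		"Feb 04 10:00:01.000 [notice] Bootstrapped 0% (starting): Starting",
		"Feb 04 10:00:09.000 [notice] Bootstrapped 100% (done): Done",
	}
	percent, version := summarize(logs)
	fmt.Printf("version=%s bootstrap=%d%%\n", version, percent)
}
```

Keeping the maximum percentage rather than the last one seen means a truncated or reordered log still reports how far the bootstrap got.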

@cohosh

…683)

This diff contains significant improvements over the previous implementation of the torsf experiment.

We add support for configuring different rendezvous methods after the conversation at ooni/probe#2004. In doing that, I've tried to use terminology that is consistent with the names actually used by tor developers.

In terms of what to do next, this diff basically instruments torsf to always rendezvous using domain fronting. Yet, it's also possible to change the rendezvous method from the command line when using miniooni, which allows us to experiment a bit more. In the same vein, by default we use a persistent tor datadir, but it's also possible to use a temporary datadir from the command line.

Here's what a generic invocation of `torsf` looks like:

```bash
./miniooni -O DisablePersistentDatadir=true \
  -O RendezvousMethod=amp \
  -O DisableProgress=true \
  torsf
```

(The default is `DisablePersistentDatadir=false` and `RendezvousMethod=domain_fronting`.)

With this implementation, we can start measuring whether snowflake and tor together can bootstrap, which seems the most important thing to focus on at the beginning.

Understanding why the bootstrap most often does not converge with a temporary datadir on Android devices remains an open problem for now. (I'll also update the relevant issues or create new issues after committing this.)

We also address some methodology improvements that were proposed in ooni/probe#1686. Namely:

1. we record the tor version;
2. we include the bootstrap percentage by reading the logs;
3. we set the anomaly key correctly;
4. we measure the bytes sent and received (by `tor`, not by `snowflake`, since doing it for snowflake seems more complex at this stage).

What remains to be done is including Snowflake events in the measurement, which is not possible until the new improvements at common/event in snowflake.git are included in a tagged version of snowflake itself. (I'll make sure to mention this aspect to @cohosh in ooni/probe#2004.)
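The commit message does not say how the byte counting is implemented. As a generic illustration only, one common way to count bytes sent and received on any Go connection is to wrap a `net.Conn`. The `countingConn` type and the `demo` function below are hypothetical names for this sketch, and `net.Pipe` stands in for a real network connection.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync/atomic"
)

// countingConn wraps a net.Conn and tallies the bytes read and written.
// (Hypothetical type, not taken from the torsf implementation.)
type countingConn struct {
	sent     int64 // updated atomically
	received int64 // updated atomically
	net.Conn
}

// Read forwards to the wrapped connection and records the bytes received.
func (c *countingConn) Read(b []byte) (int, error) {
	n, err := c.Conn.Read(b)
	atomic.AddInt64(&c.received, int64(n))
	return n, err
}

// Write forwards to the wrapped connection and records the bytes sent.
func (c *countingConn) Write(b []byte) (int, error) {
	n, err := c.Conn.Write(b)
	atomic.AddInt64(&c.sent, int64(n))
	return n, err
}

// demo exchanges a few bytes over an in-memory pipe and returns the counters.
func demo() (sent, received int64) {
	client, server := net.Pipe()
	cc := &countingConn{Conn: client}
	go func() {
		io.ReadFull(server, make([]byte, 5)) // consume the 5-byte request
		server.Write([]byte("hi"))           // reply with 2 bytes
		server.Close()
	}()
	cc.Write([]byte("hello"))
	io.ReadAll(cc) // read the reply until the peer closes
	cc.Close()
	return atomic.LoadInt64(&cc.sent), atomic.LoadInt64(&cc.received)
}

func main() {
	sent, received := demo()
	fmt.Printf("sent=%d received=%d\n", sent, received)
}
```

A wrapper like this counts application-level bytes only; it does not see TLS or transport overhead added below the wrapped connection.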


cohosh · 2023-10-09T19:22:30Z

After our meeting, I looked into using Snowflake without Tor and it's much easier than I originally thought. Ever since we implemented a Go API that was motivated by the v2.1 PT specification, it is easy to do so just by calling Snowflake as a library. Consider the following Go code:

```go
package main

import (
    "log"

    sf "gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/v2/client/lib"
)

func main() {
    // Client settings: the broker is reached via domain fronting, the STUN
    // servers are used for ICE, and the fingerprint identifies the bridge.
    config := sf.ClientConfig{
        BrokerURL:   "https://snowflake-broker.torproject.net.global.prod.fastly.net/",
        FrontDomain: "foursquare.com",
        ICEAddresses: []string{
            "stun:stun.l.google.com:19302",
            "stun:stun.antisip.com:3478",
            "stun:stun.bluesip.net:3478",
            "stun:stun.dus.net:3478",
            "stun:stun.epygi.com:3478",
            "stun:stun.sonetel.com:3478",
            "stun:stun.uls.co.za:3478",
            "stun:stun.voipgate.com:3478",
            "stun:stun.voys.nl:3478",
        },
        UTLSClientID:      "hellorandomizedalpn",
        BridgeFingerprint: "8838024498816A039FCBBAB14E6F40A0843051FA",
        Max:               1,
    }
    transport, err := sf.NewSnowflakeClient(config)
    if err != nil {
        log.Fatal("Failed to start snowflake transport: ", err)
    }

    // Dial finds a Snowflake proxy and connects through it to the bridge.
    conn, err := transport.Dial()
    if err != nil {
        log.Printf("dial error: %s", err)
        return
    }
    // Register the cleanup as soon as we hold a valid connection.
    defer conn.Close()

    // Send junk so that the bridge reacts by closing the connection.
    if _, err = conn.Write([]byte("Nonsense")); err != nil {
        log.Printf("write error: %s", err)
    }

    // Block until the bridge sends a byte or closes the connection.
    b := make([]byte, 1)
    _, err = conn.Read(b)
    log.Printf("Connection to bridge closed with %v", err)
}
```

The trick is to trigger a response from the bridge. If we don't successfully read bytes from the connection returned by `transport.Dial`, then there's no guarantee we've actually reached the bridge. For this example, I just wrote enough "nonsense" bytes to trigger the bridge to close the connection. There might be a gentler way of doing this; it's probably worth asking someone from the network team.

This is going to drastically decrease the test times for Snowflake and the incidence of errors caused by Tor crashing. I also think it will give us all the information we need from OONI tests, which is whether the bridge is reachable and the amount of time it takes to get a working Snowflake proxy to form that connection.

Doing things this way should also make it really easy to integrate the event channel API as well.

cohosh · 2023-10-09T19:37:19Z

> If we don't successfully read bytes from the connection returned by transport.Dial, then there's no guarantee we've actually reached the bridge.

I should amend this: there's no guarantee we've reached the bridge unless we either read bytes or receive an EOF, and receiving an EOF is what happens in this example.

bassosimone added the triage label Jun 18, 2021

bassosimone added this to the Sprint 42 - Kujira milestone Jun 18, 2021

bassosimone self-assigned this Jun 18, 2021

bassosimone added the epic label Jun 18, 2021

bassosimone mentioned this issue Jun 18, 2021

doc: document ts-030-torsf ooni/spec#218

Merged

bassosimone added a commit to ooni/spec that referenced this issue Jun 18, 2021

doc: document ts-030-torsf (#218)

08f87ff

See ooni/probe#1565 Future work at ooni/probe#1686

bassosimone mentioned this issue Jun 18, 2021

feat(torsf): experiment that bootstraps tor using snowflake ooni/probe-cli#387

Merged

bassosimone mentioned this issue Jun 18, 2021

engine: we can be a snowflake pluggable transport for tor #1565

Closed

bassosimone modified the milestones: Sprint 42 - Kujira, Sprint 43 - Confused Kraken, Sprint 45 - Antarctic Krill Jul 5, 2021

bassosimone added priority/medium and removed triage labels Jul 5, 2021

bassosimone mentioned this issue Jul 30, 2021

torsf: ship on mobile and desktop #1717

Closed

bassosimone modified the milestones: Sprint 45 - Antarctic Krill, Sprint 46 - Happy Oyster Aug 16, 2021

hellais modified the milestones: Sprint 46 - Happy Oyster, Sprint 48 - Amazon river dolphin Sep 10, 2021

hellais modified the milestones: Sprint 48 - Amazon river dolphin, Sprint 49 - Humpback whale Sep 30, 2021

bassosimone modified the milestones: Sprint 49 - Humpback whale, Sprint 50 - Amphipoda Oct 8, 2021

bassosimone added the ooni/probe-engine label Oct 11, 2021

bassosimone mentioned this issue Oct 11, 2021

tor: next steps and methodology improvements #1730

Open

7 tasks

bassosimone mentioned this issue Feb 4, 2022

feat(torsf): collect tor logs, select rendezvous method, count bytes ooni/probe-cli#683

Merged

3 tasks

hellais added this to Sprint Planning Jan 7, 2025
