feat(engine): add randomtraffic experiment #1026

Status: Open (4 commits, base: master)

@JaxGames5225 commented Jan 13, 2023

Description

This test aims to detect the censorship of fully random traffic. In short, the experiment sends random bytes to an IP address chosen at random from a predetermined list of public IP addresses that were affected by this censorship in the past, and records information about the nature of the censorship. This censorship was originally observed at the Great Firewall of China (GFW).

Censorship Description

Our team reverse-engineered the GFW's new censorship system and determined that it uses the following rules to exempt traffic from blocking:

For the first TCP payload sent by the client, allow the traffic to continue if any of the following hold:

  • It matches the protocol fingerprint for TLS or HTTP.
  • The first six bytes of the payload are all in the range [0x20, 0x7e] (printable ASCII).
  • More than 50% of the payload's bytes are in the range [0x20, 0x7e].
  • More than 20 contiguous bytes of the payload are in the range [0x20, 0x7e].
  • popcount(payload)/len(payload), the average number of set bits per byte, is less than 3.4 or greater than 4.6.
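
The byte-level rules above can be sketched as a small predicate. This is only an illustration, not the censor's code or the experiment's implementation: the function names are hypothetical, the TLS/HTTP fingerprint check is left as a caller-supplied flag because the PR does not describe it, and treating payloads shorter than six bytes as failing the "first six bytes" rule is an assumption.

```python
def is_printable(b: int) -> bool:
    """True if the byte lies in the printable ASCII range [0x20, 0x7e]."""
    return 0x20 <= b <= 0x7E


def longest_printable_run(payload: bytes) -> int:
    """Length of the longest run of contiguous printable bytes."""
    best = run = 0
    for b in payload:
        run = run + 1 if is_printable(b) else 0
        best = max(best, run)
    return best


def bits_per_byte(payload: bytes) -> float:
    """popcount(payload) / len(payload): average number of set bits per byte."""
    return sum(bin(b).count("1") for b in payload) / len(payload)


def is_exempt(payload: bytes, matches_known_protocol: bool = False) -> bool:
    """Return True if any exemption rule holds for the first TCP payload.

    matches_known_protocol is a hypothetical stand-in for the TLS/HTTP
    fingerprint rule, which is not specified here.
    """
    if not payload:
        raise ValueError("payload must be non-empty")
    if matches_known_protocol:
        return True
    # Assumption: a payload shorter than six bytes cannot satisfy this rule.
    if len(payload) >= 6 and all(is_printable(b) for b in payload[:6]):
        return True
    printable = sum(1 for b in payload if is_printable(b))
    if printable > 0.5 * len(payload):
        return True
    if longest_printable_run(payload) > 20:
        return True
    density = bits_per_byte(payload)
    return density < 3.4 or density > 4.6
```

Under this sketch a stream of null bytes has zero set bits per byte and is therefore exempt, whereas a payload such as `bytes([0x0F, 0xF0]) * 50` (no printable bytes, exactly four set bits per byte) matches none of the byte-level rules.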

In addition to these rules, the censorship only occurs on connections to IP addresses in a certain list.

If the IP address is in the censored range and none of the above rules hold, there is an approximately 26.3% chance that the connection is censored. For a more detailed description of the censorship, please see the reading copy of our paper.

Test Goals and Procedure

The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:

  1. If no IP address is given by the user, select an IP address from the list of IP addresses in the affected range.
  2. Complete a TCP handshake with the IP address and send a stream of null bytes as a control test. If this control test succeeds, proceed with the experiment; otherwise, retry the control test with a new IP address up to two more times, stopping as soon as one succeeds. If no control test succeeds, end the test and return the error.
  3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, attempt to connect once more to check for residual censorship. If the residual censorship check also times out, end the test, record information about the blocked packet, and inform the user that they are experiencing censorship. Otherwise, continue with the test.
  4. Repeat step 3 another 19 times (20 random-payload connections in total) to account for the probabilistic blocking rate.
  5. If the test completes without errors, close all connections and inform the user that they are not experiencing censorship.
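
The control flow of the steps above can be sketched as follows. This is a simplified model rather than the code in this PR: `run_experiment`, the injected `probe(ip, payload)` callable (assumed to return `True` when the connection succeeds and `False` on timeout), the 512-byte payload length, and the returned dictionary are all hypothetical, and resending the same payload for the residual censorship check is an assumption.

```python
import os
import random


def run_experiment(ip_list, probe, payload_len=512, trials=20,
                   control_attempts=3, rng=random):
    """Model of the test procedure with the network I/O injected via probe."""
    # Steps 1-2: run the null-byte control test against up to three distinct IPs.
    ip = None
    candidates = rng.sample(ip_list, min(control_attempts, len(ip_list)))
    for candidate in candidates:
        if probe(candidate, bytes(payload_len)):  # bytes(n) is n null bytes
            ip = candidate
            break
    if ip is None:
        return {"status": "control_failed", "ip": None}

    # Steps 3-4: send random payloads, 20 connections in total by default.
    for trial in range(trials):
        payload = os.urandom(payload_len)
        if probe(ip, payload):
            continue
        # Timeout: connect once more to check for residual censorship.
        # Assumption: the check resends the same payload.
        if not probe(ip, payload):
            return {"status": "censored", "ip": ip,
                    "payload": payload, "trial": trial}

    # Step 5: every trial completed without a confirmed block.
    return {"status": "not_censored", "ip": ip}
```

Injecting `probe` keeps the sketch testable without network access; a real probe would open a TCP connection, write the payload, and report whether the connection timed out.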

False Negative and False Positive Rates

Using an IP known to be in the censored range, the false negative rate (the rate at which the test will say there is no censorship present when in fact there is) of this test was calculated to be approximately 1.05%. On the other hand, after running the test 10,000 times from a location not experiencing censorship, no false positives were recorded.

IP List Construction

The IP list was created by first obtaining a large list of public TCP servers. The test was then performed five times on each IP from a computer where censorship is expected. The final list consists only of the IP addresses that reported censorship all five times. For one of these IP addresses not to be in the censored range, each of its five reports of censorship would have to be a false positive, which we know to be extremely unlikely, so we can label these IP addresses as being in the censored range.
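
The selection rule described above amounts to a simple filter. The sketch below assumes a hypothetical `reports` mapping from each candidate IP to the list of per-run outcomes (`True` meaning the run reported censorship); the function name and data layout are illustrative and not taken from this PR.

```python
from typing import Dict, List


def build_ip_list(reports: Dict[str, List[bool]],
                  runs_required: int = 5) -> List[str]:
    """Keep only IPs that were tested runs_required times and censored every time."""
    return sorted(
        ip
        for ip, outcomes in reports.items()
        if len(outcomes) == runs_required and all(outcomes)
    )
```

For example, an IP with outcomes `[True, True, False, True, True]` is dropped, since one of its five runs did not report censorship.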

@bassosimone changed the title from "Feature/randomtraffic" to "feat(engine): add randomtraffic experiment" on Apr 4, 2023