Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Netx research and refactoring in Sprint 11 #509

Closed
7 of 10 tasks
bassosimone opened this issue Apr 14, 2020 · 1 comment
Closed
7 of 10 tasks

Netx research and refactoring in Sprint 11 #509

bassosimone opened this issue Apr 14, 2020 · 1 comment
Assignees
Labels
effort/L Large effort priority/high High priority

Comments

@bassosimone
Copy link
Contributor

bassosimone commented Apr 14, 2020

The generic idea is that what ends up in the netx folder is the more complex implementation and the simpler implementation will be instead inside of internal. The main objective of this effort is to understand whether we can use the decorator pattern based technique also for collecting measurements. I think so, but it's better to be sure about this by changing Psiphon and evaluating whether this is really an improvement.

  • add to dialer, resolver, httptransport support for collecting raw data
  • allow to export raw data using OONI format
  • change psiphon to use this new style of measuring
  • move dialer, resolver, and httptransport to internal
  • move oonitemplate, oonidatamodel to netx
  • evaluate whether we lose anything with this much simpler implementation
  • merge useful code from prototype PR Implement measurements using decorator pattern #529
  • close PoC: netx rewrite with better design principles #421 if superseded
  • create new issues describing how to proceed
  • can we avoid using the context for counting bytes?

If all is good, we will have significantly simplified netx implementation and testability in a few sprints. If there are emerging issues, then it means we need to somehow change the plan again.

@bassosimone bassosimone added effort/L Large effort priority/high High priority labels Apr 14, 2020
@bassosimone bassosimone self-assigned this Apr 14, 2020
@bassosimone bassosimone added this to the Sprint 11 - Blue Bottles milestone Apr 14, 2020
bassosimone added a commit that referenced this issue Apr 16, 2020
This seems to be a better pattern for saving events that involves
doing much less magic with the context.

Part of #509
bassosimone added a commit that referenced this issue Apr 16, 2020
This seems to be a better pattern for saving events that involves
doing much less magic with the context.

Part of #509
bassosimone added a commit that referenced this issue Apr 18, 2020
This also includes a new experiment called urlgetter that can
be used to perform URL measurements like:

```
./miniooni -i http://www.kernel.org urlgetter

./miniooni -OTLSServerName=x.org -i http://www.kernel.org urlgetter

./miniooni -OResolverURL=https://cloudflare-dns.com/dns-query \
           -i http://www.kernel.org urlgetter
```

Part of #509
bassosimone added a commit that referenced this issue Apr 22, 2020
This is non standard. We don't need to add it to the spec. It is just
useful when you are vetting measurements manually.

Cherry-picked from work related to #509
bassosimone added a commit that referenced this issue Apr 22, 2020
This is non standard. We don't need to add it to the spec. It is just
useful when you are vetting measurements manually.

Cherry-picked from work related to #509
bassosimone added a commit that referenced this issue Apr 22, 2020
There is no cache aging. If a resolution is successful, then we
do cache the result. This is meant as a mechanism to make measurements
faster and with less data points. For this reason it's disabled by
default. Consider for example the case where `http://www.example.com`
redirects you to `https://www.example.com`. You really don't want
to perform the name resolution again in this case.

As a side note, depending on the resolver we're using, we may not
need the caching mechanism. Nonetheless, using the cache always when
measuring also reduces the number of duplicate entries we produce
where only the first entry has actually been resolved.

This was developed as part of #509
bassosimone added a commit that referenced this issue Apr 22, 2020
There is no cache aging. If a resolution is successful, then we
do cache the result. This is meant as a mechanism to make measurements
faster and with less data points. For this reason it's disabled by
default. Consider for example the case where `http://www.example.com`
redirects you to `https://www.example.com`. You really don't want
to perform the name resolution again in this case.

As a side note, depending on the resolver we're using, we may not
need the caching mechanism. Nonetheless, using the cache always when
measuring also reduces the number of duplicate entries we produce
where only the first entry has actually been resolved.

This was developed as part of #509
bassosimone added a commit that referenced this issue Apr 22, 2020
These are useful bits of information when filling a measurement
because they allow us to understand what transport is used.

Part of #509
bassosimone added a commit that referenced this issue Apr 22, 2020
These are useful bits of information when filling a measurement
because they allow us to understand what transport is used.

Part of #509
bassosimone added a commit that referenced this issue Apr 22, 2020
Bugfix developed as part of #509

While there, implement a unit test
bassosimone added a commit that referenced this issue Apr 22, 2020
Bugfix developed as part of #509

While there, implement a unit test
@bassosimone
Copy link
Contributor Author

bassosimone commented Apr 22, 2020

Done a bunch of work here and created followup as #543.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort/L Large effort priority/high High priority
Projects
None yet
Development

No branches or pull requests

1 participant