Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(engine): allow runner to return many measurements #527

Merged
merged 1 commit into from
Sep 29, 2021

Conversation

bassosimone
Copy link
Contributor

@bassosimone bassosimone commented Sep 29, 2021

Checklist

Location of the issue tracker: https://github.com/ooni/probe

Description

This is required to implement websteps, which is currently tracked
by ooni/probe#1733.

We introduce the concept of async runner. An async runner will
post measurements on a channel until it is done. When it is done,
it will close the channel to notify the reader about that.

This change causes sync experiments now to strictly return either
a non-nil measurement or a non-nil error.

While this is a pretty much obvious situation in golang, we had
some parts of the codebase that were not robust to this assumption
and attempted to submit a measurement after the measure call
returned an error.

Luckily, we had enough tests to catch this change in our assumption
and this is why there are extra docs and tests changes.

This is required to implement websteps, which is currently tracked
by ooni/probe#1733.

We introduce the concept of async runner. An async runner will
post measurements on a channel until it is done. When it is done,
it will close the channel to notify the reader about that.

This change causes sync experiments now to strictly return either
a non-nil measurement or a non-nil error.

While this is a pretty much obvious situation in golang, we had
some parts of the codebase that were not robust to this assumption
and attempted to submit a measurement after the measure call
returned an error.

Luckily, we had enough tests to catch this change in our assumption
and this is why there are extra docs and tests changes.
if measurement == nil {
t.Fatal("expected non nil measurement here")
if measurement != nil {
t.Fatal("expected nil measurement here")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first place where our assumptions were too lax wrt the measurement XOR error return value.

// implemented async measurements, the case where there is an error
// and we also have a valid measurement cant't happen anymore. So,
// now the only valid strategy here is to continue.
continue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second place where our assumptions were too lax wrt the measurement XOR error return value.

@bassosimone
Copy link
Contributor Author

Here's the failure that occurred above:

goroutine 70 [select]:
github.com/ooni/probe-cli/v3/internal/ptx.(*SnowflakeDialer).dialContext(0xfa99c0, {0xca6a10, 0xc0000bc008})
	/home/runner/work/probe-cli/probe-cli/internal/ptx/snowflake.go:73 +0x3fd
github.com/ooni/probe-cli/v3/internal/ptx.(*SnowflakeDialer).DialContext(0x39, {0xca6a10, 0xc0000bc008})
	/home/runner/work/probe-cli/probe-cli/internal/ptx/snowflake.go:45 +0x59
github.com/ooni/probe-cli/v3/internal/ptx.TestSnowflakeDialerWorksWithMocks(0xc0003b56c0)
	/home/runner/work/probe-cli/probe-cli/internal/ptx/snowflake_test.go:62 +0xa7
testing.tRunner(0xc0003b56c0, 0xbfde00)
	/opt/hostedtoolcache/go/1.17.1/x64/src/testing/testing.go:1259 +0x230
created by testing.(*T).Run
	/opt/hostedtoolcache/go/1.17.1/x64/src/testing/testing.go:1306 +0x727
FAIL	github.com/ooni/probe-cli/v3/internal/ptx	600.114s

And here's the link to the build: https://github.com/ooni/probe-cli/pull/527/checks?check_run_id=3750039913

I am going to merge anyway, since this failure really seems unrelated (it's testing snowflake in there).

I am also going to create an issue for this.

@bassosimone bassosimone merged commit ff1c170 into master Sep 29, 2021
@bassosimone bassosimone deleted the issue/1733 branch September 29, 2021 22:54
bassosimone added a commit that referenced this pull request Jan 7, 2022
Since #527, if an experiment
returns an error, the corresponding measurement is not submitted since
the semantics of returning an error is that something fundamental
went wrong (e.g., we could not parse the input URL).

This diff ensures that all experiments only return and error when
something fundamental was wrong and return nil otherwise.

Reference issue: ooni/probe#1808.
bassosimone added a commit that referenced this pull request Jan 7, 2022
Since #527, if an experiment
returns an error, the corresponding measurement is not submitted since
the semantics of returning an error is that something fundamental
went wrong (e.g., we could not parse the input URL).

This diff ensures that all experiments only return and error when
something fundamental was wrong and return nil otherwise.

Reference issue: ooni/probe#1808.
ainghazal pushed a commit to ainghazal/probe-cli that referenced this pull request Mar 8, 2022
This is required to implement websteps, which is currently tracked
by ooni/probe#1733.

We introduce the concept of async runner. An async runner will
post measurements on a channel until it is done. When it is done,
it will close the channel to notify the reader about that.

This change causes sync experiments now to strictly return either
a non-nil measurement or a non-nil error.

While this is a pretty much obvious situation in golang, we had
some parts of the codebase that were not robust to this assumption
and attempted to submit a measurement after the measure call
returned an error.

Luckily, we had enough tests to catch this change in our assumption
and this is why there are extra docs and tests changes.
ainghazal pushed a commit to ainghazal/probe-cli that referenced this pull request Mar 8, 2022
Since ooni#527, if an experiment
returns an error, the corresponding measurement is not submitted since
the semantics of returning an error is that something fundamental
went wrong (e.g., we could not parse the input URL).

This diff ensures that all experiments only return and error when
something fundamental was wrong and return nil otherwise.

Reference issue: ooni/probe#1808.
bassosimone added a commit that referenced this pull request Jun 29, 2022
This bug is one of these bugs that definitely help one to stay
humble and focused on improving the codebase.

Of course I <facepalmed> when I understood the root cause.

We did not move the annotations below the `if` which is checking
whether the measurement was successful when we refactored the
codebase to support returning multiple measurements per run, which
happened in #527.

While I am not going to whip myself too much because of this, it's
clearly a bummer that we didn't notice this bug back then. On top
of this, it's also quite sad it took us so much time to notice that
there was this bug inside the tree.

The lesson (hopefully) learned is probably that we need to be more
careful when we refactor and we should always ask the question of
whether, not only we have tests, but whether these tests could maybe
be improved to give us even more confidence about correctness.

The reference issue is ooni/probe#2173.
bassosimone added a commit that referenced this pull request Jul 1, 2022
This bug is one of these bugs that definitely help one to stay
humble and focused on improving the codebase.

Of course I `<facepalmed>` when I understood the root cause.

We did not move the annotations below the `if` which is checking
whether the measurement was successful when we refactored the
codebase to support returning multiple measurements per run, which
happened in #527.

While I am not going to whip myself too much because of this, it's
clearly a bummer that we didn't notice this bug back then. On top
of this, it's also quite sad it took us so much time to notice that
there was this bug inside the tree.

The lesson (hopefully) learned is probably that we need to be more
careful when we refactor and we should always ask the question of
whether, not only we have tests, but whether these tests could maybe
be improved to give us even more confidence about correctness.

The reference issue is ooni/probe#2173.
bassosimone added a commit that referenced this pull request Jul 1, 2022
This commit backports 9b08dca
from the master branch. Originaly commit message follows:

- - -

This bug is one of these bugs that definitely help one to stay
humble and focused on improving the codebase.

Of course I `<facepalmed>` when I understood the root cause.

We did not move the annotations below the `if` which is checking
whether the measurement was successful when we refactored the
codebase to support returning multiple measurements per run, which
happened in #527.

While I am not going to whip myself too much because of this, it's
clearly a bummer that we didn't notice this bug back then. On top
of this, it's also quite sad it took us so much time to notice that
there was this bug inside the tree.

The lesson (hopefully) learned is probably that we need to be more
careful when we refactor and we should always ask the question of
whether, not only we have tests, but whether these tests could maybe
be improved to give us even more confidence about correctness.

The reference issue is ooni/probe#2173.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant