Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdk: Remove TestWriter #7559

Closed
wants to merge 1 commit into from
Closed

sdk: Remove TestWriter #7559

wants to merge 1 commit into from

Conversation

dnephin
Copy link
Contributor

@dnephin dnephin commented Mar 31, 2020

Follow up to #7546

I went back and looked at #5344 to better understand the problem that TestWriter was solving.

I believe gotestsum, which we use in CI, solves this same problem without having to wrap the logger in the test suite. The short, short-verbose and short-with-failures formats handle only printing logs for the tests that have failed.

The TestWriter solution seems like it comes with a couple of disadvantages.

The first can be seen from the implementation. Loggers can outlive the lifetime of the testing.T instance, which causes panics.

The second is that it produces log messages which contain the test name twice. This results in log messages that are very long and hard to read. (#7546).

sdk/README/md suggests that we don't make any promises about backwards compatibility for the sdk, so I removed the unused functions.

The go toolchain already handles grouping test output by test. The
TestWriter, generally used to send logs through t.Log causes a couple
problems.

The first can be seen from the implementation. Loggers can outlive the
lifetime of the testing.T instance, which causes panics.

The second is that it produces log messages which contain the test name
twice. This results in log messages that are very long and hard to read.

Sending logs to stdout should make it easier to read the test failure
messages.
@dnephin dnephin added the theme/internal-cleanup Used to identify tech debt, testing improvements, code refactoring, and non-impactful optimization label Mar 31, 2020
@rboyer
Copy link
Member

rboyer commented Mar 31, 2020

Has go test (not gotestsum) somehow changed in the intervening versions since I added this so that Stdout/Stderr is somehow not all mixed together when you have dozens of parallel test cases running locally (during something like make test)?

Previously the problem I had was that if anything failed during make test you'd get a sea of log messages from other totally a-ok test cases interleaved into the logs for the test case that failed which made the failing test case completely useless.

If this is only true when using gotestsum I think we should keep the TestWriter shim in here so that regular go test usage locally doesn't have a regressed experience.

@dnephin
Copy link
Contributor Author

dnephin commented Mar 31, 2020

go test works the same as it did. Why not use gotestsum locally? It was designed for this use case as well.

Sending all the logs to t.Log makes it very difficult to find the failures. The failure message is prefixed the same way as the log output, which hides the failure message in dozens of lines of log output. t.Log adds the file and line number, which for logs is not useful because its always the line from testWriter.

We could have the makefile install and run gotestsum is that helps.

@mkeeler
Copy link
Member

mkeeler commented Mar 31, 2020

Personally, I normally just use go test <arguments>. This would definitely break test output correlation for me.

@dnephin
Copy link
Contributor Author

dnephin commented Mar 31, 2020

I think there is another problem with t.Log that I haven't been able to fully track down. An example is in this output: https://app.circleci.com/pipelines/github/hashicorp/consul/7995/workflows/9c8b6a9a-106c-4d45-8692-8bca89932ef4/jobs/144872

The failing test is TestHealthServiceChecks_NodeMetaFilter, but we can see logs for TestSnapshot_Options/GET mixed in.

It's possible this is a bug in test2json

@dnephin
Copy link
Contributor Author

dnephin commented Apr 15, 2020

I went looking more into this problem, and it seems that in go1.14 they made some changes to support log streaming which made the problem of mixed up test output worse. Output from goroutines is not going to be attributed to the correct test when sent to test2json.

I'm going to experiment with a solution that buffers the logs for each test and prints them at the end if the test fails. The go tool already buffers this output, so I imagine it should be fine to keep all of the logs in memory for the lifetime of a test.

@dnephin dnephin marked this pull request as draft April 16, 2020 23:53
dnephin added a commit that referenced this pull request May 6, 2020
Replaces  #7559

See golang/go#38458 and golang/go#38382 (comment)

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.
dnephin added a commit that referenced this pull request May 21, 2020
Replaces  #7559

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Related:
golang/go#38458 (may be fixed in go1.15)
golang/go#38382 (comment)
@dnephin
Copy link
Contributor Author

dnephin commented May 21, 2020

Replaced by #7948

@dnephin dnephin closed this May 21, 2020
@dnephin dnephin deleted the dnephin/remove-test-writer branch May 21, 2020 23:45
dnephin added a commit that referenced this pull request Jun 9, 2020
Replaces  #7559

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Related:
golang/go#38458 (may be fixed in go1.15)
golang/go#38382 (comment)
dnephin added a commit that referenced this pull request Jul 21, 2020
Replaces #7559

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

Attaching log output from background goroutines also cause data races.  If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race.

The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Related:
golang/go#38458 (may be fixed in go1.15)
golang/go#38382 (comment)
@paulburlumi
Copy link
Contributor

When upgrading the sdk from v0.5.0 to v0.6.0, commit 51efba2 caused our ginkgo unit tests to fail.

We have the following code set up to run before the test suite:

var _ = BeforeSuite(func() {
	ts, err := testutil.NewTestServerConfigT(nil, nil)
	Expect(err).ToNot(HaveOccurred())
	consulTestServer = ts
	Expect(consulTestServer).ToNot(BeNil())
})

This code works with sdk v0.5.0 and causes the following stack trace with sdk v0.6.0:

  Test Panicked
  runtime error: invalid memory address or nil pointer dereference
  /usr/local/go/src/runtime/panic.go:212

  Full Stack Trace
  github.com/hashicorp/consul/sdk/testutil.NewLogBuffer(0x0, 0x0, 0x6, 0x8)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/testlog.go:38 +0xba
  github.com/hashicorp/consul/sdk/testutil.defaultServerConfig(0x0, 0x0, 0x0)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/server.go:143 +0xc5
  github.com/hashicorp/consul/sdk/testutil.NewTestServerConfigT(0x0, 0x0, 0x0, 0x1, 0x1, 0xc000380060)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/server.go:239 +0x165

It looks like testutil.NewTestServerConfigT used to allow the first parameter to be nil (in fact it still checks this).

However t is now always accessed without checking for nil in NewLogBuffer.

This looks like a regression on the public API for the testutil package.

@dnephin
Copy link
Contributor Author

dnephin commented Aug 12, 2020

The README for the sdk says:

Please note that this folder, while public, is not meant for new consumers of these libraries; this should currently be considered an internal, not external, SDK. It is public due to existing needs from other HashiCorp software. The tags in this folder will stay at the 0.x.y level; accordingly users should expect that things can move around, disappear, or change API at any time.

I think it is expected that a test helper requires a testing.T for logging, t.Cleanup, and checking t.Failed.

You should be able to continue to use NewTestServerConfigT by passing in GinkgoT().

@paulburlumi
Copy link
Contributor

Thanks @dnephin. That's completely reasonable. I wasn't aware of the README comment.

I'll delay my sdk upgrade to v0.6.0 and wait for ginkgo to update GinkgoT() to support testing.TB.
Currently ginkgo is missing the Cleanup() func which is a shame (onsi/ginkgo#582).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/internal-cleanup Used to identify tech debt, testing improvements, code refactoring, and non-impactful optimization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants