sdk: Remove TestWriter #7559

dnephin · 2020-03-31T19:56:12Z

Follow up to #7546

I went back and looked at #5344 to better understand the problem that TestWriter was solving.

I believe gotestsum, which we use in CI, solves this same problem without having to wrap the logger in the test suite. The short, short-verbose and short-with-failures formats handle only printing logs for the tests that have failed.

The TestWriter solution seems like it comes with a couple of disadvantages.

The first can be seen from the implementation. Loggers can outlive the lifetime of the testing.T instance, which causes panics.

The second is that it produces log messages which contain the test name twice. This results in log messages that are very long and hard to read. (#7546).

sdk/README/md suggests that we don't make any promises about backwards compatibility for the sdk, so I removed the unused functions.

The go toolchain already handles grouping test output by test. The TestWriter, generally used to send logs through t.Log causes a couple problems. The first can be seen from the implementation. Loggers can outlive the lifetime of the testing.T instance, which causes panics. The second is that it produces log messages which contain the test name twice. This results in log messages that are very long and hard to read. Sending logs to stdout should make it easier to read the test failure messages.

rboyer · 2020-03-31T20:23:27Z

Has go test (not gotestsum) somehow changed in the intervening versions since I added this so that Stdout/Stderr is somehow not all mixed together when you have dozens of parallel test cases running locally (during something like make test)?

Previously the problem I had was that if anything failed during make test you'd get a sea of log messages from other totally a-ok test cases interleaved into the logs for the test case that failed which made the failing test case completely useless.

If this is only true when using gotestsum I think we should keep the TestWriter shim in here so that regular go test usage locally doesn't have a regressed experience.

dnephin · 2020-03-31T20:40:00Z

go test works the same as it did. Why not use gotestsum locally? It was designed for this use case as well.

Sending all the logs to t.Log makes it very difficult to find the failures. The failure message is prefixed the same way as the log output, which hides the failure message in dozens of lines of log output. t.Log adds the file and line number, which for logs is not useful because its always the line from testWriter.

We could have the makefile install and run gotestsum is that helps.

mkeeler · 2020-03-31T20:44:18Z

Personally, I normally just use go test <arguments>. This would definitely break test output correlation for me.

dnephin · 2020-03-31T20:44:39Z

I think there is another problem with t.Log that I haven't been able to fully track down. An example is in this output: https://app.circleci.com/pipelines/github/hashicorp/consul/7995/workflows/9c8b6a9a-106c-4d45-8692-8bca89932ef4/jobs/144872

The failing test is TestHealthServiceChecks_NodeMetaFilter, but we can see logs for TestSnapshot_Options/GET mixed in.

It's possible this is a bug in test2json

dnephin · 2020-04-15T21:56:14Z

I went looking more into this problem, and it seems that in go1.14 they made some changes to support log streaming which made the problem of mixed up test output worse. Output from goroutines is not going to be attributed to the correct test when sent to test2json.

I'm going to experiment with a solution that buffers the logs for each test and prints them at the end if the test fails. The go tool already buffers this output, so I imagine it should be fine to keep all of the logs in memory for the lifetime of a test.

Replaces #7559 See golang/go#38458 and golang/go#38382 (comment) Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: golang/go#38458 (may be fixed in go1.15) golang/go#38382 (comment)

dnephin · 2020-05-21T23:45:42Z

Replaced by #7948

Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. The previous solution did not address this problem because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long and contained the test name twice. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: golang/go#38458 (may be fixed in go1.15) golang/go#38382 (comment)

Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also cause data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: golang/go#38458 (may be fixed in go1.15) golang/go#38382 (comment)

paulburlumi · 2020-08-12T14:14:14Z

When upgrading the sdk from v0.5.0 to v0.6.0, commit 51efba2 caused our ginkgo unit tests to fail.

We have the following code set up to run before the test suite:

var _ = BeforeSuite(func() {
	ts, err := testutil.NewTestServerConfigT(nil, nil)
	Expect(err).ToNot(HaveOccurred())
	consulTestServer = ts
	Expect(consulTestServer).ToNot(BeNil())
})

This code works with sdk v0.5.0 and causes the following stack trace with sdk v0.6.0:

  Test Panicked
  runtime error: invalid memory address or nil pointer dereference
  /usr/local/go/src/runtime/panic.go:212

  Full Stack Trace
  github.com/hashicorp/consul/sdk/testutil.NewLogBuffer(0x0, 0x0, 0x6, 0x8)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/testlog.go:38 +0xba
  github.com/hashicorp/consul/sdk/testutil.defaultServerConfig(0x0, 0x0, 0x0)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/server.go:143 +0xc5
  github.com/hashicorp/consul/sdk/testutil.NewTestServerConfigT(0x0, 0x0, 0x0, 0x1, 0x1, 0xc000380060)
        /go/pkg/mod/github.com/hashicorp/consul/sdk@v0.6.0/testutil/server.go:239 +0x165

It looks like testutil.NewTestServerConfigT used to allow the first parameter to be nil (in fact it still checks this).

However t is now always accessed without checking for nil in NewLogBuffer.

This looks like a regression on the public API for the testutil package.

dnephin · 2020-08-12T14:28:40Z

The README for the sdk says:

Please note that this folder, while public, is not meant for new consumers of these libraries; this should currently be considered an internal, not external, SDK. It is public due to existing needs from other HashiCorp software. The tags in this folder will stay at the 0.x.y level; accordingly users should expect that things can move around, disappear, or change API at any time.

I think it is expected that a test helper requires a testing.T for logging, t.Cleanup, and checking t.Failed.

You should be able to continue to use NewTestServerConfigT by passing in GinkgoT().

paulburlumi · 2020-08-12T14:57:17Z

Thanks @dnephin. That's completely reasonable. I wasn't aware of the README comment.

I'll delay my sdk upgrade to v0.6.0 and wait for ginkgo to update GinkgoT() to support testing.TB.
Currently ginkgo is missing the Cleanup() func which is a shame (onsi/ginkgo#582).

dnephin added the theme/internal-cleanup Used to identify tech debt, testing improvements, code refactoring, and non-impactful optimization label Mar 31, 2020

dnephin marked this pull request as draft April 16, 2020 23:53

dnephin mentioned this pull request May 21, 2020

testutil: NewLogBuffer - buffer logs until a test fails #7948

Merged

dnephin closed this May 21, 2020

dnephin deleted the dnephin/remove-test-writer branch May 21, 2020 23:45

paulburlumi mentioned this pull request Sep 10, 2020

testutil: use TestingTB instead of testing.TB #8647

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sdk: Remove TestWriter #7559

sdk: Remove TestWriter #7559

dnephin commented Mar 31, 2020

rboyer commented Mar 31, 2020 •

edited

Loading

dnephin commented Mar 31, 2020

mkeeler commented Mar 31, 2020

dnephin commented Mar 31, 2020

dnephin commented Apr 15, 2020

dnephin commented May 21, 2020

paulburlumi commented Aug 12, 2020

dnephin commented Aug 12, 2020

paulburlumi commented Aug 12, 2020

sdk: Remove TestWriter #7559

sdk: Remove TestWriter #7559

Conversation

dnephin commented Mar 31, 2020

rboyer commented Mar 31, 2020 • edited Loading

dnephin commented Mar 31, 2020

mkeeler commented Mar 31, 2020

dnephin commented Mar 31, 2020

dnephin commented Apr 15, 2020

dnephin commented May 21, 2020

paulburlumi commented Aug 12, 2020

dnephin commented Aug 12, 2020

paulburlumi commented Aug 12, 2020

rboyer commented Mar 31, 2020 •

edited

Loading