panic when converting "go test -bench" output #413

pohly · 2024-06-14T13:26:32Z

In Kubernetes, we run go test -json and then invoke gotestsum to convert to JUnit.

In one particular job which only runs benchmarks, gotestsum panics. This can be reproduce with this stripped down result file (all output excluded):

$ cat >/tmp/junit.stdout <<EOF
{"Time":"2024-06-14T12:25:50.699286648Z","Action":"start","Package":"k8s.io/kubernetes/test/integration/scheduler_perf"}
{"Time":"2024-06-14T12:26:10.312830592Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingBasic/500Nodes"}
{"Time":"2024-06-14T12:26:38.221564195Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingPodAntiAffinity/500Nodes"}
{"Time":"2024-06-14T12:32:04.974837588Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingSecrets/500Nodes"}
{"Time":"2024-06-14T12:34:09.428169073Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingPodAffinity/500Nodes"}
{"Time":"2024-06-14T12:36:07.918651403Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingPreferredPodAntiAffinity/500Nodes"}
{"Time":"2024-06-14T12:36:07.918695493Z","Action":"fail","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingDaemonset/15000Nodes"}
{"Time":"2024-06-14T12:36:08.570166203Z","Action":"fail","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingDaemonset"}
{"Time":"2024-06-14T12:36:51.170763116Z","Action":"skip","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Test":"BenchmarkPerfScheduling/SchedulingNodeAffinity/500Nodes"}
{"Time":"2024-06-14T12:36:51.170777921Z","Action":"fail","Package":"k8s.io/kubernetes/test/integration/scheduler_perf","Elapsed":660.472}
EOF

gotestsum -- --junitfile /tmp/junit.xml --raw-command cat /tmp/junit.out

The version v1.12.0 fails:

   352:	func packageLine(event TestEvent, pkg *Package) string {
   353:		var buf strings.Builder
   354:		buf.WriteString(RelativePackagePath(event.Package))
   355:	
   356:		switch {
=> 357:		case pkg.cached:
   358:			buf.WriteString(" (cached)")
   359:		case event.Elapsed != 0:
   360:			d := elapsedDuration(event.Elapsed)
   361:			buf.WriteString(fmt.Sprintf(" (%s)", d))
   362:		}
(dlv) bt
 0  0x0000000000440d24 in runtime.fatalpanic
    at /nvme/gopath/go-1.22.3/src/runtime/panic.go:1217
 1  0x000000000043fd1d in runtime.gopanic
    at /nvme/gopath/go-1.22.3/src/runtime/panic.go:779
 2  0x000000000043efbe in runtime.panicmem
    at /nvme/gopath/go-1.22.3/src/runtime/panic.go:261
 3  0x0000000000458da5 in runtime.sigpanic
    at /nvme/gopath/go-1.22.3/src/runtime/signal_unix.go:881
 4  0x0000000000655180 in gotest.tools/gotestsum/testjson.packageLine
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/format.go:357
 5  0x0000000000655078 in gotest.tools/gotestsum/testjson.shortFormatPackageEvent.func1
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/format.go:330
 6  0x0000000000654c9a in gotest.tools/gotestsum/testjson.shortFormatPackageEvent
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/format.go:347
 7  0x000000000065373e in gotest.tools/gotestsum/testjson.pkgNameFormat.func1
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/format.go:240
 8  0x0000000000655c95 in gotest.tools/gotestsum/testjson.eventFormatterFunc.Format
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/format.go:413
 9  0x000000000066a295 in gotest.tools/gotestsum/cmd.(*eventHandler).Event
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/cmd/handler.go:48
10  0x000000000064edf5 in gotest.tools/gotestsum/testjson.ScanTestOutput
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/testjson/execution.go:730
11  0x000000000066ed65 in gotest.tools/gotestsum/cmd.run
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/cmd/main.go:284
12  0x000000000066ccf4 in gotest.tools/gotestsum/cmd.Run
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/cmd/main.go:42
13  0x00000000007deddc in main.route
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/main.go:37
14  0x00000000007de8cc in main.main
    at /nvme/gopath/pkg/mod/gotest.tools/gotestsum@v1.12.0/main.go:14
15  0x0000000000443092 in runtime.main
    at /nvme/gopath/go-1.22.3/src/runtime/proc.go:271
16  0x00000000004765e1 in runtime.goexit
    at /nvme/gopath/go-1.22.3/src/runtime/asm_amd64.s:1695
(dlv) p event
gotest.tools/gotestsum/testjson.TestEvent {
	Time: time.Time(0001-01-01T00:00:00Z){
		wall: 0,
		ext: 0,
		loc: *time.Location nil,},
	Action: "fail",
	Package: "",
	Test: "",
	Elapsed: -1,
	Output: "",
	raw: []uint8 len: 0, cap: 0, nil,
	RunID: 0,}

The text was updated successfully, but these errors were encountered:

pohly · 2024-06-14T13:41:31Z

Running with --format standard-quiet on the result file with output was a bit more informative:

*** Test killed with quit: ran too long (11m0s).
exit status 2

=== Failed
=== FAIL: k8s.io/kubernetes/test/integration/scheduler_perf BenchmarkPerfScheduling/SchedulingDaemonset/15000Nodes (0.00s)
--- FAIL: BenchmarkPerfScheduling/SchedulingDaemonset/15000Nodes

=== FAIL: k8s.io/kubernetes/test/integration/scheduler_perf BenchmarkPerfScheduling/SchedulingDaemonset (0.00s)
--- FAIL: BenchmarkPerfScheduling/SchedulingDaemonset
BenchmarkPerfScheduling/SchedulingNodeAffinity
BenchmarkPerfScheduling/SchedulingNodeAffinity/500Nodes
    scheduler_perf.go:742: disabled by label filter "performance"

=== FAIL:   (unknown)
panic: runtime error: invalid memory address or nil pointer dereference

So it looks like a timeout due to go test -timeout produces output that gotestsum doesn't handle.

pohly · 2024-06-15T07:58:53Z

I'm also seeing incorrect "failed" results for benchmarks which completed fine. Could be a bigger issue caused by the difference between unit tests and benchmarks...

dnephin · 2024-06-17T02:34:13Z

Thank you for the bug report!

It looks like the panic is because there is no run event for these benchmarks. I'm not sure why that is, or if that changed in a recent Go release.

The incorrect "failed" results for benchmarks are caused by this same problem. That's been a safeguard to prevent hiding failed test output due to previous test2json bugs.

#416 seems to fix the problem

pohly · 2024-09-10T06:08:17Z

That's been a safeguard to prevent hiding failed test output due to previous test2json bugs.

Is it possible to disable this safeguard? In Kubernetes, I recently changed how we invoke gotestsum and now one of our benchmark jobs is failing, apparently because of this.

pohly · 2024-09-11T10:06:56Z

Summarizing benchmarks worked with Go 1.19:

$ PATH=/nvme/gopath/go-1.19/bin:$PATH gotestsum --jsonfile /tmp/raw.json  -- -bench=BenchmarkDiscardLogInfoOneArg ./benchmark && cat /tmp/raw.json
∅  benchmark (1.193s)

DONE 0 tests in 1.478s
{"Time":"2024-09-11T12:02:34.000797216+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"goos: linux\n"}
{"Time":"2024-09-11T12:02:34.000966654+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"goarch: amd64\n"}
{"Time":"2024-09-11T12:02:34.000977706+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"pkg: github.com/go-logr/logr/benchmark\n"}
{"Time":"2024-09-11T12:02:34.000985507+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz\n"}
{"Time":"2024-09-11T12:02:34.000992851+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"BenchmarkDiscardLogInfoOneArg\n"}
{"Time":"2024-09-11T12:02:35.186994541+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"BenchmarkDiscardLogInfoOneArg-36    \t12116535\t        89.30 ns/op\n"}
{"Time":"2024-09-11T12:02:35.187053001+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"PASS\n"}
{"Time":"2024-09-11T12:02:35.188086017+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"ok  \tgithub.com/go-logr/logr/benchmark\t1.193s\n"}
{"Time":"2024-09-11T12:02:35.188121058+02:00","Action":"pass","Package":"github.com/go-logr/logr/benchmark","Elapsed":1.193}

It broke with 1.20:

$ PATH=/nvme/gopath/go-1.20.2/bin:$PATH gotestsum --jsonfile /tmp/raw.json  -- -bench=BenchmarkDiscardLogInfoOneArg ./benchmark && cat /tmp/raw.json
✓  benchmark (1.265s)

=== Failed
=== FAIL: benchmark BenchmarkDiscardLogInfoOneArg (unknown)
=== RUN   BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg-36    	12695464	        91.33 ns/op

DONE 1 tests, 1 failure in 5.620s
{"Time":"2024-09-11T12:03:43.407422644+02:00","Action":"start","Package":"github.com/go-logr/logr/benchmark"}
{"Time":"2024-09-11T12:03:43.412315359+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"goos: linux\n"}
{"Time":"2024-09-11T12:03:43.41235901+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"goarch: amd64\n"}
{"Time":"2024-09-11T12:03:43.412368425+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"pkg: github.com/go-logr/logr/benchmark\n"}
{"Time":"2024-09-11T12:03:43.412375551+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz\n"}
{"Time":"2024-09-11T12:03:43.412385223+02:00","Action":"run","Package":"github.com/go-logr/logr/benchmark","Test":"BenchmarkDiscardLogInfoOneArg"}
{"Time":"2024-09-11T12:03:43.412392388+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Test":"BenchmarkDiscardLogInfoOneArg","Output":"=== RUN   BenchmarkDiscardLogInfoOneArg\n"}
{"Time":"2024-09-11T12:03:43.412399638+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Test":"BenchmarkDiscardLogInfoOneArg","Output":"BenchmarkDiscardLogInfoOneArg\n"}
{"Time":"2024-09-11T12:03:44.671541518+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Test":"BenchmarkDiscardLogInfoOneArg","Output":"BenchmarkDiscardLogInfoOneArg-36    \t12695464\t        91.33 ns/op\n"}
{"Time":"2024-09-11T12:03:44.671596547+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"PASS\n"}
{"Time":"2024-09-11T12:03:44.672577312+02:00","Action":"output","Package":"github.com/go-logr/logr/benchmark","Output":"ok  \tgithub.com/go-logr/logr/benchmark\t1.265s\n"}
{"Time":"2024-09-11T12:03:44.67260672+02:00","Action":"pass","Package":"github.com/go-logr/logr/benchmark","Elapsed":1.2650000000000001}

The difference is the addition of the run line:

{"Time":"2024-09-11T12:03:43.412385223+02:00","Action":"run","Package":"github.com/go-logr/logr/benchmark","Test":"BenchmarkDiscardLogInfoOneArg"}

There's no corresponding pass. This may or may not be a bug in Go - is there a formal specification of the JSON output?

pohly · 2024-09-11T10:16:00Z

I've reported this upstream in golang/go#66825 (comment)

pohly · 2024-09-11T10:35:11Z

It looks like the panic is because there is no run event for these benchmarks. I'm not sure why that is, or if that changed in a recent Go release.

Not sure either. It looks like the opposite: Go 1.20 added such run events. I also cannot reproduce anymore under which circumstances it happens.

After changing how we use gotestsum in Kubernetes, k8s.io/kubernetes/test/integration/scheduler_perf now suffers from the false negative when run is present without a matching pass.

pohly · 2024-09-11T10:51:59Z

After reading through golang/go#40771 I am wondering about where this JSON output comes from. Is test2json some post-processing which converts text output to JSON?

Then it's not surprising that depending on how noisy the test output is, things occasionally break in weird ways.

I don't see the separate pkg/tool/linux_amd64/test2json binary running when using go test -json, but there is a -test.v=test2json for the test binary, so it looks like the code is built in, too?

dnephin · 2024-09-17T03:45:58Z

The output you shared in #413 (comment) is definitely an upstream bug. There's no pass event in there.

Because Go emits a "run" event for a benchmark, but no "passed" (golang/go#66825 (comment)), gotestsum treats the benchmark as failed (gotestyourself/gotestsum#413 (comment) -2172063128) because the events for the test are incomplete. Discussions on how to handle this in gotestsum haven't led to a conclusion and drag on. In the meantime, scheduler_perf jobs keep being treated as failed because of this. By identifying the false positives and removing them, prune-junit-xml can solve the situation for us.

Because Go does not a "pass" action for benchmarks (golang/go#66825 (comment)), gotestsum reports a successful benchmark run as failed (gotestyourself/gotestsum#413 (comment)). We can work around that in each benchmark and sub-benchmark by emitting the output line that `go test` expects on stdout from the test binary for success.

pohly · 2024-11-25T07:41:19Z

I ended up faking the "pass" in the affected benchmark by emitting the output that test2json turns into "pass":
pohly/kubernetes@9099f09

This is not a proper solution. I'm still not sure whether this needs to be fixed in Go or gotestsum is too strict about what it expects.

dnephin linked a pull request Jun 17, 2024 that will close this issue

Fix panic from missing root test #416

Open

alculquicondor mentioned this issue Jun 21, 2024

fix: skip isPodWorthRequeuing only when SchedulingGates gates the pod kubernetes/kubernetes#125527

Merged

pohly mentioned this issue Sep 10, 2024

[Failing Test] Strange ci-benchmark-scheduler-perf-master behavior kubernetes/kubernetes#127245

Closed

pohly mentioned this issue Sep 11, 2024

cmd/test2json: go test -bench -json -count output is inconsistent golang/go#66825

Open

pohly mentioned this issue Sep 11, 2024

tolerate run without pass/failed for benchmarks #438

Open

dnephin added the bug Something isn't working label Sep 13, 2024

pohly mentioned this issue Nov 8, 2024

WIP: prune-junit-xml: fix false positives for benchmarks kubernetes/kubernetes#128702

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

panic when converting "go test -bench" output #413

panic when converting "go test -bench" output #413

pohly commented Jun 14, 2024

pohly commented Jun 14, 2024

pohly commented Jun 15, 2024

dnephin commented Jun 17, 2024

pohly commented Sep 10, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

dnephin commented Sep 17, 2024

pohly commented Nov 25, 2024

panic when converting "go test -bench" output #413

panic when converting "go test -bench" output #413

Comments

pohly commented Jun 14, 2024

pohly commented Jun 14, 2024

pohly commented Jun 15, 2024

dnephin commented Jun 17, 2024

pohly commented Sep 10, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

pohly commented Sep 11, 2024

dnephin commented Sep 17, 2024

pohly commented Nov 25, 2024