-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli: add cockroach debug job-trace #65324
Conversation
The first two commits are from #65322. |
Excellent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, 2 of 2 files at r2, 4 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @pbardea)
pkg/cli/debug_job_trace.go, line 34 at r3 (raw file):
Args: cobra.MinimumNArgs(2), RunE: MaybeDecorateGRPCError( func(command *cobra.Command, args []string) error {
nit: please move this function to a named function below.
pkg/cli/debug_job_trace.go, line 65 at r3 (raw file):
} jobID, err := strconv.Atoi(jobIDStr)
This check is parameter validation, and should be done upfront in the command before starting to open connections etc.
pkg/cli/debug_job_trace.go, line 71 at r3 (raw file):
var jobQuery = fmt.Sprintf(`SELECT trace_id FROM crdb_internal.jobs WHERE job_id='%d'`, jobID) rows, err := sqlConn.Query(jobQuery, nil)
please pass the query inside sqlConn.Query
and use a placeholder to pass jobID.
pkg/cli/debug_job_trace.go, line 72 at r3 (raw file):
var jobQuery = fmt.Sprintf(`SELECT trace_id FROM crdb_internal.jobs WHERE job_id='%d'`, jobID) rows, err := sqlConn.Query(jobQuery, nil) if err != nil {
I think you need to call rows.Close()
.
Please extract the retrieval of the traceID into a separate function, and ensure rows.Close()
is called within that function.
pkg/cli/debug_job_trace.go, line 103 at r3 (raw file):
var resRows [][]string spanRows, err := sqlConn.Query(query, nil)
ditto calling spanRows.Close()
.
pkg/cli/debug_job_trace.go, line 108 at r3 (raw file):
} populateRow := func(vals []driver.Value) ([]string, error) {
I don't understand why you need this code -- what was wrong with calling runQueryAndFormatResults()
?
pkg/sql/crdb_internal.go, line 623 at r2 (raw file):
error STRING, coordinator_id INT, trace_id INT
nit: spacing is off
7a0e53c
to
d1b11e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @knz, and @pbardea)
pkg/cli/debug_job_trace.go, line 34 at r3 (raw file):
Previously, knz (kena) wrote…
nit: please move this function to a named function below.
Done.
pkg/cli/debug_job_trace.go, line 65 at r3 (raw file):
Previously, knz (kena) wrote…
This check is parameter validation, and should be done upfront in the command before starting to open connections etc.
Done.
pkg/cli/debug_job_trace.go, line 71 at r3 (raw file):
Previously, knz (kena) wrote…
please pass the query inside
sqlConn.Query
and use a placeholder to pass jobID.
Done.
pkg/cli/debug_job_trace.go, line 72 at r3 (raw file):
Previously, knz (kena) wrote…
I think you need to call
rows.Close()
.
Please extract the retrieval of the traceID into a separate function, and ensurerows.Close()
is called within that function.
Done.
pkg/cli/debug_job_trace.go, line 103 at r3 (raw file):
Previously, knz (kena) wrote…
ditto calling
spanRows.Close()
.
Deleted this code since it is no longer required.
pkg/cli/debug_job_trace.go, line 108 at r3 (raw file):
Previously, knz (kena) wrote…
I don't understand why you need this code -- what was wrong with calling
runQueryAndFormatResults()
?
doh, that's exactly what I want. Thanks for always pointing me to the right place 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good overall, but I'm missing the test code. Can you point me to it?
Reviewed 5 of 11 files at r4, 7 of 7 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @pbardea)
pkg/cli/debug_job_trace.go, line 72 at r3 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
Done.
nice 💯
pkg/cli/debug_job_trace.go, line 35 at r5 (raw file):
func runDebugJobTrace(_ *cobra.Command, args []string) error { jobID, err := strconv.Atoi(args[0])
Maybe strconv.ParseInt
here and pass it as int64 below. This avoids a cast.
pkg/sql/crdb_internal.go, line 623 at r2 (raw file):
Previously, knz (kena) wrote…
nit: spacing is off
still off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to tests
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru)
pkg/cli/debug_job_trace.go, line 104 at r5 (raw file):
} var inflightSpanQuery = `WITH spans AS(
nit: formatting of the query looks a bit of
628654a
to
faef8c3
Compare
I'm actually struggling a little while writing a test for this. I've added a WIP commit for the general structure but I'm running into two main issues:
Test is outputting: Local testing is outputting: The latter is what I would expect, any ideas why there is a difference in behavior @knz?
|
you need to use (The default value of |
We have other tests (e.g. for |
faef8c3
to
1df2401
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @pbardea)
pkg/cli/debug_job_trace.go, line 35 at r5 (raw file):
Previously, knz (kena) wrote…
Maybe
strconv.ParseInt
here and pass it as int64 below. This avoids a cast.
Done.
pkg/sql/crdb_internal.go, line 623 at r2 (raw file):
Previously, knz (kena) wrote…
still off
fixed the trace_id prefix indent. the col type indentation appears fine in my editor, but not in reviewable.
1df2401
to
076db55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code structure/style LGTM, but I'll let your team chime in on correctness.
Reviewed 2 of 3 files at r6, 5 of 5 files at r7.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @knz)
pkg/sql/crdb_internal.go, line 621 at r6 (raw file):
fraction_completed FLOAT, high_water_timestamp DECIMAL, error STRING,
While you're at it, perhaps you could put that STRING and the INT just below it on the same column as the rest.
076db55
to
e4dbca5
Compare
These only appear shifted in reviewable for some reason. Github and in my IDE the spacing is identical for all the columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @knz, and @pbardea)
pkg/sql/crdb_internal.go, line 621 at r6 (raw file):
Previously, knz (kena) wrote…
While you're at it, perhaps you could put that STRING and the INT just below it on the same column as the rest.
I'm realizing what the issue is: this vtable definition uses tabs, not spaces, in its schema.
all the other vtables uses spaces - this one is the exception.
So the improvement would be to convert this to use spaces like everything else: two spaces at the beginning, then spaces to align the columns.
e4dbca5
to
89170c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @pbardea)
pkg/sql/crdb_internal.go, line 621 at r6 (raw file):
Previously, knz (kena) wrote…
I'm realizing what the issue is: this vtable definition uses tabs, not spaces, in its schema.
all the other vtables uses spaces - this one is the exception.So the improvement would be to convert this to use spaces like everything else: two spaces at the beginning, then spaces to align the columns.
nice! all done 🙂
my previous review stands
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
8d78069
to
77cb68e
Compare
Had to fix up a few things to get the test passing:
|
77cb68e
to
2871d62
Compare
This change adds a new cli command - `cockroach debug job-trace` that can be used to grab information from inflight trace spans associated with a particular jobs execution. The command takes a `jobID` and file destination where the information will be dumped. It also requires a `--url` of the node from which to pull the inflight spans. In the future we will have a means of pulling inflight spans from the cluster, but at the moment this command must be run against every node to get a complete picture of job execution. Release note (cli change): Adds a `cockroach debug job-trace` command that takes 2 arguments: <jobID> and file destination, alongwith a `--url` pointing to the node on which to execute this command against. The command pulls information about inflight trace spans associated with the job and dumps it to the file destination.
2871d62
to
a7a8da0
Compare
Seeing an |
Going ahead and merging this. The bazel failure is unrelated and seems to have been occurring on most builds since ~3-4 days ago - TFTR! |
Build succeeded: |
This change adds a new cli command -
cockroach debug job-trace
that can be used to grab information from inflight trace spans
associated with the execution of a particular job.
The command takes a
jobID
and file destination where theinformation will be dumped. It also requires a
--url
of thenode from which to pull the inflight spans.
In the future we will have a means of pulling inflight spans from
the cluster, but at the moment this command must be run against every
node to get a complete picture of job execution.
Informs: #64992
Release note (cli change): Adds a
cockroach debug job-trace
commandthat takes 2 arguments: and file destination, along with a
--url
pointing to the node on which to execute this command.The command pulls information about inflight trace spans associated
with the job and dumps it to the file destination.