ui,jobs: improve jobs overview page in DBConsole #68179

Closed
vy-ton opened this issue Jul 28, 2021 · 39 comments · Fixed by #72291
Labels
A-jobs; A-sql-console-general (SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console); A-webui-jobs-and-events; C-enhancement (Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception))

Comments

@vy-ton
Contributor

vy-ton commented Jul 28, 2021

In #44594, Schema is adding a retry mechanism to the jobs infrastructure. During this work, we plan to add additional columns to crdb_internal.jobs to surface more job metrics.

These new metrics would be helpful in the DBConsole Jobs Overview page.

  • last execution time
  • execution count
  • updated status to indicate Retrying

FYI @ajwerner @sajjadrizvi

Epic CRDB-7912

@vy-ton vy-ton added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-console-general SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console labels Jul 28, 2021
@kevin-v-ngo

Hi @vy-ton, @maryliag and I were wondering when the additional job metrics will be available so that we can coordinate the UX changes. Do you have an ETA?

@ajwerner
Contributor

ajwerner commented Aug 9, 2021

cc @sajjadrizvi

@sajjadrizvi

> Hi @vy-ton, @maryliag and I were wondering when the additional job metrics will be available so that we can coordinate the UX changes. Do you have an ETA?

I am still working on adding the exponential backoff that is a prerequisite for the job metrics. I expect it will take another day or two to complete. After that, I can start adding the metrics required for observability, so I think the ETA is the end of next week.

@sajjadrizvi

I am thinking about the following.

We add three columns (last execution time, next execution time, and number of times executed) that provide quick insight into the job and its current status.

In addition, we add a column that provides insights about each job's lifecycle. A user should be able to see that a job has transitioned from state_x to state_y at time_t due to error err. Moreover, what will happen next to the job? If it is going to be retried, when will it be?

To achieve that, we can add a repeated structure to the jobs proto that keeps track of job history:

ExecutionLog struct {
  state
  execution_time
  execution_error
}

Each time a job runs stepThroughStateMachine, it creates this structure and adds it to the job's payload after populating the state and execution_time fields. When the job finishes its current execution (before leaving stepThroughStateMachine), it populates the execution_error field with the last error.

This structure can be used to populate a job status log column that shows the list of those structures in reverse chronological order, in the form "Job attempt at execution_time with status state finished with error execution_error." If there was no error, the error part can be skipped. If the job will be retried, we should add "The job will be retried at time." Maybe we want to add next_state as well, if it doesn't add too much complication.

A question is, how many log entries to show to a user? I think we should show only one log entry by default and provide an option to display N entries in reverse chronological order.
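The proposal above can be sketched roughly as follows. This is only an illustration, not the jobs proto or the eventual column format: the class `ExecutionLogEntry` and the function `render_status_log` are hypothetical names, and the message wording simply follows the template quoted in the comment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ExecutionLogEntry:
    # Field names mirror the proposed ExecutionLog structure.
    state: str
    execution_time: datetime
    execution_error: Optional[str] = None  # None means the run had no error


def render_status_log(entries: List[ExecutionLogEntry], limit: int = 1) -> List[str]:
    """Return up to `limit` messages, newest execution first."""
    newest_first = sorted(entries, key=lambda e: e.execution_time, reverse=True)
    messages = []
    for entry in newest_first[:limit]:
        msg = (f"Job attempt at {entry.execution_time.isoformat()} "
               f"with status {entry.state} finished")
        # The error part is skipped when there was no error.
        if entry.execution_error is not None:
            msg += f" with error {entry.execution_error}"
        messages.append(msg + ".")
    return messages
```

With the default `limit=1`, only the most recent entry is shown; passing a larger `limit` returns the last N entries in reverse chronological order.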

@ajwerner
Contributor

Can you be more concrete as to when these entries are actually written? I'm having a hard time understanding what happens before Resume/OnFailOrCancel and what happens after in terms of writes to the jobs table.

Also, let's isolate somewhat the discussion about what values we write to system.jobs from the implied changes to crdb_internal.jobs. The latter is a function of the former, so they are related.

@sajjadrizvi

Currently, a resumer runs in the running and reverting states. In the running state, we add this structure with a nil execution_error before resuming. The execution_error field is then populated here if there is an error.

In the reverting state, we should add the structure with a nil error before reverting. The error is then populated after onFailOrCancel.

In the cases of StatusSucceeded, StatusFailed, and StatusCanceled, we do not run the resumer, so we don't have to add those log entries.

@ajwerner
Contributor

That sounds good. In addition, I think it'd be nice to have the coordinator ID. It'd be good to have an invariant that when we write last_run then we populate an entry in the log.

@sajjadrizvi

sajjadrizvi commented Aug 16, 2021

> In addition, I think it'd be nice to have the coordinator ID

Great, that's a good suggestion.

> It'd be good to have an invariant that when we write last_run then we populate an entry in the log.

That's precisely what we want.

@sajjadrizvi

> That's precisely what we want.

Maybe not! There is an exception: we update last_run in servePauseAndCancel as well, and I think we should not add an entry at that time. So the invariant could be to populate an entry whenever we run a resumer.
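As a rough illustration of that invariant, the sketch below appends a log entry only when a resumer actually runs, and fills in the error afterwards. It is not the jobs registry code: `Job`, `LogEntry`, `step`, and the use of exceptions to signal resumer errors are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# States in which a resumer runs, per the discussion above.
RESUMER_STATES = ("running", "reverting")


@dataclass
class LogEntry:
    state: str
    coordinator_id: int
    execution_time: float
    execution_error: Optional[str] = None


@dataclass
class Job:
    status: str
    execution_log: List[LogEntry] = field(default_factory=list)


def step(job: Job, coordinator_id: int, now: float,
         resumer: Callable[[Job], None]) -> None:
    """Run one step; log only when a resumer actually executes."""
    if job.status not in RESUMER_STATES:
        # Succeeded, failed, and canceled jobs run no resumer,
        # so no entry is added for them.
        return
    entry = LogEntry(job.status, coordinator_id, now)
    job.execution_log.append(entry)  # written before the resumer runs
    try:
        resumer(job)
    except Exception as err:  # assumption: resumer errors surface as exceptions
        entry.execution_error = str(err)
```

Paths such as pause or cancel handling would update `last_run` without calling `step`, which is why they leave no entry in this sketch.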

@sajjadrizvi

sajjadrizvi commented Aug 17, 2021

Another question is how to add the transition logs to the jobs table. Off the top of my head, I am thinking of adding a transition_logs column. Each row then presents the logs as a string:

Log number: 1
Coordinator ID: node_id
Execution time: ts
Execution state: status
Execution error: nil or decoded error string
Log number: 2
...

An alternative is to have a separate table with jobID and four columns for each log entry. I think that's not the best way to go.

@ajwerner
Contributor

Firstly, when you say jobs table, please indicate which jobs table.

I don't think separate rows are going to lead to a happy outcome. My guess is that the right answer is to use JSON.

@sajjadrizvi

OK, I meant crdb_internal.jobs. Sorry for the confusion.

@sajjadrizvi

Also, I wanted to say that a row has a string that contains the list of the entries. JSON seems appropriate here.
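A JSON encoding of the entries listed above might look like the sketch below. The key names are assumptions derived from the fields named in the comment (coordinator ID, execution time, state, error); the actual column format is not specified in this thread.

```python
import json
from typing import Any, Dict, List


def encode_log(entries: List[Dict[str, Any]]) -> str:
    """Encode log entries as a single JSON array, one object per execution."""
    normalized = [
        {
            "coordinator_id": e["coordinator_id"],
            "execution_time": e["execution_time"],
            "state": e["state"],
            # A missing error is emitted as JSON null.
            "execution_error": e.get("execution_error"),
        }
        for e in entries
    ]
    return json.dumps(normalized)
```

Keeping the whole history in one JSON value lets a single row carry any number of entries, which avoids the separate-table alternative mentioned above.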

sajjadrizvi pushed a commit to sajjadrizvi/cockroach that referenced this issue Aug 17, 2021
This commit adds transition_logs columns in crdb_internal.jobs
table. Moreover, it adds tests to validate the correctness of
values accessed through crdb_internal.jobs.

Release note: None

Fixes: cockroachdb#68179
sajjadrizvi pushed a commit to sajjadrizvi/cockroach that referenced this issue Aug 18, 2021
This commit adds transition_logs column in crdb_internal.jobs
table. Moreover, it adds tests to validate the correctness of
values accessed through crdb_internal.jobs.

Release note: None

Fixes: cockroachdb#68179
@kevin-v-ngo

Synced with Andrew and for this issue, we'll track adding the 3 metrics (LAST EXECUTION TIME, NEXT EXECUTION TIME, and EXECUTION COUNT) to the jobs overview page.

For error details, we'll track it with the following issue: #69170.

CC'ing @Annebirzin @vy-ton as FYI

sajjadrizvi pushed a commit to sajjadrizvi/cockroach that referenced this issue Aug 20, 2021
sajjadrizvi pushed a commit to sajjadrizvi/cockroach that referenced this issue Aug 22, 2021
craig bot pushed a commit that referenced this issue Aug 24, 2021
68995: sql: add columns in jobs virtual table for overview in DBConsole r=ajwerner a=sajjadrizvi

This commit adds new columns in `crdb_internal.jobs` table, which
show the current exponential-backoff state of a job and its execution
history.

Release justification: This commit adds low-risk updates to new
functionality. Jobs subsystem now supports job retries with
exponential-backoff. We want to give users more insights
about the backoff state of jobs and jobs' lifecycles through
additional columns in `crdb_internal.jobs` table.

Release note (general change): The functionality to retry failed
jobs with exponential-backoff was recently introduced in the system.
This commit adds new columns in `crdb_internal.jobs` table, which
show the current backoff-state of a job and its execution log. The
execution log consists of a sequence of job start and end events
and any associated errors that were encountered during each
execution of the job. Users can now query the internal jobs table to get
more insights about jobs through the following columns: (a) `last_run`
shows the last execution time of a job, (b) `next_run` shows the
next execution time of a job based on exponential-backoff delay,
(c) `num_runs` shows the number of times the job has been executed,
and (d) `execution_log` provides a set of events that are generated
when a job starts and ends its execution.

Relates to #68179

69044: storageccl: remove non-ReturnSST ExportRequest r=dt a=dt

Release justification: bug fix in new functionality.

Release note: none.

69239: roachtest: move roachtest stress CI job instructions to README r=tbg,stevendanna a=erikgrinaker

Release justification: non-production code changes
Release note: None

69285: roachtest: increase consistency check timeout, and ignore errors r=tbg a=erikgrinaker

This bumps the consistency check timeout to 5 minutes. There are
indications that a recent libpq upgrade unmasked previously ignored
context cancellation errors, caused by the timeout here being too low.
It also ignores errors during the consistency check, since it is
best-effort anyway.

Resolves #68883.

Release justification: non-production code changes
Release note: None

Co-authored-by: Sajjad Rizvi <sajjad@cockroachlabs.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
@jocrl
Contributor

jocrl commented Oct 26, 2021

Filed https://github.com/cockroachdb/ui/issues/395 because the Tooltip component is mis-centered for inline elements (in this case, the status badge), and because the solution is more involved than I had thought.

edit: The new design has a pretty full table cell, so I centered the tooltip on the entire cell, and the bug no longer blocks this issue.

@jocrl
Contributor

jocrl commented Oct 26, 2021

@Annebirzin, just want to re-ping you about how we should indicate that a job is running/reverting, but also retrying? I like Marylia's suggestion about the hyphenated statuses, though that raises the question of what to do about the % and time-remaining visualization that would usually show on running jobs.

The definition for a job that is retrying is `status IN ('running', 'reverting') AND next_run > now() AND num_runs > 1` (i.e. it will be both running and retrying, or reverting and retrying).

image

@ajwerner , assuming you have no objections, I'm going to modify the endpoint as Marylia suggested (#68179 (comment)) to send retry-running and retry-reverting in the endpoint status field.
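The retrying definition and the proposed endpoint status values can be sketched as below. This is an illustration in Python rather than the endpoint's actual code: the function names are hypothetical, and treating missing `next_run` or `num_runs` values as "not retrying" is an assumption that mirrors how SQL comparisons against NULL are not true.

```python
from datetime import datetime, timezone
from typing import Optional


def is_retrying(status: str, next_run: Optional[datetime],
                num_runs: Optional[int], now: datetime) -> bool:
    """status IN ('running', 'reverting') AND next_run > now() AND num_runs > 1."""
    return (
        status in ("running", "reverting")
        and next_run is not None and next_run > now
        and num_runs is not None and num_runs > 1
    )


def endpoint_status(status: str, next_run: Optional[datetime],
                    num_runs: Optional[int], now: datetime) -> str:
    """Return retry-running / retry-reverting for retrying jobs, else the status."""
    if is_retrying(status, next_run, num_runs, now):
        return "retry-" + status
    return status
```

A job on its first run, or one whose `next_run` is already in the past, keeps its plain `running` or `reverting` status.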

@Annebirzin

@jocrl I wonder if for retrying-running instead of showing the status badge, we show the % bar with a 'retrying' label before the time remaining? (Figma design)

Screen Shot 2021-10-26 at 6 08 07 PM

Does that make sense?

@jocrl
Contributor

jocrl commented Oct 27, 2021

@Annebirzin That makes sense! For consistency, do you think retry-running should be something like the lower two rows? Or is just the first one good?
image

@Annebirzin

@jocrl I do like the last row where they look like two separate badges. Maybe just a bit of space between them and remove the :

@jocrl
Contributor

jocrl commented Oct 28, 2021

@ajwerner if num_runs is null in crdb_internal.jobs, should it show in the UI as 0, or just blank? (The endpoint defaults to 0 without further intervention.)

jocrl added a commit to jocrl/cockroach that referenced this issue Nov 4, 2021
… in the DBConsole Jobs Overview page

Fixes cockroachdb#68179

[wip] tests are still work in progress. Just wanted to get people's
thoughts in the meantime! Only tests are missing

This commit surfaces the status `reverting`, annotates existing `running` and
`reverting` statuses in the UI with "retrying" where applicable, and adds the "Last
Execution Time (UTC)" and "Execution Count" columns to the jobs overview table
in db console. "Retrying" is defined as `status IN ('running', 'reverting') AND
next_run > now() AND num_runs > 1`.

Hovering a retrying status shows the next execution time. The "Status" column
was also moved left to the second column. Filtering using the dropdown by
`Status: Running` or `Status: Reverting` will include those that are
also "retrying". Users can also filter by `Status: Retrying`.

The `/jobs` endpoint was modified to add the `last_run`, `next_run`, and
`num_runs` fields required for the UI change. Jobs with status `running` or
`reverting` that are also "retrying" have their statuses sent as `retry-running`
and `retry-reverting` respectively. The endpoint was also modified to support
the value `retrying` for the `status` query parameter.

This commit also adds a storybook story for the jobs table, which showcases the
different possible statuses in permutations of information that could be
present for the `running` status.

Release note (ui change): The jobs overview table in DBConsole now shows when
jobs have the status "reverting", and shows the badge "retrying" when running
or reverting jobs are also retrying. Hovering the status for a "retrying" job
will show the "Next execution time" in UTC. Two new columns, "Last Execution
Time (UTC)" and "Execution Count", were also added to the jobs overview table
in DBConsole, and the "Status" column was moved left to the second column in
the table.

The `status` query parameter in the `/jobs` endpoint now supports the values
`reverting` and `retrying`.
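The filtering behaviour described in this commit message can be sketched as follows. This is a hedged illustration, not the endpoint implementation: `matches_status_filter` is a hypothetical helper, and treating an empty `requested` value as "no filter" is an assumption.

```python
def matches_status_filter(job_status: str, requested: str) -> bool:
    """Decide whether a job's endpoint status passes the `status` filter."""
    if requested == "":
        # Assumption: an empty value means no filtering.
        return True
    if requested == "retrying":
        return job_status in ("retry-running", "retry-reverting")
    if requested in ("running", "reverting"):
        # Running/reverting filters also include jobs that are retrying.
        return job_status in (requested, "retry-" + requested)
    return job_status == requested
```

For example, filtering by `running` matches both `running` and `retry-running`, while filtering by `retrying` matches only the two `retry-` statuses.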
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 10, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 11, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 11, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 11, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 29, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Nov 30, 2021
craig bot pushed a commit that referenced this issue Dec 8, 2021
72291: ui/db-console: surface more job metrics around reverting and retrying in the DBConsole Jobs Overview page r=jocrl a=jocrl

Fixes #68179

This commit surfaces the status `reverting`, annotates existing `running` and
`reverting` statuses in the UI with "retrying" where applicable, and adds the "Last
Execution Time (UTC)" and "Execution Count" columns to the jobs overview table
in db console. "Retrying" is defined as `status IN ('running', 'reverting') AND
next_run > now() AND num_runs > 1`.

Hovering a retrying status shows the next execution time. The "Status" column
was also moved left to the second column. Filtering using the dropdown by
`Status: Running` or `Status: Reverting` will include those that are
also "retrying". Users can also filter by `Status: Retrying`.

The `/jobs` endpoint was modified to add the `last_run`, `next_run`, and
`num_runs` fields required for the UI change. Jobs with status `running` or
`reverting` that are also "retrying" have their statuses sent as `retry-running`
and `retry-reverting` respectively. The endpoint was also modified to support
the value `retrying` for the `status` query parameter.

This commit also adds a storybook story for the jobs table, which showcases the
different possible statuses in permutations of information that could be
present for the `running` status.

Release note (ui change): The jobs overview table in DBConsole now shows when
jobs have the status "reverting", and shows the badge "retrying" when running
or reverting jobs are also retrying. Hovering the status for a "retrying" job
will show the "Next execution time" in UTC. Two new columns, "Last Execution
Time (UTC)" and "Execution Count", were also added to the jobs overview table
in DBConsole, and the "Status" column was moved left to the second column in
the table.

The `status` query parameter in the `/jobs` endpoint now supports the values
`reverting` and `retrying`.

Jobs table:
<img width="1602" alt="image" src="https://user-images.githubusercontent.com/91907326/141374430-bfad72de-aa2d-4cbb-98ef-62ddf5f98f4a.png">


Filter and hover:
https://user-images.githubusercontent.com/91907326/141375153-2cf2641a-33a1-4bfb-a900-a187dc5579a1.mov

Permutations of running jobs with present/absent combinations of time remaining, running message, or retrying:
<img width="979" alt="image" src="https://user-images.githubusercontent.com/91907326/141374527-124a86c0-d10d-451f-b8dc-f745d52fe6d4.png">


Co-authored-by: Josephine Lee <josephine@cockroachlabs.com>
@craig craig bot closed this as completed in f0d0467 Dec 8, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Dec 8, 2021
jocrl added a commit to jocrl/cockroach that referenced this issue Dec 10, 2021