schedules: Schedule Control #51600

Closed
miretskiy opened this issue Jul 20, 2020 · 11 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@miretskiy
Contributor

Implement schedule control statements.

Informs #28266

Similar to how operators can control jobs (PAUSE JOB, RESUME JOB), we need a way to
control schedules.

Schedule control statements look like:
<COMMAND> SCHEDULE(s) <schedule_selector> WITH <job_command> JOBS

COMMAND: one of PAUSE or RESUME.
schedule_selector: same as the jobs selector -- either a list of schedule IDs, or a SELECT statement returning schedule IDs.
WITH <job_command>: specifies the command to apply to the jobs managed by this schedule.

PAUSE SCHEDULE 213 WITH CANCEL JOBS
RESUME SCHEDULES SELECT schedule_id FROM system.scheduled_jobs WHERE owner='blah'
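
To make the proposed shape concrete, here is a minimal Python sketch that splits such a statement into its three parts. It is only an illustration and not CockroachDB's actual parser: the names `parse_schedule_control` and `ScheduleControl` are hypothetical, the set of allowed job commands (PAUSE, RESUME, CANCEL) is an assumption borrowed from job control, and the WITH clause is treated as optional, as clarified later in this thread.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical grammar sketch for:
#   <COMMAND> SCHEDULE(S) <schedule_selector> [WITH <job_command> JOBS]
# Assumption: job_command is one of PAUSE, RESUME, CANCEL.
_STMT = re.compile(
    r"""^\s*(?P<command>PAUSE|RESUME)\s+
        SCHEDULES?\s+
        (?P<selector>.+?)
        (?:\s+WITH\s+(?P<job_command>PAUSE|RESUME|CANCEL)\s+JOBS)?
        \s*;?\s*$""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


class ScheduleControl(NamedTuple):
    command: str                # PAUSE or RESUME
    selector: str               # schedule id list or SELECT text, unparsed
    job_command: Optional[str]  # None means "leave running jobs alone"


def parse_schedule_control(sql: str) -> ScheduleControl:
    """Split a schedule control statement into command, selector, job command."""
    m = _STMT.match(sql)
    if m is None:
        raise ValueError(f"not a schedule control statement: {sql!r}")
    job_command = m.group("job_command")
    return ScheduleControl(
        command=m.group("command").upper(),
        selector=m.group("selector").strip(),
        job_command=job_command.upper() if job_command else None,
    )
```

Calling `parse_schedule_control("PAUSE SCHEDULE 213 WITH CANCEL JOBS")` yields the command `PAUSE`, the selector `213`, and the job command `CANCEL`. The sketch leaves the selector as raw text, since validating it would require a real SQL parser.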

It's not clear (yet) if we should support CANCEL SCHEDULES. Is cancel == delete? Then, just delete the schedule.
On the other hand, what if we want to cancel jobs before we delete the schedule? @dt @mwang1026

@miretskiy miretskiy added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Jul 20, 2020
@miretskiy miretskiy self-assigned this Jul 20, 2020
@mwang1026

Is the WITH <job_command> part mandatory? If not, what is the default?

If we don't support CANCEL SCHEDULES how would one "remove" the schedule?

Do we have a row TTL on the schedules table, à la the jobs table?

@miretskiy
Contributor Author

No, WITH <job_command> is optional, and the default is to do nothing. If there are running jobs, they will continue running.

Removal can be done simply with DELETE FROM.

There is no TTL on schedules.

@mwang1026

Makes sense re: default -- we just need to know what to doc.

re: DELETE FROM, I don't think we should expect users to muck around with system tables; we should have a SQL command for this. For example, we don't ask people to delete from the jobs table today. Thoughts?

@miretskiy
Contributor Author

I don't know ... re delete.
I'm looking at e.g. PAUSE JOBS. Users are expected to specify a regular
SQL query -- we don't offer syntactic sugar for finding jobs to pause.

PAUSE JOBS SELECT job_id FROM [SHOW JOBS WHERE]....
Why should we provide syntactic sugar for delete then?

Or, alternatively, do we want a more restrictive API where the users can only work on 1 schedule at a time and they have to specify schedule id?

@dt
Member

dt commented Jul 21, 2020

I think we should offer DROP SCHEDULE or DELETE SCHEDULE (I prefer DROP for symmetry with CREATE) -- we shouldn't encourage users to mutate system tables directly.

@miretskiy
Contributor Author

I was just chatting re DROP SCHEDULE w/ @mwang1026 .

Should we offer DROP SCHEDULES as well, as in DROP SCHEDULES SELECT ...?
On one hand -- yes, there is a symmetry.

On the other hand -- looking at job control, these statements return the number of rows affected (not the actual rows) -- which
makes it a bit hard to know if what you're pausing/cancelling is really what you meant to pause/cancel.

@miretskiy
Contributor Author

Yes, I plan on supporting the same thing as we support for job control.
So, DROP SCHEDULES SELECT ... is supported.

@miretskiy
Contributor Author

Yeah -- I wish job control didn't do it and returned actual rows.

However, if you think of the job control statements as UPDATE statements of sorts, then it makes sense that we only return
the number of rows affected.
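
The trade-off can be illustrated with a small, self-contained Python sketch using SQLite. Everything here is an assumption made for illustration: the toy `scheduled_jobs` table and its `paused` column do not reflect the real layout of `system.scheduled_jobs`, and `pause_schedules` is a hypothetical helper. It shows that an UPDATE-style statement reports only a count, so the caller has to run the selector separately to see which schedules were actually touched.

```python
import sqlite3

# Toy stand-in for a schedules table; NOT the real system.scheduled_jobs schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scheduled_jobs ("
    " schedule_id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " paused INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO scheduled_jobs (schedule_id, owner) VALUES (?, ?)",
    [(1, "blah"), (2, "blah"), (3, "other")],
)


def pause_schedules(conn: sqlite3.Connection, owner: str):
    """Pause every unpaused schedule of `owner`.

    Returns (rows_affected, schedule_ids). The ids come from a separate
    SELECT, because the UPDATE itself only reports how many rows changed.
    """
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT schedule_id FROM scheduled_jobs"
            " WHERE owner = ? AND paused = 0 ORDER BY schedule_id",
            (owner,),
        )
    ]
    cur = conn.execute(
        "UPDATE scheduled_jobs SET paused = 1 WHERE owner = ? AND paused = 0",
        (owner,),
    )
    return cur.rowcount, ids
```

With only the count, a caller cannot tell whether the statement paused the schedules they had in mind; returning the ids alongside it is what "returning actual rows" would provide.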

@miretskiy
Contributor Author

I'm going to hold off on implementing the WITH modifiers. There are some issues wrt statement semantics
(what's the meaning of the number of rows modified?). Ideally, we would return rows from these statements, but that
was also proving to be more challenging.

For now -- going to support basic PAUSE/RESUME/DROP only.

@mwang1026

IMO I think that's sufficient. Users can cancel jobs on their own if need be for v1. IMO there's higher impact work elsewhere to do rn. Thanks for the hard work / thinking here.

craig bot pushed a commit that referenced this issue Jul 27, 2020
51562:  backupccl: add RestoreData processor r=dt a=pbardea

This commit adds a processor which actually performs the ImportRequest.
It has an input which accepts rows with 2 columns that should be sent
from SplitAndScatter processors. Each row represents one span that the
processor should ingest. The intention is that the spans directed to a
processor on a given node have their leaseholder colocated on the same
node (this work is done in the SplitAndScatter processor). All that
remains is to send a request to ingest the data and stream back its
progress to the coordinator upon completion.

Part of #40239.

Release note: None

51896: bulkio: Add schedule control statements. r=miretskiy a=miretskiy

Informs #51600 

Introduce schedule control statements responsible for
managing scheduled jobs.

```
PAUSE SCHEDULE 123
PAUSE SCHEDULES SELECT ...

RESUME SCHEDULES SELECT schedule_id FROM system.scheduled_jobs ...

DROP SCHEDULE 123
```

Release Notes (enterprise): Implement schedule control statements
to pause, resume, or delete scheduled jobs.

Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@mwang1026

Also encapsulates jobs control FOR SCHEDULE

example: PAUSE JOB FOR SCHEDULE xyz

craig bot pushed a commit that referenced this issue Jul 31, 2020
52038: jobs: Implement job control for schedules. r=miretskiy a=miretskiy

Informs  #51600

Add a `FOR SCHEDULES` clause to job control statements
to enable control over jobs created by the scheduled jobs.

```
PAUSE JOBS FOR SCHEDULE 123
RESUME JOBS FOR SCHEDULES (SELECT schedule_id ....)
CANCEL JOBS FOR SCHEDULE 321
```

Release Notes (enterprise change): Add `FOR SCHEDULES` clause to
the job control statements to enable management of the jobs created
by schedules.

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Aug 3, 2020
51865: sql: use the descs.Collection to access types during distributed flows r=rohany a=rohany

This commit enables distributed queries to access user defined type
metadata during flow setup via the lease manager, so that accesses to
this metadata are cached and don't have to go through k/v on every
access.

This is achieved by giving the `FlowContext` a `descs.Collection` that is
used to access the descriptors through the lease manager.

Release note: None

51939: interval/generic: improve randomized testing, fix upper bound bug r=nvanbenschoten a=nvanbenschoten

In an effort to track down the bug that triggered #51913, this commit
ports the randomized interval btree benchmarks to also be unit tests.
This allows us to run invariant checks (see `btree.Verify`) on randomized
tree configurations.

Doing so revealed a violation of the `isUpperBoundCorrect` invariant.
This was determined to be a bug in `node.removeMax`. When removing an
item from a grandchild node, we were failing to adjust the upper bound
of the child node. It doesn't look like this could cause user-visible
effects because the upper bound of a subtree is only ever decreased on
removal, so at worst, this caused searches in the tree to do more work
than strictly necessary. Still, this is a good bug to fix and it's
encouraging that the new randomized testing using the existing invariant
validation caught it.

52090: sql: support ALTER TABLE SET SCHEMA command r=rohany a=RichardJCai

sql: support ALTER TABLE SET SCHEMA command

Release note (sql change): Added support for
ALTER TABLE/SEQUENCE/VIEW SET SCHEMA to set the schema of
the table to the target schema.

One must have DROP privilege on the table and CREATE privilege
on the schema to perform the operation.

52230: bulkio: Implement `SHOW SCHEDULES` r=miretskiy a=miretskiy

Informs #51850
Informs #51600

Implement the `SHOW SCHEDULES` statement, which displays information
on scheduled jobs.

Display schedule information, optionally filtered
by schedule state (paused or not) and optionally restricted
just to the backup schedules:

```
SHOW [RUNNING|PAUSED] SCHEDULES [FOR BACKUP]
```

In addition, it is possible to display information
for a specific schedule:

```
SHOW SCHEDULE 123
```

Release Notes (enterprise change): `SHOW SCHEDULES` displays
information about the scheduled jobs.

Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: richardjcai <caioftherichard@gmail.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>