✨ Kill rq jobs by id #509

znatty22 · 2020-11-11T20:13:55Z

Makes use of the new send_kill_horse_command to kill a job via job id

NOTE: This will likely go away once rq/rq#1371 is implemented.

codecov · 2020-11-11T20:17:41Z

Codecov Report

Merging #509 (0cdf063) into master (c5ad853) will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #509      +/-   ##
==========================================
+ Coverage   91.70%   91.78%   +0.07%     
==========================================
  Files          74       75       +1     
  Lines        3401     3433      +32     
==========================================
+ Hits         3119     3151      +32     
  Misses        282      282

Impacted Files	Coverage Δ
creator/utils.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5ad853...3a61591. Read the comment docs.

creator/utils.py

dankolbman · 2020-11-12T00:27:52Z

creator/utils.py

+
+    # Wait until job has been moved to failed job registry to delete or
+    # cancel job
+    while time.time() < (time.time() + WAIT_FOR_FAIL_TIMEOUT):


How long does this operation take and where do we plan to call it from?
The while is going to peg the CPU, possibly of a worker that might otherwise be serving requests, which could cause our autoscaler to kick up the instance count. It will also bombard redis with requests which can be a problem if we end up canceling lots of jobs at once.
Perhaps it would be better to queue another job to check up on the status ?

This operation will be called from the IngestRun.cancel method. Eh yea 30s was left from debugging. I think I'm just going to sleep for less than a second, maybe 0.1 bc I noticed during testing that's all it takes for rq to move the job to a registry.

In general submitting another job to check on status seems like it would have the same problem bc the status might not have changed by the time you're inside the job. In this case it would probably work but I'd rather just sleep. Sigh the real solution is to have rq support subscription to job status changes. This issue looks like what we want: rq/rq#717 but no ones looked at it in a while

creator/utils.py

tests/test_utils.py

dankolbman · 2020-11-12T18:00:34Z

creator/utils.py

+
+    # Wait until job has been moved to failed job registry to delete or
+    # cancel job
+    time.sleep(WAIT_FOR_FAIL_TIME)


What's the impact of never calling delete() or cancel() on the job? It seems likely this could happen.

If you don't call delete the killed jobs will pile up in the failed job registry. I guess it depends on if you care about that and if you want to differentiate between jobs that failed due to kill horse or something else.

For cancel, I just realized I need to change the logic bc it removes jobs from the queue that haven't started. So I don't care about waiting for the failed job reg to update

creator/utils.py

dankolbman · 2020-11-13T22:50:31Z

creator/utils.py

+                worker.get_current_job().id == job_id
+            ):
+                logger.info(f"Killing job {job_id}")
+                send_kill_horse_command(queue.connection, worker.name)


you probably want a break here so you don't end up querying all the workers after you found the job.

dankolbman · 2020-11-13T22:52:39Z

creator/utils.py

+            f"Delete job {job_id} by removing it from queue: "
+            f"{queue.name} and all job registries"
+        )
+        redis_job.delete()


So the job will be deleted even if the send_kill_horse_command was never sent. Is that expected?

I don't think that can happen? If the execution gets down here it means the else block executed bc the job was running, and if its running then it has to be executing in one of the work horses.

But that being said I wonder if the user should have the option to delete the job given the other cases where it either already failed/finished or we cancel it? Right now that doesn't happen because I return at the end of the other if blocks. Or I could just remove deletion entirely and leave it to the user to do that after calling this method?

EDIT: In my last commit I made it so that delete=False by default. But if user sets delete=True the job will always be deleted at the end of stop job. I can change that though if we don't like it

tests/test_utils.py

dankolbman

👍

## Release 1.14.7 ### Summary - Emojis: 🐛 x2, 🔥 x2, ⬆️ x1, ✨ x1 - Categories: Fixes x2, Removals x2, Ops x1, Additions x1 ### New features and changes - [#515](#515) - 🐛 Don't resolve downloadUrls if not enough info - [8dbae2a](8dbae2a) by [dankolbman](https://github.com/dankolbman) - [#514](#514) - 🐛 Fix events query for non-admins - [9f2c9f8](9f2c9f8) by [dankolbman](https://github.com/dankolbman) - [#511](#511) - 🔥 Remove ego authentication - [f803ec5](f803ec5) by [dankolbman](https://github.com/dankolbman) - [#512](#512) - 🔥 Remove hypothesis testing - [09b1a9e](09b1a9e) by [dankolbman](https://github.com/dankolbman) - [#496](#496) - ⬆️ Bump cryptography from 2.5 to 3.2 - [709450f](709450f) by [dependabot[bot]](https://github.com/apps/dependabot) - [#509](#509) - ✨ Kill rq jobs by id - [c747e00](c747e00) by [znatty22](https://github.com/znatty22)

znatty22 added the feature New functionality label Nov 11, 2020

znatty22 self-assigned this Nov 11, 2020

znatty22 force-pushed the kill-jobs branch from c31f373 to b072e91 Compare November 11, 2020 20:14

znatty22 force-pushed the kill-jobs branch 2 times, most recently from edca967 to 111f2dc Compare November 11, 2020 20:30

znatty22 requested review from dankolbman and gsantia November 11, 2020 20:34

znatty22 marked this pull request as ready for review November 11, 2020 20:37

znatty22 changed the title ~~✨ Kill rq jobs~~ ✨ Kill rq jobs by id Nov 11, 2020

znatty22 mentioned this pull request Nov 11, 2020

✨ Add ingest run jobs #506

Merged

gsantia reviewed Nov 11, 2020

View reviewed changes

creator/utils.py Outdated Show resolved Hide resolved

znatty22 requested a review from gsantia November 11, 2020 22:54

dankolbman reviewed Nov 12, 2020

View reviewed changes

znatty22 force-pushed the kill-jobs branch 2 times, most recently from ca7346f to 0a57108 Compare November 12, 2020 16:52

znatty22 requested a review from dankolbman November 12, 2020 17:45

dankolbman reviewed Nov 12, 2020

View reviewed changes

➕ Bump to rq 1.6.1

3812986

znatty22 force-pushed the kill-jobs branch 2 times, most recently from 1cf2923 to 09afa61 Compare November 13, 2020 16:57

znatty22 requested a review from dankolbman November 13, 2020 17:14

znatty22 marked this pull request as draft November 13, 2020 17:18

znatty22 removed request for dankolbman and gsantia November 13, 2020 17:19

znatty22 force-pushed the kill-jobs branch from 09afa61 to 091dac0 Compare November 13, 2020 17:22

znatty22 requested review from dankolbman and gsantia November 13, 2020 17:29

gsantia suggested changes Nov 13, 2020

View reviewed changes

creator/utils.py Outdated Show resolved Hide resolved

creator/utils.py Outdated Show resolved Hide resolved

znatty22 marked this pull request as ready for review November 13, 2020 20:11

znatty22 added 2 commits November 13, 2020 15:14

✨ Kill rq jobs

e47eb05

✅ Test kill jobs

ccabab1

znatty22 force-pushed the kill-jobs branch from 091dac0 to ccabab1 Compare November 13, 2020 20:15

znatty22 requested a review from gsantia November 13, 2020 20:15

dankolbman reviewed Nov 13, 2020

View reviewed changes

znatty22 added 2 commits November 16, 2020 10:59

🔧 Default to delete=False, if delete=True always delete job at end

0cdf063

✅ Rm manual mock reset

3a61591

znatty22 requested a review from dankolbman November 16, 2020 16:10

dankolbman approved these changes Nov 16, 2020

View reviewed changes

gsantia approved these changes Nov 16, 2020

View reviewed changes

znatty22 merged commit c747e00 into master Nov 16, 2020

znatty22 deleted the kill-jobs branch November 16, 2020 17:43

dankolbman mentioned this pull request Nov 20, 2020

🏷 Release 1.14.7 #516

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

✨ Kill rq jobs by id #509

✨ Kill rq jobs by id #509

znatty22 commented Nov 11, 2020 •

edited

Loading

codecov bot commented Nov 11, 2020 •

edited

Loading

dankolbman Nov 12, 2020

znatty22 Nov 12, 2020

dankolbman Nov 12, 2020

znatty22 Nov 12, 2020

dankolbman Nov 13, 2020

znatty22 Nov 16, 2020

dankolbman Nov 13, 2020

znatty22 Nov 16, 2020

znatty22 Nov 16, 2020 •

edited

Loading

dankolbman left a comment

✨ Kill rq jobs by id #509

✨ Kill rq jobs by id #509

Conversation

znatty22 commented Nov 11, 2020 • edited Loading

codecov bot commented Nov 11, 2020 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

znatty22 Nov 16, 2020 • edited Loading

Choose a reason for hiding this comment

dankolbman left a comment

Choose a reason for hiding this comment

znatty22 commented Nov 11, 2020 •

edited

Loading

codecov bot commented Nov 11, 2020 •

edited

Loading

znatty22 Nov 16, 2020 •

edited

Loading