Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel pipeline a pipeline run #117

Merged
merged 10 commits into from
Oct 15, 2018

Conversation

Skarlso
Copy link
Member

@Skarlso Skarlso commented Oct 1, 2018

No description provided.

@Skarlso Skarlso added the Needs Work The PR still requires some work from the submitter. label Oct 1, 2018
@Skarlso
Copy link
Member Author

Skarlso commented Oct 1, 2018

Deals with #110

@Skarlso
Copy link
Member Author

Skarlso commented Oct 2, 2018

Uh, wth. Some kind of merge conflict messed up all the worker imports. :/

@codecov-io
Copy link

codecov-io commented Oct 2, 2018

Codecov Report

Merging #117 into master will decrease coverage by 1.21%.
The diff coverage is 18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
- Coverage   64.61%   63.39%   -1.22%     
==========================================
  Files          24       24              
  Lines        2063     2112      +49     
==========================================
+ Hits         1333     1339       +6     
- Misses        580      618      +38     
- Partials      150      155       +5
Impacted Files Coverage Δ
handlers/pipeline_run.go 0% <0%> (ø) ⬆️
handlers/handler.go 85.33% <100%> (+0.19%) ⬆️
workers/scheduler/scheduler.go 69.41% <28.57%> (-4.83%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8eb3f1a...ebe95e8. Read the comment docs.

for _, job := range pr.Jobs {
if job.Status == gaia.JobRunning || job.Status == gaia.JobWaitingExec {
job.Status = gaia.JobFailed
job.FailPipeline = true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not enough to fail a job. I need to find a way to make jobs that hang cancelled as well. Even if the process is stuck.

@Skarlso
Copy link
Member Author

Skarlso commented Oct 2, 2018

So, this is now working, but it feels a bit wonky. Even though I tested it with multiple runs.

It will end up in cancelled state. It also works with long running processes.

Now I have to write tests. And a bunch of them too.

What do you think so far @michelvocks? :)

screen shot 2018-10-02 at 22 49 19

@michelvocks
Copy link
Member

Good job @Skarlso 👍

A few remarks from side (I just tested it, didn't had a look at the code yet):

  1. The icon used to kill a pipeline run is a bit irritating. We should change that to something that actually mirrors the idea to kill a pipeline run.
  2. The background color of the modal panel (the confirmation) is not globally defined. If you just open Gaia and open the panel it is actually white because the CSS-Class "secret-modal" cannot be found.
  3. The text in the modal panel is black which is hard to read on a dark background.
  4. The logs view is a bit broken because it's not aware of the new state which you have introduced:
    screen shot 2018-10-04 at 19 01 46

Keep up the good work 👍 😄

@Skarlso
Copy link
Member Author

Skarlso commented Oct 5, 2018

@michelvocks Thanks for the review!! :) I'll address these in haste. :)

@Skarlso
Copy link
Member Author

Skarlso commented Oct 5, 2018

Alright, fixed the view. Going to fix the logs next.
screenshot 2018-10-05 at 21 41 00
screenshot 2018-10-05 at 21 40 54

@Skarlso
Copy link
Member Author

Skarlso commented Oct 5, 2018

Aaaand, the log is done as well. That was easier than expected. :D

@michelvocks
Copy link
Member

Looks good to me 🤗

@Skarlso
Copy link
Member Author

Skarlso commented Oct 6, 2018

@michelvocks Thanks. :) It's a hell of a b*tch to test the channels though. :/

@michelvocks
Copy link
Member

@Skarlso Yeah, I'm sorry. 😞 It got a bit messy. I think we have to rework this soon. Still not quite sure how...

@Skarlso
Copy link
Member Author

Skarlso commented Oct 8, 2018

Yeah, I was actually trying to take that apart and trying to come up with a better model. It's just so much work. :D and it's not easy neither. So uh. Don't know man. :D

@Skarlso
Copy link
Member Author

Skarlso commented Oct 11, 2018

@michelvocks yeah I can't seem to be able to write a proper test for the channels. It gets massively ugly. I did manually test it as much as I can. If you don't happen to have any good ideas other then my one thought of actually injecting my own function in the channel that gets called like a callback upon being signalled.

I was maybe thinking something like calling these channels as a struct, something like:

type done struct {
     done chan struct{}
     doneFunc func() bool
 }

And then on done, it cals the doneFunc which would be a callback for the test. That's my only solution to the current situation. :)

@Skarlso
Copy link
Member Author

Skarlso commented Oct 11, 2018

Also. The main reason this is still in WIP is this:

user         13134   0.0  0.2  4481676  26808 s002  S+    9:07pm   0:04.59 /var/folders/yd/7ghbhjz521d5lk76j265_ymw0000gp/T/go-build093401996/b001/exe/main -homepath=/Users/user/golang/src/github.com/gaia-pipeline/gaia/tmp -dev=true
user         13114   0.0  0.2  4405260  31068 s002  S+    9:07pm   0:00.71 go run ./cmd/gaia/main.go -homepath=/Users/user/golang/src/github.com/gaia-pipeline/gaia/tmp -dev=true

The binary is still running if it's in sleep state. Even though the pipeline is cancelled. And I think that's actually not okay. So I'm going to add in some more facility which sends a SIGTERM to the binary of the pipeline.

@michelvocks
Copy link
Member

Actually as soon as Close is called, the plugin process should be killed by go-plugin. It's important to know that the close of the plugin runs asynchronus and it might take a while till the plugin gets killed.

How long did you wait?

@Skarlso
Copy link
Member Author

Skarlso commented Oct 12, 2018

It did not kill it. I opened a ticket on go-plugin. I'll write down some more details on Sunday, but I'm out right now until then.

@Skarlso
Copy link
Member Author

Skarlso commented Oct 14, 2018

Back from vacation. Going to investigate why the plugin did not get killed.

@Skarlso
Copy link
Member Author

Skarlso commented Oct 14, 2018

Alright, let's see... I'm running the pipeline and waiting for a while for go-plugin to shut it down.

hannibal         47896   0.0  0.1  4403772  23448 s002  S+    1:40pm   0:01.95 /var/folders/yd/7ghbhjz521d5lk76j265_ymw0000gp/T/go-build174948883/b001/exe/main -homepath=/Users/hannibal/golang/src/github.com/gaia-pipeline/gaia/tmp -dev=true
hannibal         47857   0.0  0.2  4424240  28600 s002  S+    1:40pm   0:00.98 go run ./cmd/gaia/main.go -homepath=/Users/hannibal/golang/src/github.com/gaia-pipeline/gaia/tmp -dev=true

@Skarlso
Copy link
Member Author

Skarlso commented Oct 14, 2018

@michelvocks It's not killing the plugin. It's running for 11 minutes now, and nothing's happening with it.

The job itself is sleeping. so I think while the healthcheck must be succeeding the job it self is still stuck and thus it doesn't get shut down.

I'm thinking of trying to locate the running process based on the name of the execution binary. Once I have a PID I'll just send a signal to it.

@Skarlso
Copy link
Member Author

Skarlso commented Oct 14, 2018

@michelvocks NEVER MIND! I'm blind. I had a hunch I was looking at the wrong process. Turns out I was. :D

❯ ps aux |grep long-sleep
hannibal         57628   0.0  0.1  4396904  10384 s002  S+    1:57pm   0:00.02 /Users/hannibal/golang/src/github.com/gaia-pipeline/gaia/tmp/pipelines/long-sleep_golang

And this guy is dead before it even hit the floor. After cancel it successfully disappeared.

While I can't really unit test the channels I think this is as done as it gets.

@Skarlso Skarlso added needs review and removed Needs Work The PR still requires some work from the submitter. labels Oct 14, 2018
@Skarlso Skarlso changed the title [WIP] Cancel pipeline a pipeline run Cancel pipeline a pipeline run Oct 14, 2018
Copy link
Member

@michelvocks michelvocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ❤️

@michelvocks michelvocks merged commit 8b312dd into gaia-pipeline:master Oct 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants