Cancel a pipeline run #117
Deals with #110 |
Uh, wth. Some kind of merge conflict messed up all the worker imports. :/ |
Codecov Report
@@ Coverage Diff @@
## master #117 +/- ##
==========================================
- Coverage 64.61% 63.39% -1.22%
==========================================
Files 24 24
Lines 2063 2112 +49
==========================================
+ Hits 1333 1339 +6
- Misses 580 618 +38
- Partials 150 155 +5
|
workers/scheduler/scheduler.go
for _, job := range pr.Jobs {
	if job.Status == gaia.JobRunning || job.Status == gaia.JobWaitingExec {
		job.Status = gaia.JobFailed
		job.FailPipeline = true
Setting the status like this is not enough to fail a job. I also need a way to cancel jobs that hang, even if the process is stuck.
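To illustrate the idea of cancelling a job that hangs, here is a minimal sketch. It is not the code from this PR: `JobStatus`, its values, and `runJob` are hypothetical names made up for the example. The job runs in its own goroutine and the caller waits on either completion or a cancel channel, so a stuck job cannot block the scheduler.

```go
package main

import (
	"fmt"
	"time"
)

// JobStatus and its values are hypothetical stand-ins for the real job states.
type JobStatus string

const (
	JobSuccess   JobStatus = "success"
	JobCancelled JobStatus = "cancelled"
)

// runJob starts work in its own goroutine and returns as soon as either the
// work finishes or the cancel channel is signalled (closed).
func runJob(work func(), cancel <-chan struct{}) JobStatus {
	finished := make(chan struct{})
	go func() {
		work()
		close(finished)
	}()

	select {
	case <-finished:
		return JobSuccess
	case <-cancel:
		// The goroutine running work is abandoned here. Any external
		// process it started still has to be stopped separately.
		return JobCancelled
	}
}

func main() {
	cancel := make(chan struct{})

	// Simulate a job that hangs far longer than anyone wants to wait.
	hang := func() { time.Sleep(time.Hour) }

	// Simulate the user pressing "cancel" shortly after the job started.
	go func() {
		time.Sleep(50 * time.Millisecond)
		close(cancel)
	}()

	fmt.Println(runJob(hang, cancel))
}
```

Note that this only unblocks the caller. It does not stop the underlying work, which is the same gap described later in this thread for the pipeline binary.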
So, this is now working, but it feels a bit wonky, even though I tested it with multiple runs. The run ends up in the cancelled state, and it also works with long-running processes. Now I have to write tests. And a bunch of them too. What do you think so far @michelvocks? :) |
Good job @Skarlso 👍 A few remarks from my side (I just tested it, I haven't had a look at the code yet):
Keep up the good work 👍 😄 |
@michelvocks Thanks for the review!! :) I'll address these in haste. :) |
Aaaand, the log is done as well. That was easier than expected. :D |
Looks good to me 🤗 |
@michelvocks Thanks. :) It's a hell of a b*tch to test the channels though. :/ |
@Skarlso Yeah, I'm sorry. 😞 It got a bit messy. I think we have to rework this soon. Still not quite sure how... |
Yeah, I was actually trying to take that apart and come up with a better model. It's just so much work. :D And it's not easy either. So uh. Don't know man. :D |
@michelvocks yeah I can't seem to write a proper test for the channels. It gets massively ugly. I did manually test it as much as I could. Unless you happen to have any good ideas, my only thought is to inject my own function alongside the channel, which gets called like a callback when the channel is signalled. I was thinking of wrapping these channels in a struct, something like:
type done struct {
	done     chan struct{}
	doneFunc func() bool
}
And then on done, it calls the doneFunc, which would be a callback for the test. That's my only solution to the current situation. :) |
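A minimal sketch of how that struct could be used, assuming a hypothetical `wait` method that does not exist in the PR. It blocks on the channel and then calls `doneFunc`, so a test can observe that the signal was handled.

```go
package main

import "fmt"

// done pairs a signal channel with a callback, as proposed above.
type done struct {
	done     chan struct{}
	doneFunc func() bool
}

// wait is a hypothetical helper: it blocks until the channel is signalled
// (or closed), then invokes the callback and returns its result.
func (d *done) wait() bool {
	<-d.done
	if d.doneFunc == nil {
		return false
	}
	return d.doneFunc()
}

func main() {
	called := false
	d := &done{
		done: make(chan struct{}),
		doneFunc: func() bool {
			// In a test, this is where the assertion hook would live.
			called = true
			return true
		},
	}

	result := make(chan bool)
	go func() { result <- d.wait() }()

	// Signal "done" and read back what the callback reported.
	close(d.done)
	fmt.Println(<-result, called)
}
```

The callback keeps production code free of test-only channels, at the cost of one extra field on every signal that needs to be observable.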
Also. The main reason this is still in WIP is this:
The binary is still running if it's in a sleep state, even though the pipeline is cancelled. And I think that's actually not okay. So I'm going to add some more code which sends a SIGTERM to the binary of the pipeline. |
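For reference, sending SIGTERM to a child process from Go could look roughly like the sketch below. It is Unix-only and uses only the standard library. `startSleeper` and `terminate` are hypothetical helpers, and the `sleep` command merely stands in for the pipeline binary. This is not how go-plugin manages its processes.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startSleeper starts a long-running child process. The sleep command is
// only a stand-in for a pipeline binary that is stuck sleeping.
func startSleeper() (*exec.Cmd, error) {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// terminate sends SIGTERM to the started process and waits for it to exit.
// Wait returns a non-nil error when the process was ended by a signal.
func terminate(cmd *exec.Cmd) error {
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return err
	}
	return cmd.Wait()
}

func main() {
	cmd, err := startSleeper()
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("started pid:", cmd.Process.Pid)

	// The error reported here describes how the process ended.
	fmt.Println("exit:", terminate(cmd))
}
```

Calling `Wait` after the signal matters: without it the terminated child stays around as a zombie until the parent exits.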
Actually as soon as How long did you wait? |
It did not kill it. I opened a ticket on go-plugin. I'll write down some more details on Sunday, but I'm out right now until then. |
Back from vacation. Going to investigate why the plugin did not get killed. |
Alright, let's see... I'm running the pipeline and waiting for a while for go-plugin to shut it down.
|
@michelvocks It's not killing the plugin. It has been running for 11 minutes now, and nothing's happening with it. The job itself is sleeping, so I think the healthcheck keeps succeeding while the job itself is still stuck, and thus it doesn't get shut down. I'm thinking of trying to locate the running process based on the name of the execution binary. Once I have a PID I'll just send a signal to it. |
@michelvocks NEVER MIND! I'm blind. I had a hunch I was looking at the wrong process. Turns out I was. :D
And this guy is dead before it even hit the floor. After cancel it successfully disappeared. While I can't really unit test the channels I think this is as done as it gets. |
LGTM ❤️