Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wings: Fix stoplogic #31

Merged
merged 9 commits into from
Oct 26, 2024
Merged

Wings: Fix stoplogic #31

merged 9 commits into from
Oct 26, 2024

Conversation

QuintenQVD0
Copy link
Contributor

@QuintenQVD0 QuintenQVD0 commented Jun 16, 2024

Changes

The signal must be a string so "SIGTERM" but we used to pass to syscall.SIGTERM and trim some parts but that does not work.
so in the function, all incoming signals are converted to signals in string format. They also get matched with their corresponding timeout. What means for any signal the default timeout before it gets killed is 10 seconds. So:

  1. The container gets the signal
  2. it sends the signal to the container
  3. if it exitis in those 10 seconds then it will just continue
  4. if not after 10 seconds, SIGKILL is sent

The timeout is 0 for os.kill as that is what the kill button sends and then it should stop immediately .

this is combined with my panel PR so ^C send SIGTERM.

This also makes wine eggs not need anymore for ^^C but wine can not wait it will just direct exit (how it also does it with ^^C)

you can also now do as stop ^SIGINT, ^SIGTERM, ^SIGABRT what will be converted in to a signal.

This would stop the need for tini but it still can be usefull.

Example

mordhau:

afbeelding
Tested with the panel patch with ^C and ^SIGTERM

Copy link
Contributor

@RMartinOscar RMartinOscar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	case os.Kill:
		sig = "SIGKILL"
		noWaitTimeout = 0

Those are useles because of the default case

Sorry i couldn't find how to make multiple line suggestion

@QuintenQVD0
Copy link
Contributor Author

QuintenQVD0 commented Jun 16, 2024

the docker default is SIGTERM and a timeout of 10seconds? and sigkill must happen inimitably ?

and that setting of sig is needed as you must pass a string and not a syscall signal directly?

@RMartinOscar
Copy link
Contributor

There's no use to match case for os.Kill when the consequence is the same as the default case

@QuintenQVD0
Copy link
Contributor Author

There's no use to match case for os.Kill when the consequence is the same as the default case

I changed it.

@QuintenQVD0
Copy link
Contributor Author

Converted to a draft as their is something wrong with the wings or panel logic itself

@QuintenQVD0 QuintenQVD0 marked this pull request as draft June 16, 2024 11:01
@QuintenQVD0 QuintenQVD0 changed the title Wings: FIx stoplogic Wings: Fix stoplogic Jun 16, 2024
@QuintenQVD0
Copy link
Contributor Author

IF you revert the this commit: 7bf9f6a

then stop works but kill is broken

@RMartinOscar
Copy link
Contributor

This may resolve it

@QuintenQVD0
Copy link
Contributor Author

This may resolve it

I worked with him and no. Their is something weong with our panel reporting container statuses. I can't realy point to what exactly but this pr does the same just a diffrend way.

@notAreYouScared
Copy link
Member

@QuintenQVD0 Any update with this?

@QuintenQVD0
Copy link
Contributor Author

@QuintenQVD0 Any update with this?

I have been working a lot latly so my time is limited. The logic is right but there something else preventing this to work with the state of the container. (most servers in pelican even on properly stop mark it as crashed) that it is skipping the stopping state and so it thinks it has crashed. So until that is fixt this should not be merged as then a server can not be stoped anymore.

Something is or went wrong with eather a panel update or the pr of Kubi to signal the panel of a state changes.

@QuintenQVD0
Copy link
Contributor Author

QuintenQVD0 commented Sep 7, 2024

Moral of this PR.

Fix the broken stop logic for signals (if pelican-dev/panel#407 is also merged ofc)
Fix the crash detection issue because of a pending api call to the panel that had to finish before the crash detection could running causing it not to see the server go thru the stopping state.

edit:
For those in the future or interested:
afbeelding

@QuintenQVD0 QuintenQVD0 marked this pull request as ready for review September 7, 2024 18:39
environment/docker/power.go Outdated Show resolved Hide resolved
Copy link
Contributor

@parkervcp parkervcp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@parkervcp parkervcp merged commit 2508f85 into pelican-dev:main Oct 26, 2024
6 checks passed
@QuintenQVD0 QuintenQVD0 deleted the stoplogic branch October 27, 2024 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants