Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] asynq:servers and asynq:workers not cleaned after shutdown #979

Closed
Skwol opened this issue Dec 6, 2024 · 6 comments · Fixed by #982
Closed

[BUG] asynq:servers and asynq:workers not cleaned after shutdown #979

Skwol opened this issue Dec 6, 2024 · 6 comments · Fixed by #982
Assignees
Labels
bug Something isn't working investigate

Comments

@Skwol
Copy link
Contributor

Skwol commented Dec 6, 2024

Describe the bug
redis asynq:servers and asynq:workers sets are note being cleaned up during server shutdown.

Environment (please complete the following information):

  • OS: macos (linux)
  • asynq package version v0.25.0
  • Redis 7

To Reproduce

  1. Run asynq locally with a redis in a docker container
  2. Stop the app and make sure that server.Shutdown() called.
  3. Open redis and verify that asynq:servers and asynq:workers entires were not removed.

Perform same actions but add time.Sleep(2*time.Second) after server.Shutdown() and you can see that closed workers/servers cleaned up.

Dockerfile:

FROM redis:7.0-alpine
CMD ["redis-server", "--appendonly", "yes"]

docker-compose.yaml:

version: "3.9"

services:
  redis:
    image: redis:7.0-alpine
    container_name: redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:

main.go:

package main

import (
	"github.com/hibiken/asynq"
	"log"
	"os"
	"os/signal"
	"syscall"
)

const redisAddr = "localhost:6379"

func main() {
	server := asynq.NewServer(asynq.RedisClientOpt{Addr: redisAddr}, asynq.Config{
		Concurrency: 10,
	})

	mux := asynq.NewServeMux()

	go func() {
		if err := server.Run(mux); err != nil {
			log.Fatalf("Asynq Server failed: %v", err)
		}
	}()

	signalChan := make(chan os.Signal, 1)
	signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)

	<-signalChan
	log.Println("Shutting down server...")
	server.Stop()
	server.Shutdown()
	//time.Sleep(2 * time.Second) // give the server some time to shutdown gracefully
	log.Println("Server shut down gracefully")
}

Expected behavior
redis sets asynq:servers and asynq:workers should not contain old servers/workers

Additional context
In some infrastructure where we might have a lot of workers these sets become huge during active development. Not sure when it will affect performance or cause any troubles.

@Skwol Skwol added the bug Something isn't working label Dec 6, 2024
@Skwol
Copy link
Contributor Author

Skwol commented Dec 6, 2024

As I understand the issue lies somewhere in the lock of the server.Shutdown(). In my code I call server.Run() and server.Shutdown(). And server.Run() actually already have signals waiting and Shutdown(). When I remove server.Shutdown() from a client code everything runs smoothly and clean up happens.

Ideally need to handle this case properly (lock) or remove shutdown from the public api.

@kamikazechaser
Copy link
Collaborator

kamikazechaser commented Dec 7, 2024

Server.Run internally registers SIGINT listeners. Could you use Server.Start instead because you are registering your own listeners.

Also, could you try with v0.24.1 and see if it is reproducible.

@Skwol
Copy link
Contributor Author

Skwol commented Dec 7, 2024

Tried the same with v0.24.1 and the result is the same. It requires use some delay/sleep to clean up.
You're 100% sure, I need to use Start + Shutdown or Run. It works for me and I'll be fine, but it looks like it better to have this case (case when user tries Run+Shutdown) restricted/handled somehow.

Anyway, thank you for the response.

@kamikazechaser
Copy link
Collaborator

I managed to reproduce it. It is in fact a bug. If the server state is never set with Stop the entire shutdown procedure is skipped. On Unix it is never stopped unless SIGSTP is called.

https://github.com/hibiken/asynq/blob/master/signals_unix.go#L25

The 2 second that you add allows the Stop to actually set the correct state which further allows the seconds explicit shutdown to complete before the goroutine can return.

I'll push some code that you can try out.

kamikazechaser added a commit that referenced this issue Dec 9, 2024
… for the shutdown procedure to complete successfully

* possibly fixes: #979
@kamikazechaser
Copy link
Collaborator

@Skwol Could you try:

go get github.com/hibiken/asynq@71c746d00af93fed64d0daafd8fb5e884f7cc243

@Skwol
Copy link
Contributor Author

Skwol commented Dec 9, 2024

I updated my code. The cleanup worked fine with this version. Thank you, hope to see it in the new release!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working investigate
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants