Queues not terminating gracefully #23050

@brechtvl

Description

We are running into an issue where, if a Gitea restart happens during merge conflict checking, PRs remain stuck in the conflict state forever. I think there are two distinct issues here. One is that these PRs don't get unstuck on new pushes, and perhaps that's best left for another report.

However, the reason things get into this state in the first place seems to be a problem in shutdown: all the queues are terminated immediately instead of respecting the hammer time.

When I start ./gitea web and then run killall gitea, I see the following in the log:

2023/02/21 19:43:26 ...eful/manager_unix.go:149:handleSignals() [W] [63f510cd-4] PID 1500309. Received SIGTERM. Shutting down...
2023/02/21 19:43:26 cmd/web.go:271:listen() [I] [63f510cd-48] HTTP Listener: 0.0.0.0:3000 Closed
2023/02/21 19:43:26 ...eue/queue_channel.go:127:func1() [W] ChannelQueue: task-channel Terminated before completed flushing
2023/02/21 19:43:26 .../graceful/manager.go:205:doHammerTime() [W] Setting Hammer condition
2023/02/21 19:43:26 ...eue/queue_channel.go:127:func1() [W] ChannelQueue: notification-service-channel Terminated before completed flushing
2023/02/21 19:43:26 ...eue/queue_channel.go:127:func1() [W] ChannelQueue: push_update-channel Terminated before completed flushing
2023/02/21 19:43:26 ...eful/server_hooks.go:46:doShutdown() [I] [63f510cd-48] PID: 1500309 Listener ([::]:3000) closed.
2023/02/21 19:43:27 .../graceful/manager.go:224:doTerminate() [W] Terminating
2023/02/21 19:43:27 ...er/issues/indexer.go:201:2() [I] PID: 1500309 Issue Indexer closed
2023/02/21 19:43:27 ...eful/manager_unix.go:157:handleSignals() [W] PID: 1500309. Background context for manager closed - context canceled - Shutting down...
2023/02/21 19:43:27 cmd/web.go:183:runWeb() [I] PID: 1500309 Gitea Web Finished

This log is from my local test instance with configuration set to defaults as much as possible.

On our production instance, that "Terminated before completed flushing" condition leads to a lot of different errors, as workers get terminated in the middle of what they're doing.

I can provide more detailed logs if needed, but this is probably easy to reproduce on any instance.

Gitea Version

main (43405c3)

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

Own build from main.

Database

None
