PRs stuck on "Merge conflict checking is in progress. Try again in few moments." #17204
Just an update on this one, I've observed no change over the weekend (having wondered if any of the longer term health checks would resolve this -- a stretch, I know).
I'm not sure whether the failure here lies with the merge checking itself (on the Gitea side) or elsewhere. Of the PRs stuck in this state, I have seen:
Very strange behaviour. An alternative for me here would be some way to 'recreate' all the PRs in this state, but I'm not sure how I'd remove the original PR to avoid Gitea saying one already exists.... |
This is a separate issue. Both issues occasionally pop up here, but so far nobody has been successful in tracking them down. Please enable debug logging and provide debug logs the next time this occurs. |
Same problem. Enabled debug logging and force-pushed the branch in question. logs
|
What version are you running? Show us your app.ini. What is the current number of workers for the pr_patch_checker-channel and pr_patch_checker queues? |
Gitea 1.16.1. Redacted `app.ini`:
I didn't alter any defaults but here's the current queue config: |
I bet this problem will be fixed by #18658 and its backport #18672 - both of which are already merged. Please consider updating to the current 1.16 head - on Docker you can use the 1.16-dev tag, or just download https://dl.gitea.io/gitea/1.16 - it would be helpful to know if that solves the problem. If you cannot upgrade, you could set:

[queue.pr_patch_checker]
WORKERS=1

which should ensure that there is always a worker available. If you have a lot of PRs you may want to consider changing your underlying queue type to |
I'll try the worker count setting 👍
Is 20-50 open PRs considered a lot? |
Seriously please consider just trying the latest 1.16 head. It's the only way I'll know if this problem has been solved. |
Ok, i'll test 👍 |
Currently testing with
Edit: multiple PRs |
Sorry I missed your reply earlier - the easiest way to force PRs to be updated is to go to the repository settings page and either create, edit or delete some branch protection. That should cause all PRs against the baseBranch to be rechecked.
When you say this, is this since restarting, or are these remaining stuck? It'd be useful to see if there are any logs. Monitoring should show if the PR patch checker is doing any PR checking - the number of workers would be one thing, and another would be that each PR, as it is checked, should get its own process. |
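For reference, the "each PR check gets its own process" pattern mentioned above looks roughly like the sketch below - simplified, with hypothetical names; the real implementation lives in modules/process/manager.go (touched by the patch later in this thread). A check registers a description, stays visible on /admin/monitor while it runs, and removes itself when finished:

// Minimal sketch of per-check process registration; hypothetical names,
// not the actual Gitea code.
package main

import (
	"fmt"
	"sync"
)

type processManager struct {
	mu        sync.Mutex
	nextPID   int64
	processes map[int64]string // pid -> description shown on the monitor page
}

// add registers a description and returns the pid plus a finished func that
// removes the entry again once the work completes.
func (m *processManager) add(description string) (int64, func()) {
	m.mu.Lock()
	m.nextPID++
	pid := m.nextPID
	m.processes[pid] = description
	m.mu.Unlock()
	return pid, func() {
		m.mu.Lock()
		delete(m.processes, pid)
		m.mu.Unlock()
	}
}

func main() {
	pm := &processManager{processes: map[int64]string{}}

	// A patch check would wrap itself roughly like this (description format
	// mirrors the "TestPatch: Repo[%d]#%d" string referenced below):
	pid, finished := pm.add("TestPatch: Repo[1]#2")
	fmt.Println("checking as process", pid) // visible on the monitor while running
	finished()                              // removed once the check completes
}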
The newer 1.16 version didn't improve the situation. I also tried to edit our current master branch protection, but this didn't help either. Some PRs are stuck with
We never had a problem with v1.15.10 or earlier. This has to be a regression introduced in v1.16.0 or v1.16.1. I pushed to a stuck PR while running; logs
|
I'm trying to find out what and where the problem is.
This means that the worker for the channel is actually being created.
Now, unfortunately there are no trace log messages in the process handler - mea culpa, I should have put them there - but the worker pool has them, so we can trace if any work is being popped off the workerpool:

./gitea manager logging add console --name traceconsole --level TRACE --expression "modules/queue"

would add trace logging to the queues, allowing direct logging of all the actions taken by all the queues. We could simply change that expression to "modules/queue/workerpool", or add it to the app.ini:

[log]
MODE = whatever_you_had_before, traceconsole
...
[log.traceconsole]
MODE=console
LEVEL=trace
EXPRESSION=modules/queue

Now I'm gonna assume that the issue isn't that the queue infrastructure is blocking - as between v1.15.11 and v1.16.1 there have been minimal changes in the queues code. Which brings us back to the process manager tool: on https://your.gitea.domain/admin/monitor, at the bottom of the screen, is a process manager. When a patch begins to be tested, a process is created: Line 229 in 393ea86
Now assuming that patches are starting to be checked, I don't see any errors starting with Lines 232 to 236 in 393ea86
They're presumably not already marked merged: Lines 238 to 240 in 393ea86
Meaning we need to look into Lines 242 to 244 in 393ea86
Lines 141 to 194 in 393ea86
Ultimately I don't see: Line 190 in 393ea86
So I don't think it's this meaning that the problem is likely somewhere in here: Lines 246 to 254 in 393ea86
TestPatch starts with creating a child process: "TestPatch: Repo[%d]#%d", pr.BaseRepoID, pr.Index Lines 56 to 57 in 393ea86
Indicates this: Line 61 in 393ea86
|
Comes from: Line 79 in 393ea86
Is: Line 226 in 393ea86
in: Line 221 in 393ea86
ultimately called by: Line 274 in 393ea86
Comes from: Line 128 in 393ea86
Which is called by: Line 256 in 393ea86
Which is in: Line 221 in 393ea86
Comes from: Line 281 in 393ea86
Implying that false is returned here - (meaning no conflicts): Line 297 in 393ea86
So I guess the problem is that
It would be useful to see an image of how these conflicted PRs appear on the merge screen. Is the merge icon green but conflicts are still reported? |
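To summarise the code path walked through above, here is a paraphrased sketch (stubbed types and helpers, not the actual Gitea source at 393ea86): testPR skips merged or manually merged PRs, TestPatch/checkConflicts computes the new status and conflicted files, and checkAndUpdateStatus only persists the result if the PR is no longer sitting in the pr_patch_checker queue:

// Paraphrased sketch of the flow traced above; types and helpers are stubs.
package main

import "fmt"

type pullRequest struct {
	ID              int64
	HasMerged       bool
	Status          string
	ConflictedFiles []string
}

// Stubs standing in for the real models/services calls.
func getPullRequestByID(id int64) (*pullRequest, error) { return &pullRequest{ID: id}, nil }
func manuallyMerged(pr *pullRequest) bool               { return false }
func stillQueued(pr *pullRequest) bool                  { return false }
func persist(pr *pullRequest, cols ...string)           { fmt.Println("update", cols, "for PR", pr.ID) }

// testPatch is where checkConflicts runs; on the no-conflict path the status
// should end up mergeable and ConflictedFiles should be emptied - the point
// being investigated above.
func testPatch(pr *pullRequest) error {
	pr.Status = "mergeable"
	pr.ConflictedFiles = []string{}
	return nil
}

func testPR(id int64) {
	pr, err := getPullRequestByID(id)
	if err != nil {
		return
	}
	if pr.HasMerged || manuallyMerged(pr) {
		return // nothing left to check
	}
	if err := testPatch(pr); err != nil {
		pr.Status = "error"
		persist(pr, "status")
		return
	}
	checkAndUpdateStatus(pr)
}

// checkAndUpdateStatus only writes the new status if the PR is no longer in
// the pr_patch_checker queue (the `if !has` branch in the patches below).
func checkAndUpdateStatus(pr *pullRequest) {
	if stillQueued(pr) {
		return // result is discarded; the PR will be re-checked later
	}
	persist(pr, "merge_base", "status", "conflicted_files", "changed_protected_files")
}

func main() { testPR(1) }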
Modified the app.ini and restarted.
Pushed to the PR, but the process manager only shows a GET for the monitor page itself. I also couldn't find any running
on a different PR: New log with active trace options:
Relevant output from
|
Your new logs are not showing any TRACE logs (or duplicate DEBUG logs for the workerpool), so the new logging is still not being shown. |
Hmm... I wonder if the problem is actually in the |
Lines 50 to 69 in 393ea86
|
Does the trace log end up in the log file, or do I need to run Gitea interactively and check the console output? |
I can also run a custom Linux amd64 build if that helps. |
That would depend on the logging settings you added - the above examples I gave were all console logging, as I find that easiest to review quickly. If you're running docker:
If you wanted the trace logging to go to a file you'd simply:

[log.traceconsole]
MODE=file
LEVEL=trace
EXPRESSION=modules/queue
FILE_NAME=<filename>

You can adjust logging on a running Gitea by running the
In terms of building a Gitea, I will post a patch below that can be applied to main - and probably 1.16 - that will potentially fix an issue with conflicted files not being emptied, and add some more trace logging to the patch checker and process manager. You would then need to adjust the above |
This is a patch against main. Remember to copy the final new line. It should be able to be applied with `git apply`.

diff --git a/modules/process/manager.go b/modules/process/manager.go
index d9d2f8c3e..3692f290f 100644
--- a/modules/process/manager.go
+++ b/modules/process/manager.go
@@ -15,6 +15,8 @@ import (
"strconv"
"sync"
"time"
+
+ "code.gitea.io/gitea/modules/log"
)
// TODO: This packages still uses a singleton for the Manager.
@@ -103,6 +105,7 @@ func (pm *Manager) AddContextTimeout(parent context.Context, timeout time.Durati
func (pm *Manager) Add(parentPID IDType, description string, cancel context.CancelFunc) (IDType, FinishedFunc) {
pm.mutex.Lock()
start, pid := pm.nextPID()
+ log.Trace("Adding Process[%d:%d] %s", parentPID, pid, description)
parent := pm.processes[parentPID]
if parent == nil {
@@ -120,6 +123,7 @@ func (pm *Manager) Add(parentPID IDType, description string, cancel context.Canc
finished := func() {
cancel()
pm.remove(process)
+ log.Trace("Finished Process[%d:%d] %s", parentPID, pid, description)
}
if parent != nil {
diff --git a/modules/queue/workerpool.go b/modules/queue/workerpool.go
index 100197c5e..09316b9df 100644
--- a/modules/queue/workerpool.go
+++ b/modules/queue/workerpool.go
@@ -484,7 +484,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
case <-paused:
log.Trace("Worker for Queue %d Pausing", p.qid)
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
if unhandled := p.handle(data...); unhandled != nil {
log.Error("Unhandled Data in queue %d", p.qid)
}
@@ -507,7 +507,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
// go back around
case <-ctx.Done():
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
if unhandled := p.handle(data...); unhandled != nil {
log.Error("Unhandled Data in queue %d", p.qid)
}
@@ -519,7 +519,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
if !ok {
// the dataChan has been closed - we should finish up:
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
if unhandled := p.handle(data...); unhandled != nil {
log.Error("Unhandled Data in queue %d", p.qid)
}
@@ -532,7 +532,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
util.StopTimer(timer)
if len(data) >= p.batchLength {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
if unhandled := p.handle(data...); unhandled != nil {
log.Error("Unhandled Data in queue %d", p.qid)
}
@@ -544,7 +544,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
case <-timer.C:
delay = time.Millisecond * 100
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
if unhandled := p.handle(data...); unhandled != nil {
log.Error("Unhandled Data in queue %d", p.qid)
}
diff --git a/services/pull/check.go b/services/pull/check.go
index b1e9237d1..d3da2bd7f 100644
--- a/services/pull/check.go
+++ b/services/pull/check.go
@@ -62,9 +62,12 @@ func checkAndUpdateStatus(pr *models.PullRequest) {
}
if !has {
+ log.Trace("Updating PR[%d] in %d: Status:%d Conflicts:%s Protected:%s", pr.ID, pr.BaseRepoID, pr.Status, pr.ConflictedFiles, pr.ChangedProtectedFiles)
if err := pr.UpdateColsIfNotMerged("merge_base", "status", "conflicted_files", "changed_protected_files"); err != nil {
log.Error("Update[%d]: %v", pr.ID, err)
}
+ } else {
+ log.Trace("Not updating PR[%d] in %d as still in the queue", pr.ID, pr.BaseRepoID)
}
}
@@ -234,12 +237,15 @@ func testPR(id int64) {
log.Error("GetPullRequestByID[%d]: %v", id, err)
return
}
+ log.Trace("Testing PR[%d] in %d", pr.ID, pr.BaseRepoID)
if pr.HasMerged {
+ log.Trace("PR[%d] in %d: already merged", pr.ID, pr.BaseRepoID)
return
}
if manuallyMerged(ctx, pr) {
+ log.Trace("PR[%d] in %d: manually merged", pr.ID, pr.BaseRepoID)
return
}
@@ -251,6 +257,8 @@ func testPR(id int64) {
}
return
}
+ log.Trace("PR[%d] in %d: patch tested new Status:%d ConflictedFiles:%s ChangedProtectedFiles:%s", pr.ID, pr.BaseRepoID, pr.Status, pr.ConflictedFiles, pr.ChangedProtectedFiles)
+
checkAndUpdateStatus(pr)
}
diff --git a/services/pull/patch.go b/services/pull/patch.go
index a2c834532..e01349fcb 100644
--- a/services/pull/patch.go
+++ b/services/pull/patch.go
@@ -280,17 +280,21 @@ func checkConflicts(ctx context.Context, pr *models.PullRequest, gitRepo *git.Re
if !conflict {
treeHash, err := git.NewCommand(ctx, "write-tree").RunInDir(tmpBasePath)
if err != nil {
+ log.Debug("Unable to write unconflicted tree for PR[%d] %s/%s#%d. Error: %v", pr.ID, pr.BaseRepo.OwnerName, pr.BaseRepo.Name, pr.Index, err)
return false, err
}
treeHash = strings.TrimSpace(treeHash)
baseTree, err := gitRepo.GetTree("base")
if err != nil {
+ log.Debug("Unable to get base tree for PR[%d] %s/%s#%d. Error: %v", pr.ID, pr.BaseRepo.OwnerName, pr.BaseRepo.Name, pr.Index, err)
return false, err
}
+ pr.Status = models.PullRequestStatusMergeable
+ pr.ConflictedFiles = []string{}
+
if treeHash == baseTree.ID.String() {
log.Debug("PullRequest[%d]: Patch is empty - ignoring", pr.ID)
pr.Status = models.PullRequestStatusEmpty
- pr.ConflictedFiles = []string{}
pr.ChangedProtectedFiles = []string{}
}
|
The ConflictedFiles status should always be reset if there are no conflicts this prevents conflicted files being left over. Fix go-gitea#17204 Signed-off-by: Andrew Thornton <art27@cantab.net>
Unpatched log from `journalctl`
|
Is PR 6377 one of these PRs that is not getting its status properly updated? |
The PR touched by this push is |
Ah, my bad. This seems to be the database ID of the PR, which is correct in this case.
|
@zeripath could you provide a patch for v1.16? |
Patch against 1.16

diff --git a/modules/process/manager.go b/modules/process/manager.go
index 7cde9f945..4fd7e15ac 100644
--- a/modules/process/manager.go
+++ b/modules/process/manager.go
@@ -15,6 +15,8 @@ import (
"strconv"
"sync"
"time"
+
+ "code.gitea.io/gitea/modules/log"
)
// TODO: This packages still uses a singleton for the Manager.
@@ -103,6 +105,7 @@ func (pm *Manager) AddContextTimeout(parent context.Context, timeout time.Durati
func (pm *Manager) Add(parentPID IDType, description string, cancel context.CancelFunc) (IDType, FinishedFunc) {
pm.mutex.Lock()
start, pid := pm.nextPID()
+ log.Trace("Adding Process[%s:%s] %s", parentPID, pid, description)
parent := pm.processes[parentPID]
if parent == nil {
@@ -120,6 +123,7 @@ func (pm *Manager) Add(parentPID IDType, description string, cancel context.Canc
finished := func() {
cancel()
pm.remove(process)
+ log.Trace("Finished Process[%s:%s] %s", parentPID, pid, description)
}
if parent != nil {
@@ -260,7 +264,6 @@ func (pm *Manager) ExecDirEnvStdIn(timeout time.Duration, dir, desc string, env
}
err := cmd.Wait()
-
if err != nil {
err = &Error{
PID: GetPID(ctx),
diff --git a/modules/queue/queue_bytefifo.go b/modules/queue/queue_bytefifo.go
index edde47a62..79af260f3 100644
--- a/modules/queue/queue_bytefifo.go
+++ b/modules/queue/queue_bytefifo.go
@@ -312,5 +312,7 @@ func (q *ByteFIFOUniqueQueue) Has(data Data) (bool, error) {
if err != nil {
return false, err
}
- return q.byteFIFO.(UniqueByteFIFO).Has(q.terminateCtx, bs)
+ has, err := q.byteFIFO.(UniqueByteFIFO).Has(q.terminateCtx, bs)
+ log.Trace("Queue[%d:%s] Has(%v)=bs[%s] %t, %v", q.qid, q.name, data, string(bs), has, err)
+ return has, err
}
diff --git a/modules/queue/unique_queue_channel.go b/modules/queue/unique_queue_channel.go
index f617595c0..4fa8d80b2 100644
--- a/modules/queue/unique_queue_channel.go
+++ b/modules/queue/unique_queue_channel.go
@@ -70,6 +70,7 @@ func NewChannelUniqueQueue(handle HandlerFunc, cfg, exemplar interface{}) (Queue
bs, _ := json.Marshal(datum)
queue.lock.Lock()
+ log.Trace("Queue[%d:%s] Removing from Table: %s", queue.qid, queue.name, string(bs))
delete(queue.table, string(bs))
queue.lock.Unlock()
@@ -116,6 +117,7 @@ func (q *ChannelUniqueQueue) PushFunc(data Data, fn func() error) error {
}
// FIXME: We probably need to implement some sort of limit here
// If the downstream queue blocks this table will grow without limit
+ log.Trace("Queue[%d:%s] Adding to Table: %s", q.qid, q.name, string(bs))
q.table[string(bs)] = true
if fn != nil {
err := fn()
@@ -140,6 +142,7 @@ func (q *ChannelUniqueQueue) Has(data Data) (bool, error) {
q.lock.Lock()
defer q.lock.Unlock()
_, has := q.table[string(bs)]
+ log.Trace("Queue[%d:%s] Has: %s", q.qid, q.name, string(bs))
return has, nil
}
diff --git a/modules/queue/unique_queue_disk_channel.go b/modules/queue/unique_queue_disk_channel.go
index af42c0913..4e354816c 100644
--- a/modules/queue/unique_queue_disk_channel.go
+++ b/modules/queue/unique_queue_disk_channel.go
@@ -147,6 +147,7 @@ func (q *PersistableChannelUniqueQueue) Has(data Data) (bool, error) {
// This is more difficult...
has, err := q.channelQueue.Has(data)
if err != nil || has {
+ log.Trace("Queue[%d:%s] Has(%v) %t,%v", q.channelQueue.qid, q.name, has, err)
return has, err
}
q.lock.Lock()
diff --git a/modules/queue/workerpool.go b/modules/queue/workerpool.go
index 37d518aa8..3b52d0ccd 100644
--- a/modules/queue/workerpool.go
+++ b/modules/queue/workerpool.go
@@ -383,7 +383,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
select {
case <-ctx.Done():
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
}
@@ -393,7 +393,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
if !ok {
// the dataChan has been closed - we should finish up:
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
}
@@ -402,7 +402,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
}
data = append(data, datum)
if len(data) >= p.batchLength {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
data = make([]Data, 0, p.batchLength)
@@ -413,7 +413,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
case <-ctx.Done():
util.StopTimer(timer)
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
}
@@ -424,7 +424,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
if !ok {
// the dataChan has been closed - we should finish up:
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
}
@@ -433,7 +433,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
}
data = append(data, datum)
if len(data) >= p.batchLength {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
data = make([]Data, 0, p.batchLength)
@@ -441,7 +441,7 @@ func (p *WorkerPool) doWork(ctx context.Context) {
case <-timer.C:
delay = time.Millisecond * 100
if len(data) > 0 {
- log.Trace("Handling: %d data, %v", len(data), data)
+ log.Trace("Queue[%d] Handling: %d data, %v", p.qid, len(data), data)
p.handle(data...)
atomic.AddInt64(&p.numInQueue, -1*int64(len(data)))
data = make([]Data, 0, p.batchLength)
diff --git a/services/pull/check.go b/services/pull/check.go
index 363a716b2..cdfda4682 100644
--- a/services/pull/check.go
+++ b/services/pull/check.go
@@ -61,9 +61,12 @@ func checkAndUpdateStatus(pr *models.PullRequest) {
}
if !has {
+ log.Trace("Updating PR[%d] in %d: Status:%d Conflicts:%s Protected:%s", pr.ID, pr.BaseRepoID, pr.Status, pr.ConflictedFiles, pr.ChangedProtectedFiles)
if err := pr.UpdateColsIfNotMerged("merge_base", "status", "conflicted_files", "changed_protected_files"); err != nil {
log.Error("Update[%d]: %v", pr.ID, err)
}
+ } else {
+ log.Trace("Not updating PR[%d] in %d as still in the queue", pr.ID, pr.BaseRepoID)
}
}
@@ -225,11 +228,20 @@ func handle(data ...queue.Data) {
if err != nil {
log.Error("GetPullRequestByID[%s]: %v", datum, err)
continue
- } else if pr.HasMerged {
+ }
+ log.Trace("Testing PR[%d] in %d", pr.ID, pr.BaseRepoID)
+
+ if pr.HasMerged {
+ log.Trace("PR[%d] in %d: already merged", pr.ID, pr.BaseRepoID)
continue
- } else if manuallyMerged(pr) {
+ }
+
+ if manuallyMerged(pr) {
+ log.Trace("PR[%d] in %d: manually merged", pr.ID, pr.BaseRepoID)
continue
- } else if err = TestPatch(pr); err != nil {
+ }
+
+ if err = TestPatch(pr); err != nil {
log.Error("testPatch[%d]: %v", pr.ID, err)
pr.Status = models.PullRequestStatusError
if err := pr.UpdateCols("status"); err != nil {
@@ -237,6 +249,8 @@ func handle(data ...queue.Data) {
}
continue
}
+
+ log.Trace("PR[%d] in %d: patch tested new Status:%d ConflictedFiles:%s ChangedProtectedFiles:%s", pr.ID, pr.BaseRepoID, pr.Status, pr.ConflictedFiles, pr.ChangedProtectedFiles)
checkAndUpdateStatus(pr)
}
}
diff --git a/services/pull/patch.go b/services/pull/patch.go
index ee10c9739..36240afb3 100644
--- a/services/pull/patch.go
+++ b/services/pull/patch.go
@@ -265,17 +265,21 @@ func checkConflicts(pr *models.PullRequest, gitRepo *git.Repository, tmpBasePath
if !conflict {
treeHash, err := git.NewCommandContext(ctx, "write-tree").RunInDir(tmpBasePath)
if err != nil {
+ log.Debug("Unable to write unconflicted tree for PR[%d] %s/%s#%d. Error: %v", pr.ID, pr.BaseRepo.OwnerName, pr.BaseRepo.Name, pr.Index, err)
return false, err
}
treeHash = strings.TrimSpace(treeHash)
baseTree, err := gitRepo.GetTree("base")
if err != nil {
+ log.Debug("Unable to get base tree for PR[%d] %s/%s#%d. Error: %v", pr.ID, pr.BaseRepo.OwnerName, pr.BaseRepo.Name, pr.Index, err)
return false, err
}
+ pr.Status = models.PullRequestStatusMergeable
+ pr.ConflictedFiles = []string{}
+
if treeHash == baseTree.ID.String() {
log.Debug("PullRequest[%d]: Patch is empty - ignoring", pr.ID)
pr.Status = models.PullRequestStatusEmpty
- pr.ConflictedFiles = []string{}
pr.ChangedProtectedFiles = []string{}
}
|
more logs
|
Yup, it looks like the value is stuck in the leveldb, and I'm not certain why this is the case. Do you see in your logs:
or
NB: you can use <details> blocks to hide the long logs
|
grep'ed the log searching for pr_patch_checker logs
|
And so we can see the problem. The queue is empty by Len but is not empty by Has... I'm not sure how this can happen. I think your level queues are messed up. Run:
Wait for it to finish. Shut down Gitea and delete the /data/queues/common folder. Restart. |
I think we need to include some doctor commands to check these queues for consistency. I don't understand how a queue could contain a set entry but not contain anything in the list itself. This makes no sense to me - the two are tightly linked, and the problem makes me concerned that @somera's mirror problem is similar. |
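For illustration, a conceptual sketch of the kind of consistency check being suggested - hypothetical interface and function names, not an existing Gitea doctor command - would verify that a queue which is empty by Len does not still report members via Has:

// Conceptual consistency check for a unique FIFO queue; hypothetical names.
// The invariant that broke in this issue: an empty list must not still have
// entries in the de-duplication set.
package main

import "fmt"

// uniqueFIFO models the two views a unique queue keeps of its contents:
// an ordered list of pending items and a set used for de-duplication.
type uniqueFIFO interface {
	Len() int64
	Has(data []byte) (bool, error)
}

// checkQueueConsistency returns an error if the set still contains any of the
// candidate keys while the list claims to be empty.
func checkQueueConsistency(q uniqueFIFO, candidates [][]byte) error {
	if q.Len() != 0 {
		return nil // non-empty queues would need a different check
	}
	for _, key := range candidates {
		has, err := q.Has(key)
		if err != nil {
			return err
		}
		if has {
			return fmt.Errorf("queue empty by Len but %q still present by Has", key)
		}
	}
	return nil
}

// emptyButStuck reproduces the observed symptom for demonstration purposes.
type emptyButStuck struct{}

func (emptyButStuck) Len() int64                    { return 0 }
func (emptyButStuck) Has(data []byte) (bool, error) { return true, nil }

func main() {
	// With a real queue plugged in, candidates would be the IDs of the PRs
	// that appear stuck (e.g. pull request database IDs).
	fmt.Println(checkQueueConsistency(emptyButStuck{}, [][]byte{[]byte("6377")}))
}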
The checker was probably a bit overwhelmed. The PRs seem to be gradually getting unstuck now. |
OK, so nothing in the above patches really helped this issue - in particular my "fix" wasn't that useful, as the problem isn't that the checker wasn't doing its job correctly - rather, it was, but it was then being told to hold off by the corrupt on-disk queue. I think I have to create a PR that adds the tracing (a little cleaner, as we should avoid turning the []byte into a string unnecessarily), but it looks like I'll need to consider adding a doctor command to interrogate the queues more deeply. I'd be interested to know if you can cause this to happen again by shutting down and restarting with data in the queue - or if you downgraded at some point. You should be able to add extra workers - providing a temporary boost if necessary. I recommend you stay on 1.16-head for the moment - if just for #18672. We'll get 1.16.2 out soon. |
But shouldn't leveldb detect a corruption in that case? Maybe you're hitting an upstream bug.
Nope. Only applied regular updates, and I usually tend to wait for the first patch release.
I think we're essentially the upstream - certainly the levelqueue.Queue/levelqueue.UniqueQueue function comes from code I wrote in levelqueue - but it's still not clear to me how this can happen. I'll have another look at the autoshutdown code - perhaps this is incorrect?
Perhaps somewhere along the line the pr_patch_checker queue has been opened and flushed as if it is
Still, I just don't understand how this can happen. |
Can you reproduce it on your local instance? Or https://try.gitea.io/ ? |
If I could I'd have resolved the problem already. |
Would it help if I switch to DEBUG/TRACE and send you the logs? |
No, I don't think so - I suspect in your case you similarly need to do the following. Run:
Wait for it to finish. Shut down Gitea and delete the /data/queues/common folder. Restart. |
And then get back to me if the problem recurs. |
OK. I did this and the mirror cron is working now. Should now update all my mirrors. |
The cron stopped working after syncing ~220 mirrors. |
Let's go back to your issue instead of this one. |
@zeripath I found an outdated PR in our repo which keeps the conflict detection running for quite some time. How would Gitea handle such a long-running job in case of a shutdown? Maybe something similar could be used to reproduce the queue corruption. |
That shouldn't cause a problem with the queue, as once checking occurs it's popped off the queue. Regarding the PR conflict detection taking a long time - that's a worry; presumably this is related to the old create-a-patch-and-attempt-to-apply path rather than the newer mechanism. I guess I still need to make some improvements to the old fallback PR checker mechanism. |
I'd actually just pulled this issue back up again, as a large number of our open PRs fell back into this state after an upgrade from 1.15 to 1.16 a few weeks ago. Anyway, thanks! I just upgraded to 1.16.1 and performed the steps you've described there, waited a while, and that's solved the problem. There are about 80 open PRs, and in my testing with trace logging it was almost as if it was timing out before getting through re-checking each one. Bumping worker counts sped up the rate it went through them significantly, but it would still give up about 70% of the way through the lot, so probably not a timeout. Unsure if "if you hit this problem, do the quoted actions and it'll be resolved" closes off this issue, but it has at least saved me from having to recreate each PR. I'll have to see if it ever comes up again! |
@abentpole if flush-queues solved things for you, #18658 will fix this completely. |
Outdated - the code base has changed a lot. If there is still a problem, feel free to open a new issue. |
Gitea Version
1.15.3
Git Version
2.17.1
Operating System
Ubuntu Server
How are you running Gitea?
I'm running Gitea on Docker, with
image: gitea/gitea:latest
Database
SQLite
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Description
I'm not sure if it correlates with a recent update of Gitea, but all open PRs for a project are listed as "Merge conflict checking is in progress. Try again in few moments." I'm also running Drone CI checks, but these are running and completing as expected.
I was reviewing, updating and merging PRs leading up to this occurring.
Creating a new PR seems to work fine -- Drone runs its check just fine and the "This pull request can be merged automatically" appears. I've tried closing and opening one of the PRs with no change, I've also tried checking out the branch, manually merging master into it and pushing (updating the PR) -- Drone runs, finishes and I'm left with the PR in "Merge conflict checking is in progress".
I'm not seeing any lingering processes in the "Monitoring" section of Site Administration. What would be the best kind of log configuration to help me drill down to a cause here, and is there a way to forcibly kick off the "merge conflict checking" process again for a PR?
I have also updated + restarted Gitea to see if that'd help, to no avail.
Any assistance would be appreciated!
Cheers.
Screenshots
No response