
Process workflow logs in batches #4045

Merged: 15 commits merged into woodpecker-ci:main on Sep 18, 2024

Conversation

@hg (Contributor) commented Aug 16, 2024

Hopefully this removes the first bottleneck.

The benchmark mentioned in the issue now finishes in 2 seconds because its output fits into one batch. Anything that spills over into more batches hangs waiting for the server to write out the previous ones.
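Roughly, the batching on the agent side works like the following minimal sketch (the names, batch size, and flush interval are illustrative only, not the actual implementation):

package logbatch

import "time"

// LogEntry stands in for the log record sent from agent to server;
// the fields are illustrative only.
type LogEntry struct {
	Line int
	Data []byte
}

// collectBatches drains entries from in and flushes them via sendBatch,
// either when the batch reaches maxBatchSize or when flushInterval elapses,
// whichever comes first. It returns once in is closed and the final
// (partial) batch has been flushed.
func collectBatches(in <-chan *LogEntry, sendBatch func([]*LogEntry) error,
	maxBatchSize int, flushInterval time.Duration) error {
	batch := make([]*LogEntry, 0, maxBatchSize)
	ticker := time.NewTicker(flushInterval)
	defer ticker.Stop()

	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		err := sendBatch(batch)
		batch = make([]*LogEntry, 0, maxBatchSize) // fresh slice; sendBatch may keep the old one
		return err
	}

	for {
		select {
		case entry, ok := <-in:
			if !ok { // input closed: write out whatever is left
				return flush()
			}
			batch = append(batch, entry)
			if len(batch) >= maxBatchSize {
				if err := flush(); err != nil {
					return err
				}
			}
		case <-ticker.C:
			if err := flush(); err != nil {
				return err
			}
		}
	}
}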

edit: original description, not relevant anymore.

Closes #3999
Ref: #2072
(Closes #2064)

@6543 (Member) commented Aug 16, 2024

In itself it looks good; we just need to:

  • make sure the last logs are committed on graceful stop
  • make sure line numbers are still working correctly

@hg (Contributor, Author) commented Aug 16, 2024

Thank you for the review. Line numbers aren't affected as log records are simply passed on unchanged in the same order as they were previously.

The queue should indeed be drained on shutdown, thanks for catching this. It should be fixed now, as far as the gRPC connection lifetime permits (I did not change the connection handling and don't think this PR should touch it).
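To illustrate the drain-on-shutdown behaviour, a rough sketch in the same vein as above (hypothetical names, reusing the LogEntry type from the earlier sketch; not the actual code):

package logbatch

import "context"

// drainOnShutdown flushes any batches still sitting in the queue once a
// graceful stop has been requested, then lets the agent exit.
func drainOnShutdown(ctx context.Context, queue <-chan []*LogEntry,
	flush func([]*LogEntry) error) {
	<-ctx.Done() // graceful stop requested
	for {
		select {
		case batch := <-queue:
			if err := flush(batch); err != nil {
				return // connection already gone; nothing more we can do
			}
		default:
			return // queue is empty, safe to exit
		}
	}
}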

@qwerty287 changed the title from "Send agent → server logs in batches (#3999)" to "Send agent → server logs in batches" on Aug 16, 2024
@hg (Contributor, Author) commented Aug 17, 2024

The second commit depends on the first one and is more of a demo at the moment, to show that the protocol breakage is worth it.

This job with 1 million lines of output:

when:
  event: push

steps:
  logs:
    image: ubuntu:24.04
    commands:
      - seq 1 1000000

finishes in 18-20 seconds when writing to disk, and in 5.5 minutes when writing to PostgreSQL.

IIRC, Drone shows better performance for the latter case by saving the entire massive log blob as one record. Probably not something the project is interested in.

@hg changed the title from "Send agent → server logs in batches" to "Process workflow logs in batches" on Aug 17, 2024
@6543 (Member) commented Aug 17, 2024

Yes, Drone lets the agent stream the log, uses that stream only for live logs, and then at the end sends the whole log again as a blob to be stored. We want a single stream that is both stored and viewed at the same time, as that ensures that what you see is what is actually happening, and that you can later verify the stored logs match what was seen.

In the end, your pull request finishes what I tested and had in mind in #2072 but never got around to implementing.

@6543 (Member) commented Aug 17, 2024

Well, it does not enhance the log-storing part... but as you wrote, that's for the next pull request.

@hg (Contributor, Author) commented Aug 17, 2024

"line numbers are still working correctly"

I misunderstood this: the log lines were indeed getting reordered, but only for live-streaming web clients. It should be fixed now, although with my limited Go knowledge I only came up with a solution that would block in Write() whenever any subscriber's log channel is full. The blocking is avoided by dropping logs to that subscriber on overflow, which might leave gaps in its live view.

While this shouldn't happen much in practice (each channel buffers up to 100 seconds of logs, or up to about 25 megabytes of text, whichever limit is hit first), it's still a limitation that can probably be removed. If you're not happy with it, I'll explore some more.
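A minimal sketch of the drop-on-overflow behaviour described above (hypothetical names, same sketch package as in the earlier comments; not the actual server code):

package logbatch

// publish fans one batch of entries out to the live-streaming subscribers.
// A full subscriber channel means that client is not keeping up; instead of
// blocking Write() for everyone, the entries for that subscriber are dropped,
// which may leave gaps in its live view only (stored logs are unaffected).
func publish(entries []*LogEntry, subscribers []chan *LogEntry) {
	for _, sub := range subscribers {
		for _, e := range entries {
			select {
			case sub <- e: // subscriber keeps up
			default: // buffer full: drop rather than block
			}
		}
	}
}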

"it does not enhance the log-storing part"

It solves the problem for file storage, which is likely already about as fast as it's going to get; less so for the database. It no longer takes hours, but saving only 12 megabytes of logs per minute is still not ideal. To be honest, I'm not sure what else can be done about it, besides coarsening the granularity from "1 log line = 1 db row" to "full logs for 1 step = 1 db row" (like Drone does), or "all log lines printed within the same second = 1 db row"; a sketch of the latter idea follows.
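Roughly, that last idea would amount to something like this hypothetical helper (the timeOf callback stands in for however the real entries carry their timestamps):

package logbatch

import "time"

// groupBySecond buckets entries by their timestamp truncated to one second;
// each bucket could then be stored as a single database row.
func groupBySecond(entries []*LogEntry, timeOf func(*LogEntry) time.Time) map[int64][]*LogEntry {
	buckets := make(map[int64][]*LogEntry)
	for _, e := range entries {
		sec := timeOf(e).Truncate(time.Second).Unix()
		buckets[sec] = append(buckets[sec], e)
	}
	return buckets
}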

@anbraten (Member) commented

Speeding up the saving of log entries to the database could be done by inserting batches directly (https://xorm.io/docs/chapter-04/readme/) and, if still necessary, by dropping the auto-incremented primary id and replacing it with something like a UUID, and so on.

However, before adjusting any further code parts, it would be awesome to do some profiling to determine the actual bottlenecks, so we don't complicate the code in unrelated parts.
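For reference, a batched insert with xorm might look roughly like this; the row type and session handling are illustrative stand-ins, not the actual Woodpecker schema:

package logstore

import "xorm.io/xorm"

// logEntryRow is a stand-in for the real log_entries model.
type logEntryRow struct {
	ID     int64 `xorm:"pk autoincr"`
	StepID int64
	Line   int
	Data   []byte
}

// insertBatch writes a whole batch in a single statement: xorm's Insert
// accepts a slice and generates a multi-row INSERT, avoiding one round
// trip per log line.
func insertBatch(sess *xorm.Session, rows []*logEntryRow) error {
	if len(rows) == 0 {
		return nil
	}
	_, err := sess.Insert(&rows)
	return err
}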

@hg (Contributor, Author) commented Aug 19, 2024

Sure, records are already saved in batches, regardless of which type of storage is used. Everything except the database storage is now relatively efficient (with file storage the server can process ~50k log rows/second, compared to ~100-150 rows/second on 2.7.0/next). Fixing the database path requires changes to the storage format (i.e. the log_entries table), which is beyond the scope of this PR.

(Resolved review threads: agent/rpc/client_grpc.go, server/api/stream.go, server/logging/log.go, server/services/log/file/file.go)

@6543 (Member) commented Sep 5, 2024

If you need help resolving the conflict, just ask :)

@6543 added the build_pr_images label ("If set, the CI will build images for this PR and push to Dockerhub") on Sep 5, 2024
@hg (Contributor, Author) commented Sep 7, 2024

No problem, I just haven't had the time lately.

FWIW, I've been using this in production for the past few weeks, and it at least works well… the part that pushes updates to web clients is less than ideal, though.

(Resolved review threads: server/grpc/rpc.go, server/logging/log_test.go)
@6543 (Member) commented Sep 16, 2024

Besides these two nits it looks good to go 🎉

@6543 merged commit 276b279 into woodpecker-ci:main on Sep 18, 2024
6 of 7 checks passed
@6543 (Member) commented Sep 18, 2024

Thanks for the awesome work!!!

@woodpecker-bot mentioned this pull request on Sep 18, 2024 (1 task)
@6543 removed the build_pr_images label on Sep 18, 2024
@6543 (Member) commented Sep 26, 2024

@hg if you want, you could join the maintainers group and/or get some swag :)

@hg (Contributor, Author) commented Sep 26, 2024

@6543

"you could join the maintainers group"

Not sure if I've earned it yet, but if other maintainers don't mind, why not. I don't have a lot of time right now, but I should be able to start working on the project in a week or so.

As for the swag, I live very far away from both the US and Europe, and shipping will cost more than the swag itself, but thank you for the offer.

@6543 (Member) commented Sep 27, 2024

In that case, you might DM me via Matrix (https://matrix.to/#/@marddl:obermui.de) if you find a timeslot, so I can onboard you.

If you don't have an account, just use matrix.org to create one.

Labels: agent, enhancement (improve existing features), server
Projects: None yet
Development: Successfully merging this pull request may close these issues: "Inefficient handling of verbose log output", "woodpecker slowed down"
5 participants