Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podman logs --tail, with json-file, misses partial lines #19545

Closed
edsantiago opened this issue Aug 8, 2023 · 6 comments · Fixed by #19562
Closed

podman logs --tail, with json-file, misses partial lines #19545

edsantiago opened this issue Aug 8, 2023 · 6 comments · Fixed by #19562
Assignees
Labels
flakes Flakes from Continuous Integration kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@edsantiago
Copy link
Member

[ Split off from #18501 ]

Basically, podman does not correctly handle a log like this:

$ cat -vET ~/.local/share/containers/storage/overlay-containers/$cid/userdata/ctr.log
2023-08-07T19:56:34.223758260-06:00 stdout F podman^M$
2023-08-07T19:56:34.223758260-06:00 stdout P podman$
2023-08-07T19:56:34.223816701-06:00 stdout F ^M$             <------ this actually goes with the prior line
2023-08-07T19:56:34.223816701-06:00 stdout F podman^M$
$ bin/podman logs --tail 2 $cid | cat -vET
^M$
podman^M$

Causes flakes in e2e tests, because they use podman run -dt, which is IMHO a bug, except that it let us find this issues, so maybe it's not a bug. (I cleaned up run -it in #18343 but did not touch -dt)

Reproducer:

$ nl="        <----- basically doublequote, enter, doublequote, enter
"
$ cr="^M"      <-------- control-V for quote-next-character, control-M for CR
$ while :;do cid=$(bin/podman run --log-driver json-file -dt quay.io/libpod/testimage:20221018 sh -c "echo podman;echo podman;echo podman");bin/podman wait $cid;logs=$(bin/podman logs --tail 2 $cid); test "$logs" = "podman${cr}${nl}podman${cr}" || break;bin/podman rm $cid;done
@edsantiago edsantiago added the flakes Flakes from Continuous Integration label Aug 8, 2023
@Luap99
Copy link
Member

Luap99 commented Aug 8, 2023

From a quick look at the code it looks like it tries to handle this case but it is clearly no working properly. I take a stab at it.

@Luap99 Luap99 self-assigned this Aug 8, 2023
@Luap99 Luap99 added the kind/bug Categorizes issue or PR as related to a bug. label Aug 8, 2023
@Luap99
Copy link
Member

Luap99 commented Aug 8, 2023

Also FYI, json-file == k8s-file in case you see the same thing there as well

@edsantiago
Copy link
Member Author

This is an easy one; I have a fix in progress, with tests. Give me an hour or two... unless you already have it fixed.

@Luap99
Copy link
Member

Luap99 commented Aug 8, 2023

I have a fix (proper rework to simplify the code) but no tests written yet.

@edsantiago
Copy link
Member Author

I'll let you continue with your fix then, since it's probably better Go than mine. Here's a test, use whatever you like from it.

# #19545
@test "podman logs --tail, json-file with partial lines" {
    cname="tail_container"

    # "-t" gives us ^Ms (CRs) in the log
    run_podman run --name $cname --log-driver json-file -t $IMAGE echo hi

    # Hand-craft a log file with partial lines and carriage returns
    run_podman inspect --format '{{.HostConfig.LogConfig.Path}}' $cname
    logpath="$output"
    timestamp=$(awk '{print $1}' <"$logpath")
    cr=$'\r'
    nl=$'\n'
    cat >| $logpath <<EOF
$timestamp stdout F podman1$cr
$timestamp stdout P podman2
$timestamp stdout F $cr
$timestamp stdout F podman3$cr
EOF

    expect1="podman3${cr}"
    expect2="podman2${cr}${nl}podman3${cr}"
    expect3="podman1${cr}${nl}podman2${cr}${nl}podman3${cr}"

    # This always worked
    run_podman logs --tail 1 $cname
    assert "$output" = "$expect1" "--tail 1"

    # Prior to this PR, the first line would be "^M" without the podman
    run_podman logs --tail 2 $cname
    assert "$output" = "$expect2" "--tail 2"

    # Confirm that we won't overrun
    for i in 3 4 5; do
        run_podman logs --tail $i $cname
        assert "$output" = "$expect3" "--tail $i"
    done

    run_podman rm $cname
}

@Luap99
Copy link
Member

Luap99 commented Aug 8, 2023

Thanks for the test, I will also add unit tests as there seems to be quite a few different cases that all need to be covered.

Also note that the \r is only a symptom which likely results in the partial lines in the first place but the partial line handling is broken no matter what.

Luap99 added a commit to Luap99/libpod that referenced this issue Aug 9, 2023
There is a problem where our tail code does not handles correctly
partial log lines. This makes podman logs --tail output possibly
incorrect lines when k8s-file is used.

This manifests as flake in CI because partial lines are only sometimes
written, basically always when the output is flushed before writing a
newline.

For our code we must not count partial lines which was already done but
the important thing we must keep reading backwards until the next full
(F) line. This is because all partial (P) lines still must be added to
the full line. See the added tests for details on how the log file looks
like.

While fixing this, I rework the tail logic a bit, there is absolutely no
reason to read the lines in a separate goroutine just to pass the lines
back via channel. We can do this in the same routine.
The logic is very simple, read the lines backwards, append lines to
result and then at the end invert the result slice as tail must return
the lines in the correct order. This more efficient then having to
allocate two different slices or to prepend the line as this would
require a new allocation for each line.

Lastly the readFromLogFile() function wrote the lines back to the log
line channel in the same routine as the log lines we read, this was bad
and causes a deadlock when the returned lines are bigger than the
channel size. There is no reason to allocate a big channel size we can
just write the log lines in a different goroutine, in this case the main
routine were read the logs anyway.

A new system test and unit tests have been added to check corner cases.

Fixes containers#19545

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Nov 8, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
flakes Flakes from Continuous Integration kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants