podman-exec tty resize: stty: standard input #10710
Comments
Yikes - the #10688 flake was a triple: |
@edsantiago Are you able to reproduce locally? |
I have not been able to reproduce, because I'm plagued by #10701 on my laptop. |
I'm struggling to see how this could happen - the resize is now happening in a way that can't race. |
I just reproduced on f34, podman-3.2.1-1.fc34 rootless. Passed on retry. |
Reproduced on same f34 system as above, this time in podman-remote rootless. Does the warning message help?
|
OK, so the resize is happening too soon, presumably? That at least makes sense. |
Recommendation: try |
@Luap99 PTAL |
Well you managed to reproduce this issue with @mheon I believe the race is between podman and conmon. The resize call simply writes to a named pipe, and after that podman writes to another pipe to signal conmon that it should start the process. I can imagine a case where conmon reads the start pipe before the resize pipe. Conmon never tells podman whether the resize even worked. |
Could we also be looking at an internal latency in Conmon, where the process is not immediately ready after the start pipe is read? Regardless, this sounds like a reasonable conclusion. I don't see an easy way around it, though - Conmon doesn't tell us that the start signal was received, either, so there's no easy way to synchronize. |
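To make the suspected ordering problem concrete, here is a minimal sketch of the race described in the comments above. It is a toy model, not podman or conmon code: the two deques stand in for the resize pipe and the start pipe, and `size_seen_by_process`, `poll_order`, and the 33x55 size are hypothetical names and values chosen for illustration. The only assumption carried over from the thread is that the two messages travel on separate pipes and that the reader may service them in either order.

```python
from collections import deque


def size_seen_by_process(poll_order):
    """Return the tty size a newly started process would observe.

    poll_order lists which pipe the monitor services on each loop
    iteration ("ctl" for resize requests, "start" for the start signal).
    """
    ctl_pipe = deque()    # stand-in for the resize pipe
    start_pipe = deque()  # stand-in for the start pipe

    # Client side: write the resize first, then the start signal.
    ctl_pipe.append(("resize", 33, 55))
    start_pipe.append("start")

    tty_size = None       # no size applied yet
    size_at_start = None
    # Monitor side: nothing forces the pipes to be read in write order.
    for name in poll_order:
        if name == "ctl" and ctl_pipe:
            _, rows, cols = ctl_pipe.popleft()
            tty_size = (rows, cols)
        elif name == "start" and start_pipe:
            start_pipe.popleft()
            size_at_start = tty_size  # snapshot what the process sees
    return size_at_start


print(size_seen_by_process(["ctl", "start"]))  # (33, 55)
print(size_seen_by_process(["start", "ctl"]))  # None: resize applied too late
```

Writing in the right order on the client side is not enough here, because the ordering guarantee is per pipe and not across pipes.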
Any progress on that? It's a quite consistent and stubborn flake in the main branch at the moment. |
As discussed in containers#10710, the additional checks for podman-exec added by commit 666f555 are extremely flaky and appear in nearly every PR I have seen this week. Let's temporarily disable the checks and reenable them once containers#10710 is fixed. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
I opened #10758 to disable the flaky tests. We need to make sure to reenable the tests once the underlying issue is fixed (I made sure to drop a comment). I usually feel strongly against disabling tests, but the flakes are just too frequent. |
A friendly reminder that this issue had no activity for 30 days. |
@edsantiago @vrothberg Any update on this issue? |
The multiarch testing group is still seeing this. Podman 3.2.3 on s390x; I don't know if it's Fedora or RHEL. If I read the log correctly, August 19. |
Let's wait for the results on v3.3.0. We increased the sizes of the signal buffers in the hope of resolving such issues, and that change made it into v3.3.0. |
A friendly reminder that this issue had no activity for 30 days. |
Well it is a month later, so I am going to assume this is fixed. Reopen if it happens again. |
I've got a 1minutetip VM right now that's reproducing it super-easily. Ping me for access. The "remote" connection (in comments above) was me grasping at straws. I don't know if this merits a new issue. My gut tells me no, that the common factor is "stty is broken", not whether it's via exec or remote. |
My comment is still valid. |
A friendly reminder that this issue had no activity for 30 days. |
I can no longer reproduce this: not with the Is it time to reenable the commented-out |
Ref: containers#10710, a nasty and frequent flake. I can no longer reproduce the failure on f35 or Rawhide, so let's take the risk of reenabling the test. Signed-off-by: Ed Santiago <santiago@redhat.com>
Flaked in the very PR that was going to reintroduce the test. Problem still exists, even if I can't reproduce it on my laptop or a Rawhide VM. |
I guess this can be fixed with conmon-rs in the future. |
A friendly reminder that this issue had no activity for 30 days. |
After a long absence, this is back. rawhide rootless:
|
Yeah at this point I don't think this is ever going to be fixed, we would need a bidirectional channel to get a success response back from conmon and only then start the container. I guess conmon-rs is your best hope here. |
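A minimal sketch of the handshake suggested above, assuming a reply channel existed: the client sends the resize, blocks until the monitor acknowledges it, and only then sends the start signal. This is a toy model built on Python queues, not conmon or conmon-rs code; `monitor`, `exec_with_ack`, and the `"resized"` reply are hypothetical names. The monitor deliberately polls the start queue first to show that, with an acknowledgement, polling order no longer matters.

```python
import queue
import threading


def monitor(ctl, start, ack, result):
    """Toy monitor: applies resizes, acknowledges them, then starts."""
    tty_size = None
    while True:
        # Adversarial polling: always check the start queue first.
        try:
            start.get(timeout=0.01)
            result.append(tty_size)  # size the new process would observe
            return
        except queue.Empty:
            pass
        try:
            _, rows, cols = ctl.get(timeout=0.01)
            tty_size = (rows, cols)
            ack.put("resized")       # reply on the hypothetical back channel
        except queue.Empty:
            pass


def exec_with_ack(rows, cols):
    ctl, start, ack = queue.Queue(), queue.Queue(), queue.Queue()
    result = []
    worker = threading.Thread(target=monitor, args=(ctl, start, ack, result))
    worker.start()
    ctl.put(("resize", rows, cols))
    ack.get(timeout=5)               # wait until the resize is confirmed
    start.put("start")               # only now ask the monitor to start
    worker.join(timeout=5)
    return result[0]


print(exec_with_ack(33, 55))  # (33, 55)
```

The cost of this design is one extra round trip before every start, which is the price of replacing "write and hope" with an explicit confirmation.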
I've seen the stty flake (containers#10710) twice in one day. Time to add a retry. Signed-off-by: Ed Santiago <santiago@redhat.com>
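For readers unfamiliar with the pattern, a retry around a flaky check can be sketched roughly as follows. This is an illustration in Python, not the actual change to the test suite; `retry`, `flaky_size_check`, and the observed values are hypothetical. A retry hides the symptom without removing the underlying race.

```python
import time


def retry(check, attempts=3, delay=0.0):
    """Run check() up to `attempts` times; re-raise the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return check()
        except AssertionError as err:
            last_error = err
            time.sleep(delay)        # give the resize time to land
    raise last_error


calls = {"n": 0}


def flaky_size_check():
    """Made-up check: reports a wrong size on the first call only."""
    calls["n"] += 1
    observed = "0 0" if calls["n"] == 1 else "33 55"
    assert observed == "33 55", f"unexpected tty size: {observed}"
    return observed


print(retry(flaky_size_check))  # 33 55
```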
Fallout from #10683 (tty resize). The test even flaked in that very same PR:
sys: podman detects correct tty size
...and it just triggered in a new PR, also ubuntu-2104.