untarring sonobuoy retrieve tarball sometimes results in EOF in archive error #572

Closed · waynr opened this issue Jan 23, 2019 · 4 comments · Fixed by #584
Assignees: johnSchnake
Labels: kind/bug (Behavior isn't as expected or intended), p2-moderate
Milestone: v1.0.0

waynr commented Jan 23, 2019

What steps did you take and what happened:
I have a script used in CI that:

  • runs sonobuoy
  • waits for its status to become complete
  • retrieves the results tarball into an output directory and lets the user know where it is
  • dumps the e2e test summary to stdout, ie

SUCCESS! -- 136 Passed | 0 Failed | 0 Pending | 860 Skipped 
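(For reference, a minimal sketch of such a script. It is illustrative only: the output directory, the results log path, the grep pattern, and the assumptions that `sonobuoy status` prints "complete" and that `sonobuoy retrieve` prints the local tarball path are mine, not copied from the actual CI script.)

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the CI flow described above; paths and patterns are assumptions.
set -euo pipefail

OUTPUT_DIR=./sonobuoy-output
mkdir -p "${OUTPUT_DIR}"

sonobuoy run

# Wait until the plugins report completion (assumes "complete" appears in the
# `sonobuoy status` output once they are done).
until sonobuoy status | grep -q complete; do
  sleep 30
done

# Retrieve the results tarball (assumes retrieve prints the local tarball path).
TARBALL=$(sonobuoy retrieve "${OUTPUT_DIR}")
echo "Results tarball: ${TARBALL}"

# Extract and print the e2e summary line (results path is an assumption).
tar xzf "${TARBALL}" -C "${OUTPUT_DIR}"
grep -E 'SUCCESS!|FAIL!' "${OUTPUT_DIR}/plugins/e2e/results/e2e.log"
```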

Occasionally the last step fails with the following output:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

This suggests to me that sonobuoy retrieve is not waiting for the tarball to be complete before pulling it locally.

What did you expect to happen:
I expect to be able to untar the tarball and grep files of interest to display results.

Anything else you would like to add:

Seems potentially related to the following two issues:

Environment:

  • Sonobuoy version: v0.13.0
  • Kubernetes version: (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.5", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"clean", BuildDate:"2018-11-26T14:31:35Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes installer & version: kubespray master branch
  • Cloud provider or hardware configuration: openstack private cloud
  • OS (e.g. from /etc/os-release):
 Machine ID:                 d4a9d15df9ad4f12993ffaff95800d0c
 System UUID:                D4A9D15D-F9AD-4F12-993F-FAFF95800D0C
 Boot ID:                    b09c3841-1bba-479e-9fcd-4112497bb6ae
 Kernel Version:             4.14.88-coreos
 OS Image:                   Container Linux by CoreOS 1967.3.0 (Rhyolite)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.6.1
 Kubelet Version:            v1.11.5
 Kube-Proxy Version:         v1.11.5
timothysc (Contributor) commented:

/assign @chuckha

This isn't actually a problem with sonobuoy but an issue with streaming to tar and how messages are propagated.

This has existed for quite some time.

@timothysc timothysc added this to the v1.0.0 milestone Jan 23, 2019

waynr commented Jan 23, 2019

@timothysc can you recommend a workaround to deterministically download a correct results tarball? Maybe just keep downloading and extracting it locally until the tar extract command succeeds?
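Something along these lines is what I'm imagining (a rough sketch only; it assumes `sonobuoy retrieve` prints the local tarball path and treats a failing `tar -tzf` as a sign the archive is still incomplete — the attempt count and sleep interval are arbitrary):

```bash
# Rough sketch of the retry workaround; retry count and sleep are arbitrary.
retrieve_results() {
  local outdir=$1
  local tarball
  for attempt in 1 2 3 4 5; do
    if tarball=$(sonobuoy retrieve "${outdir}") \
        && tar -tzf "${tarball}" > /dev/null 2>&1; then
      echo "${tarball}"
      return 0
    fi
    echo "retrieve attempt ${attempt} failed, retrying in 30s..." >&2
    sleep 30
  done
  return 1
}

# Usage: TARBALL=$(retrieve_results ./sonobuoy-output) || exit 1
```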

johnSchnake (Contributor) commented:

I'm having trouble finding a reference to the streaming tar/msg propagation issue you mention. Could you provide a link or some more details to make it searchable?

I had thought it had to do simply with having a large tarball, so I was using a custom kube-conformance image to test that theory, but I haven't been able to repro locally. It did seem like I would hit this often whenever I was using a cluster in the cloud and had networking/latency to contend with.


johnSchnake commented Feb 8, 2019

In my analysis so far I've been able to repro this, and I think I know the issue, but a sanity check would be good.

It seems the root issue is that the sonobuoy status command relies on the pod annotations for the plugins to report that they are done. When the plugins report done, the status command reports done.

The issue is that once they report done, the master node still compresses everything into a tar.gz which takes time if it is large.

I was able to repro this by using a custom kube-conformance image (schnake/kube-conformance:big) which creates a large results file (1.9G). As soon as the status said complete, I tried to retrieve the results. A few times I just got a missing-file error; the first time it did work, the file was corrupted as reported here.

So it seems like the fix is for the master itself to report an additional status that we can monitor, indicating whether the roundup/prep of the results is actually complete.
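(A rough sketch of that repro sequence, for anyone following along; the validation step at the end is illustrative and just shows where the truncation surfaces.)

```bash
# Rough sketch of the repro described above (assumes the run used a conformance
# image that produces very large results, e.g. schnake/kube-conformance:big).
until sonobuoy status | grep -q complete; do
  sleep 5
done

# Retrieve immediately: while the master is still writing the tar.gz this can
# fail with a missing-file error or return a truncated archive.
tarball=$(sonobuoy retrieve .)

# Validating a truncated archive fails with "unexpected EOF", matching the report.
tar -tzf "${tarball}"
```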

@johnSchnake johnSchnake self-assigned this Feb 10, 2019
@johnSchnake johnSchnake added kind/bug Behavior isn't as expected or intended lifecycle/active Actively being worked on labels Feb 12, 2019
@johnSchnake johnSchnake removed the lifecycle/active Actively being worked on label Feb 20, 2019
noris-bot pushed a commit to noris-network/koris that referenced this issue Nov 6, 2019
A workaround for an issue in sonobuoy: retry fetching the results.

See vmware-tanzu/sonobuoy#572
https://gitlab.noris.net/PI/koris/-/jobs/52929

Also fix a small issue in the code.