Fix copyTo and copyFrom command in K8 class #57

Closed
1 task
leninmehedy opened this issue Feb 22, 2024 · 8 comments · Fixed by #561 or #566
Assignees
Labels
Bug A error that causes the feature to behave differently than what was expected based on design docs P1 High priority issue. Required to be completed in the assigned milestone. released

Comments

@leninmehedy
Member

leninmehedy commented Feb 22, 2024

I think the first thing to try is upgrading all of our tools (Kubernetes, kubectl, Kind, etc.) to their latest versions so that the copy leverages the websocket, which might solve the problem.

K8 › should be able to copy a file to and from a container

    ZlibError: zlib: unexpected end of file

      at Unzip.write (node_modules/minizlib/index.js:154:22)
      at Unzip.flush (node_modules/minizlib/index.js:105:10)
      at Unzip.end (node_modules/minizlib/index.js:111:10)
      at Unpack.end (node_modules/tar/lib/parse.js:544:21)
      at Pipe.end (node_modules/fs-minipass/node_modules/minipass/index.js:75:17)
      at ReadStream.[emitEnd2] (node_modules/fs-minipass/node_modules/minipass/index.js:522:9)
      at ReadStream.[emitEnd] (node_modules/fs-minipass/node_modules/minipass/index.js:507:21)
      at ReadStream.emit (node_modules/fs-minipass/node_modules/minipass/index.js:458:27)
      at ReadStream.emit (node_modules/fs-minipass/index.js:175:22)
      at ReadStream.[maybeEmitEnd] (node_modules/fs-minipass/node_modules/minipass/index.js:440:12)
      at ReadStream.emit (node_modules/fs-minipass/node_modules/minipass/index.js:474:27)
      at ReadStream.emit (node_modules/fs-minipass/index.js:175:22)
      at ReadStream.[resume] (node_modules/fs-minipass/node_modules/minipass/index.js:312:10)
      at Unpack.Pipe.ondrain (node_modules/fs-minipass/node_modules/minipass/index.js:64:37)
      at ReadEntry.<anonymous> (node_modules/tar/lib/parse.js:439:47)
      at ReadEntry.emit (node_modules/tar/node_modules/minipass/index.js:498:23)
      at ReadEntry.[flush] (node_modules/tar/node_modules/minipass/index.js:378:62)
      at ReadEntry.[resume] (node_modules/tar/node_modules/minipass/index.js:337:41)
      at WriteStream.Pipe.ondrain (node_modules/tar/node_modules/minipass/index.js:74:37)
      at WriteStream.emit (node_modules/fs-minipass/index.js:257:18)
      at WriteStream.[_onwrite] (node_modules/fs-minipass/index.js:345:16)
      at node_modules/fs-minipass/index.js:325:21
@leninmehedy leninmehedy added the Bug A error that causes the feature to behave differently than what was expected based on design docs label Feb 22, 2024
@leninmehedy leninmehedy self-assigned this Feb 22, 2024
@leninmehedy leninmehedy added this to Solo Feb 22, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Solo Feb 22, 2024
@leninmehedy
Member Author

leninmehedy commented Feb 29, 2024

Remove async: https://github.com/hashgraph/solo/blob/f4f8527fc30eb8fe8d7900fdf57158119e2f0181/src/core/k8.mjs#L485C11-L488C14

It seems tar is probably failing when we upload a large file such as build.zip (~90 MB) because of a pipe-buffer issue. We could use dd/tee to write to a remote file first and then untar that file.
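A rough sketch of that staging idea, not taken from k8.mjs. The helper name, the temp path, and the destination directory are all hypothetical; the helper only builds the command arrays that would be run inside the container, and how they get executed is left open:

```javascript
// Hypothetical helper: builds the container-side commands for a staged upload.
// receive: dd reads the tar stream from stdin (no `if=` given) and writes it to a temp file.
// unpack:  tar extracts the temp file into destDir once the upload has finished.
// cleanup: removes the temp file afterwards.
function buildStagedUploadCommands(destDir, tmpFile = '/tmp/upload.tar') {
  return {
    receive: ['dd', `of=${tmpFile}`],
    unpack: ['tar', 'xf', tmpFile, '-C', destDir],
    cleanup: ['rm', '-f', tmpFile],
  };
}

// Illustrative destination path only.
const staged = buildStagedUploadCommands('/opt/data');
console.log(staged.receive.join(' '));
console.log(staged.unpack.join(' '));
```

Splitting the upload from the unpack means tar no longer has to consume the stream at the rate the pipe delivers it, which is the suspected failure point here.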

Similar issues: kubernetes/kubernetes#60140

@leninmehedy
Member Author

The error log in this ticket is confusing because copyTo and copyFrom don't use zlib. However, we need to test thoroughly with a large file and find out why copyTo and copyFrom fail.
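For the large-file test, something along these lines could generate the input and verify a round trip. This is only a sketch: the helper names are made up, and the actual copyTo/copyFrom calls are left out because their signatures are not shown in this ticket.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Creates a temp file filled with `sizeBytes` of random data and returns its path.
// Data is written in 1 MiB chunks so memory use stays flat for large sizes.
function makeRandomTempFile(sizeBytes, name = 'big.bin') {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'k8-copy-'));
  const filePath = path.join(dir, name);
  const chunkSize = 1024 * 1024;
  const fd = fs.openSync(filePath, 'w');
  try {
    for (let written = 0; written < sizeBytes; written += chunkSize) {
      const n = Math.min(chunkSize, sizeBytes - written);
      fs.writeSync(fd, crypto.randomBytes(n));
    }
  } finally {
    fs.closeSync(fd);
  }
  return filePath;
}

// Returns the SHA-256 hex digest of a file, reading it chunk by chunk.
function sha256OfFile(filePath) {
  const hash = crypto.createHash('sha256');
  const buf = Buffer.alloc(1024 * 1024);
  const fd = fs.openSync(filePath, 'r');
  try {
    let n;
    while ((n = fs.readSync(fd, buf, 0, buf.length, null)) > 0) {
      hash.update(buf.subarray(0, n));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}
```

A test would call `makeRandomTempFile(90 * 1024 * 1024)`, record `sha256OfFile` of the result, run copyTo followed by copyFrom, and compare the digest of the file that comes back. A mismatch or a short file would point at truncation in the stream rather than at decompression.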

@matteriben
Contributor

matteriben commented Apr 5, 2024

This may have been fixed in the Kubernetes 1.30 release (GA 4/17/24).
kubernetes/kubernetes#60140 (comment)
https://www.kubernetes.dev/resources/release/

Note:

  • This fix was verified in 1.30.0-alpha
  • Both the cluster and kubectl need to use version 1.30
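If we want to guard on that, a check could look roughly like the sketch below. It assumes the input is the parsed output of `kubectl version -o json`, with `clientVersion.gitVersion` and `serverVersion.gitVersion` fields; the function names are hypothetical.

```javascript
// Parses "v1.30.0" or "v1.30.0-alpha.1" into [major, minor]; returns null if unparseable.
function parseMajorMinor(gitVersion) {
  const m = /^v?(\d+)\.(\d+)/.exec(gitVersion ?? '');
  return m ? [Number(m[1]), Number(m[2])] : null;
}

// True when gitVersion is at least major.minor (pre-release suffixes are ignored).
function isAtLeast(gitVersion, major, minor) {
  const v = parseMajorMinor(gitVersion);
  if (!v) return false;
  return v[0] > major || (v[0] === major && v[1] >= minor);
}

// Assumption: `versionInfo` has the shape produced by `kubectl version -o json`.
function clientAndServerAtLeast(versionInfo, major, minor) {
  return (
    isAtLeast(versionInfo?.clientVersion?.gitVersion, major, minor) &&
    isAtLeast(versionInfo?.serverVersion?.gitVersion, major, minor)
  );
}
```

Note that this treats 1.30.0-alpha as 1.30, which matches the note above that the fix was verified on the alpha.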

@jeromy-cannon jeromy-cannon added the P3 Low priority issue. Will not impact the release schedule if not complete. label May 31, 2024
@jeromy-cannon
Contributor

A few more notes. 1.30+ is available. I think we would need to update the Kind version, update the readme, and also update the Kubernetes Node client.

We could also add retry logic, here is the source code of the Java libraries:

@jeromy-cannon jeromy-cannon added P2 Required to be completed in the assigned milestone, but may or may not impact release schedule. P1 High priority issue. Required to be completed in the assigned milestone. and removed P3 Low priority issue. Will not impact the release schedule if not complete. P2 Required to be completed in the assigned milestone, but may or may not impact release schedule. labels Aug 22, 2024
@jeromy-cannon
Contributor

@instamenta, on the copyFrom, just drop the z from the tar. We'll merge that and close this issue, then we can reopen it if the problem starts happening again.
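To illustrate what dropping the z means, here is a sketch with a hypothetical builder for the container-side command. The real command in k8.mjs may be shaped differently, and the paths are examples only.

```javascript
// Hypothetical builder for the tar command that copyFrom runs inside the container.
// With compress=false the archive is streamed to stdout uncompressed ("cf");
// with compress=true it is gzip-compressed first ("zcf").
function buildTarCreateCommand(srcDir, srcFile, { compress = false } = {}) {
  const flags = compress ? 'zcf' : 'cf';
  return ['tar', flags, '-', '-C', srcDir, srcFile];
}

console.log(buildTarCreateCommand('/opt/logs', 'app.log').join(' '));
console.log(buildTarCreateCommand('/opt/logs', 'app.log', { compress: true }).join(' '));
```

Without gzip in the stream, a truncated transfer can no longer surface as a zlib "unexpected end of file" error, although the underlying truncation may still need handling.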

@jeromy-cannon
Contributor

jeromy-cannon commented Sep 10, 2024

If we delay, 1.30.0 deployments to GKE and Latitude will make this issue go away. Potential future solutions, if we need to reopen this ticket:

  1. Tar the file first, then send the bytes back/down and decompress locally (the current version tars with compression and pipes to stdout, which fills the buffer and creates instability)
  2. Make a list of files and copy each one individually, one at a time (without tar)
  3. Add retry and resume logic (resume if multiple files)
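Options 2 and 3 could be combined roughly as sketched below. This is only an outline under assumptions: `copyOne` stands in for whatever single-file copy we end up with, and the function name, option names, and defaults are invented for illustration.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Copies files one at a time, retrying each failure with exponential backoff.
// `copyOne(file)` is a hypothetical async function that copies a single file
// and throws on failure. `done` records finished files, so passing the same
// Set to a later call resumes without re-copying them.
async function copyWithRetryAndResume(files, copyOne, {
  done = new Set(),
  maxAttempts = 3,
  baseDelayMs = 500,
} = {}) {
  for (const file of files) {
    if (done.has(file)) continue; // resume: skip files that were already copied
    for (let attempt = 1; ; attempt++) {
      try {
        await copyOne(file);
        done.add(file);
        break;
      } catch (err) {
        if (attempt >= maxAttempts) throw err; // give up on this file
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  return done;
}
```

If a run fails partway, the caller can keep the `done` set and call the function again with the same file list; only the files that were not copied are attempted.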

@jeromy-cannon
Contributor

a different error, but copyFrom is failing more frequently

@jeromy-cannon jeromy-cannon reopened this Sep 11, 2024
@jeromy-cannon jeromy-cannon moved this from ✅ Done to 🏗 In progress in Solo Sep 11, 2024
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Solo Sep 12, 2024
@swirlds-automation
Contributor

🎉 This issue has been resolved in version 0.30.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Projects
Status: ✅ Done
5 participants