ARM CI failing #169

Closed
Pennyzct opened this issue Jun 3, 2019 · 10 comments
Labels
bug Incorrect behaviour

Comments

@Pennyzct
Contributor

Pennyzct commented Jun 3, 2019

So sorry for the disturbance caused by the new ARM CI failures. 😭😭
stderr: qemu-system-aarch64: This host does not support 42b IPA: maxram/slots options not usable
I forgot to update the kernel to the stable v5.0.x series, which is needed to support new features such as NVDIMM.
I will update the kernel ASAP. ;)
I'm opening this issue as a heads-up in case this blocks PRs from being merged.
@jodh-intel @grahamwhaley @chavafg @devimc
runtime#1660 tests/#1646
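
For anyone checking a node by hand, a minimal sketch of a kernel version check follows. The 5.0 minimum is taken from the comment above; the helper name `kernel_at_least` is hypothetical and is not part of the kata-containers CI scripts.

```shell
#!/bin/bash
# Hypothetical helper: succeed if kernel version $1 is at least $2.
# Any "-suffix" such as "-50-generic" is dropped before comparing.
kernel_at_least() {
    local have="${1%%-*}" want="$2"
    # sort -V (GNU coreutils) orders version strings; if the smaller of the
    # two is the wanted minimum, the given kernel is new enough.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

if kernel_at_least "$(uname -r)" "5.0"; then
    echo "kernel $(uname -r) is new enough"
else
    echo "kernel $(uname -r) is older than 5.0" >&2
fi
```

This only compares version strings; it does not probe whether QEMU can actually use the maxram/slots options on the host.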

@Pennyzct Pennyzct added the bug Incorrect behaviour label Jun 3, 2019
@devimc

devimc commented Jun 3, 2019

thanks @Pennyzct

@chavafg
Contributor

chavafg commented Jun 4, 2019

Hi @Pennyzct
We are now also seeing this issue on the arm nodes:

10:10:36 Building remotely on arm02_slave (arm-ubuntu-1804 arm_node) in workspace /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR
10:10:36 [WS-CLEANUP] Deleting project workspace...
10:10:36 [WS-CLEANUP] Deferred wipeout is used...
10:11:38 ERROR: [WS-CLEANUP] Cannot delete workspace: Unable to delete '/home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
10:11:39 ERROR: Cannot delete workspace: Unable to delete '/home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
10:11:39 Performing Post build task...
10:11:39 Match found for :.* : True
10:11:39 Logical operation result is TRUE
10:11:39 Running script  : #!/bin/bash
10:11:39 
10:11:39 export GOPATH=$WORKSPACE/go
10:11:39 export GOROOT="/usr/local/go"
10:11:39 export PATH=${GOPATH}/bin:/usr/local/go/bin:/usr/sbin:/usr/local/bin:${PATH}
10:11:39 
10:11:39 cd $GOPATH/src/github.com/kata-containers/tests
10:11:39 .ci/teardown.sh "$WORKSPACE/artifacts"
10:11:39 
10:11:39 # And ensure the workspace tree is all owned by us before we quit, otherwise the later
10:11:39 # `delete workspace before run` may fail on file perms (as some tests leave things owned by root).
10:11:39 sudo chown -R ${USER} ${WORKSPACE}
10:11:39 sudo chgrp -R $(id -ng) ${WORKSPACE}
10:11:39 FATAL: Unable to produce a script file
10:11:39 java.io.IOException: Read-only file system
10:11:39 	at java.io.UnixFileSystem.createFileExclusively(Native Method)
10:11:39 	at java.io.File.createTempFile(File.java:2090)
10:11:39 	at hudson.FilePath$CreateTextTempFile.invoke(FilePath.java:1466)

@grahamwhaley
Contributor

I think the first part of that error (cannot remove workspace) is probably a symptom of the last part (Unable to produce a script file) - as that chown/chgrp script's job is to ensure the perms on the filesystem are OK in order for the workspace to be deleted on the next run.

Now, why does it think it is trying to write to a read-only filesystem? Maybe it ran out of space, or maybe we got a messed-up WORKSPACE var or something. Most odd.
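
One way to narrow down those guesses on the node itself is sketched below. It assumes `findmnt` (util-linux) and `df` are available on the host; the helper name `has_ro_flag` is hypothetical, and nothing here is taken from the Jenkins job configuration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a comma-separated mount option string
# (e.g. "ro,relatime") contains the "ro" flag as a whole option.
has_ro_flag() {
    case ",$1," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Directory to inspect; falls back to the current directory if WORKSPACE is unset.
dir="${WORKSPACE:-.}"

# findmnt prints the mount options of the filesystem that holds $dir.
opts="$(findmnt -no OPTIONS --target "$dir" 2>/dev/null || true)"
if has_ro_flag "$opts"; then
    echo "$dir is on a read-only mount: $opts"
fi

# df shows how much space and how many inodes are left on that filesystem.
df -h "$dir"
df -i "$dir"
```

Checking for `ro` as a whole option avoids a false positive on settings such as `errors=remount-ro`, which only says what the kernel will do after an error.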

@jodh-intel
Contributor

Haven't we seen that sort of error for ppc64 in the past? /cc @nitkon.

@Pennyzct
Contributor Author

Pennyzct commented Jun 6, 2019

Hi~ @chavafg @grahamwhaley @jodh-intel
arm-testing-1(147.75.95.22) has already been upgraded. It should be working well. ;)

root@arm-testing-1:~# uname -r
5.1.7

I will upgrade the other machine ASAP and keep you updated. ;)

@jodh-intel
Contributor

\o/ ;)

@Pennyzct
Contributor Author

Pennyzct commented Jun 6, 2019

Hi~ @chavafg @grahamwhaley @jodh-intel
arm-testing-2(147.75.107.110) has also been upgraded.

root@arm-testing-2:~# uname -r
5.1.7

All Arm CI nodes should be working well. ;) Let's find a PR to test it.

@chavafg
Contributor

chavafg commented Jun 6, 2019

Hi @Pennyzct, thanks for having them ready.

I see now that the nodes are able to run kata, but the jobs now seem to fail consistently on the dmesg log test from the docker suite.

09:22:27 • Failure [6.683 seconds]
09:22:27 check dmesg logs errors
09:22:27 /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:277
09:22:27   Run to check dmesg log errors
09:22:27   /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:294
09:22:27     should be empty [It]
09:22:27     /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:295
09:22:27 
09:22:27     Expected
09:22:27         <string>: [    0.000000] systemd[61]: chronyd.service: Failed to set up mount namespacing: No such file or directory
09:22:27         [    0.000000] systemd[61]: chronyd.service: Failed at step NAMESPACE spawning /usr/sbin/chronyd: No such file or directory
09:22:27     to be empty

From: http://jenkins.katacontainers.io/job/kata-containers-runtime-ARM-18.04-PR/725/console

Any idea? Maybe a change that was added while the nodes were having issues?

@Pennyzct
Contributor Author

Hi @chavafg
I think the dmesg log test from the docker suite fails because the chronyd service does not work properly in the guest on AArch64, since kvm-ptp is not yet supported on AArch64.
We had a brief discussion in issue runtime/#1279 to clarify the current status on AArch64.
I also noticed that @amshinde has already pushed a PR, osbuilder/#265, to fix this problem by having chrony run only if the device /dev/ptp0 created by kvm-ptp exists.
I have tested this PR on AArch64, and it should resolve the dmesg log test failure.

$ ./ginkgo -failFast -v -focus "check dmesg logs errors" ./integration/docker/ -- -runtime=kata-runtime -timeout=120
......
• [SLOW TEST:7.535 seconds]
check dmesg logs errors
/root/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:277
  Run to check dmesg log errors
  /root/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:294
    should be empty
    /root/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:295
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Ran 1 of 249 Specs in 146.651 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 248 Skipped PASS
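
The idea described above, running chrony only when /dev/ptp0 exists, can be illustrated with a small shell guard. This is a sketch under assumptions and is not taken from osbuilder/#265; `ptp_available` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the PTP device node exists.
# The path is a parameter so the check can be tried without real hardware.
ptp_available() {
    local dev="${1:-/dev/ptp0}"
    [ -e "$dev" ]
}

if ptp_available; then
    echo "/dev/ptp0 present: chronyd can use it as a time source"
else
    echo "/dev/ptp0 missing: skip starting chronyd"
fi
```

In a systemd unit, a comparable condition could presumably be expressed with `ConditionPathExists=/dev/ptp0`, though whether the PR takes that route is not stated in this thread.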

@Pennyzct
Contributor Author

Hi~ guys @chavafg @grahamwhaley @jodh-intel
ARM CI is working well now~ 🎉🎊
As proof, the new PR runtime/#1823 is shining green (/ω\)
I will close this issue now~
