Frequent bad error MAC failures on ubuntu1604-arm64 #842

Closed
Trott opened this issue Aug 20, 2017 · 19 comments

@Trott
Member

Trott commented Aug 20, 2017

https://ci.nodejs.org/job/node-test-commit-arm/11637/nodes=ubuntu1604-arm64/console

FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.203.102/147.75.203.102:51858' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)

https://ci.nodejs.org/job/node-test-commit-arm/11642/nodes=ubuntu1604-arm64/console:

FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:48094' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)

https://ci.nodejs.org/job/node-test-commit-arm/11634/nodes=ubuntu1604-arm64/console

FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:41714' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)
@rvagg
Member

rvagg commented Aug 20, 2017

I think we should hand out badges for finding new errors. This one's a doozy so well done @Trott.

This machine is in their Sunnyvale datacenter, I think that's their original one and we've had a bunch of problems in there. I'm going to shut this machine down and reprovision elsewhere.

@rvagg rvagg closed this as completed Aug 20, 2017
@rvagg
Member

rvagg commented Aug 21, 2017

all done

@Trott
Member Author

Trott commented Aug 21, 2017

Unfortunately, that doesn't look like it fixed it. It's still happening.

https://ci.nodejs.org/job/node-test-commit-arm/11659/nodes=ubuntu1604-arm64/console

gyp FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.105.54/147.75.105.54:48564' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)

https://ci.nodejs.org/job/node-test-commit-arm/11654/nodes=ubuntu1604-arm64/console

FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:56790' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)

@Trott Trott reopened this Aug 21, 2017
@rvagg
Member

rvagg commented Aug 22, 2017

OK, that's pretty strange, entirely new host in a different DC so must be the software stack. What I've done on this host (only) is install the Oracle JDK 8 and uninstalled the OpenJDK 8. Cross fingers I guess.
Best leave this open and monitor, if this works then we'll need to do the same on the matching machine. If it doesn't work then ... I dunno! Will find some knob to twiddle I guess.

@Trott
Member Author

Trott commented Aug 25, 2017

Here's the most recent one. Not sure if this is on a machine we hope is fixed or a machine that we will apply the fix to?

https://ci.nodejs.org/job/node-test-commit-arm/11730/nodes=ubuntu1604-arm64/console

FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:50334' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
	at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
	at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
	at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:405)

@refack
Contributor

refack commented Oct 14, 2017

@Trott has this cropped up recently?

@Trott
Member Author

Trott commented Oct 14, 2017

@refack Not that I've noticed.

@Trott Trott closed this as completed Oct 14, 2017
@joyeecheung
Member

Similar issues showed up on ubuntu1604-arm64 again, this time it's in the test phase:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14299/console

not ok 1270 parallel/test-regress-GH-1531
  ---
  duration_ms: 1.60
  severity: fail
  stack: |-
    listening
    Error: 281472750350336:error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac:../deps/openssl/openssl/ssl/s3_pkt.c:535:
    
  ...

@joyeecheung joyeecheung reopened this Mar 3, 2018
@joyeecheung
Member

cc @rvagg

@rvagg
Member

rvagg commented Mar 4, 2018

Hah, that's pretty weird! afaik Java doesn't use OpenSSL so we're looking at something deeper here.
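For readers unfamiliar with the error itself: a "bad record MAC" means the receiver recomputed the integrity tag for a record and it did not match the tag that arrived. The sketch below is a deliberately simplified illustration of that check using HMAC-SHA256. It is not the real TLS record layout, the function names (`seal`, `open_record`, `flip_bit`) are hypothetical, and it says nothing about where the corruption on these machines actually comes from. It only shows that a single flipped bit anywhere in the record is enough to trigger the failure, regardless of which TLS library does the checking.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def seal(key, payload):
    """Append an HMAC-SHA256 tag to payload (simplified stand-in for a record)."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def open_record(key, record):
    """Return the payload, or raise ValueError if the tag does not verify."""
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison, as a real implementation would use.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad record MAC")
    return payload


def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted, simulating corruption."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)
```

Calling `open_record(key, seal(key, b"hello"))` returns the payload, while passing the same record through `flip_bit` first raises `ValueError`.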

@nodejs/crypto if you're looking for an interesting challenge, this might be for you.

For this machine, on top of build 14299 these ones have failed with the same error:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14307/
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14317/
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14268/

These ones have failed with different crypto errors:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14300/

not ok 260 parallel/test-crypto-binary-default
  ---
  duration_ms: 1.122
  severity: fail
  stack: |-
    (node:65147) [DEP0091] DeprecationWarning: crypto.DEFAULT_ENCODING is deprecated.
    assert.js:74
      throw new AssertionError(obj);
      ^
    
    AssertionError [ERR_ASSERTION]: '�æ�_]\u0013G\u000e;F§�Ù\u0013e+¸\u000fc�' strictEqual 'ÅäxÕ��ÈAªS\r¶�\\L��(� '
        at common.mustCall (/home/iojs/build/workspace/node-test-commit-arm/nodes/ubuntu1604-arm64/test/parallel/test-crypto-binary-default.js:685:12)
        at /home/iojs/build/workspace/node-test-commit-arm/nodes/ubuntu1604-arm64/test/common/index.js:467:15
        at PBKDF2.next [as ondone] (internal/crypto/pbkdf2.js:83:7)

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14284/console

not ok 968 parallel/test-https-agent-session-reuse
  ---
  duration_ms: 1.4
  severity: fail
  stack: |-
    events.js:116
          throw er; // Unhandled 'error' event
          ^
    
    Error: 281473003442176:error:140940F6:SSL routines:ssl3_read_bytes:unknown alert type:../deps/openssl/openssl/ssl/s3_pkt.c:1508:

The other Ubuntu 16.04 ARM64 machine in CI is much more green but isn't free of crypto failures, there's this one:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14194/

not ok 1043 parallel/test-https-host-headers
  ---
  duration_ms: 1.101
  severity: fail
  stack: |-
    test https server listening on port 45571
    Got request: localhost:45571 /0
    /home/iojs/build/workspace/node-test-commit-arm/nodes/ubuntu1604-arm64/test/parallel/test-https-host-headers.js:32
      throw er;
      ^
    
    Error: 281472848347136:error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac:../deps/openssl/openssl/ssl/s3_pkt.c:535:

And these two that are different again:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14316/

not ok 1034 parallel/test-https-agent
  ---
  duration_ms: 1.339
  severity: fail
  stack: |-
    /home/iojs/build/workspace/node-test-commit-arm/nodes/ubuntu1604-arm64/test/parallel/test-https-agent.js:62
              throw e;
              ^
    
    Error: write EPROTO 281473430319104:error:1409441B:SSL routines:ssl3_read_bytes:tlsv1 alert decrypt error:../deps/openssl/openssl/ssl/s3_pkt.c:1500:SSL alert number 51
    281473430319104:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake failure:../deps/openssl/openssl/ssl/s3_pkt.c:659:
    
        at WriteWrap.afterWrite [as oncomplete] (net.js:866:14)

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14164/

not ok 330 parallel/test-crypto-dh-leak
  ---
  duration_ms: 1.289
  severity: fail
  stack: |-
    assert.js:243
        throw err;
        ^
    
    AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
    
      assert(after - before < 5 << 20)
    
        at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-arm/nodes/ubuntu1604-arm64/test/parallel/test-crypto-dh-leak.js:26:1)
        at Module._compile (module.js:666:30)
        at Object.Module._extensions..js (module.js:677:10)
        at Module.load (module.js:577:32)
        at tryModuleLoad (module.js:517:12)
        at Function.Module._load (module.js:509:3)
        at Function.Module.runMain (module.js:707:10)
        at startup (bootstrap_node.js:196:16)
        at bootstrap_node.js:706:3

And I can't find anything like this in the CentOS 7 ARM64 builds, so it's limited to Ubuntu 16.04. I don't know how to interpret that, because it's the same OpenSSL being compiled on both. Compiler difference perhaps? We're on gcc 4.8.5 for CentOS 7 and 5.4.0 for Ubuntu 16.04.

Not sure where to take this next tbh.
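To compare failures across builds, it may help to tally the OpenSSL error lines by reason. The following is a minimal sketch, assuming the colon-separated layout visible in the logs above (thread id, `error`, hex code, library, function, reason, file, line). The split of the hex code into library, function, and reason numbers assumes the OpenSSL 1.0.x packing (8, 12, and 12 bits); the helper names are hypothetical.

```python
import re
from collections import Counter

# Layout of an OpenSSL 1.0.x error-queue line as it appears in the CI logs:
#   <thread id>:error:<hex code>:<library>:<function>:<reason>:<file>:<line>:
OPENSSL_ERROR = re.compile(
    r"(?P<thread>\d+):error:(?P<code>[0-9A-Fa-f]{8}):"
    r"(?P<library>[^:]+):(?P<function>[^:]+):(?P<reason>[^:]+):"
    r"(?P<file>[^:]+):(?P<line>\d+):"
)


def parse_openssl_errors(log_text):
    """Return one dict per OpenSSL error entry found in log_text."""
    entries = []
    for match in OPENSSL_ERROR.finditer(log_text):
        entry = match.groupdict()
        code = int(entry["code"], 16)
        # Assumed OpenSSL 1.0.x packing: 8 bits library, 12 function, 12 reason.
        entry["lib_num"] = (code >> 24) & 0xFF
        entry["func_num"] = (code >> 12) & 0xFFF
        entry["reason_num"] = code & 0xFFF
        entries.append(entry)
    return entries


def tally_reasons(log_text):
    """Count how often each reason string occurs in log_text."""
    return Counter(entry["reason"] for entry in parse_openssl_errors(log_text))


# Three error lines copied from the failures quoted in this thread.
SAMPLE = """\
Error: 281472750350336:error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac:../deps/openssl/openssl/ssl/s3_pkt.c:535:
Error: 281473003442176:error:140940F6:SSL routines:ssl3_read_bytes:unknown alert type:../deps/openssl/openssl/ssl/s3_pkt.c:1508:
Error: 281472848347136:error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac:../deps/openssl/openssl/ssl/s3_pkt.c:535:
"""
```

Running `tally_reasons` over the saved console output of several builds would show whether one reason dominates or whether the failures are spread across unrelated code paths.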

@shigeki

shigeki commented Mar 6, 2018

Can I get a login on the machine? I do not have an arm64 machine.
OpenSSL assembler files are platform dependent, and they are generated on Linux during the OpenSSL upgrade process. I've never tested them on a real ARM64 server, so there might be some issues.

@rvagg
Member

rvagg commented Mar 6, 2018

yep @shigeki, root@147.75.74.174 has your github keys in it. That's the machine that throws these errors up most frequently. I was just in there running parallel/test-regress-GH-1531 in a loop and was getting failures roughly every 200 times. I even got an SSH authentication failure when I tried to log in to it the first time today .. I'm not sure if that's the same but it's certainly fishy.
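A failure that shows up roughly once in 200 runs is easiest to study with a small loop harness. The following is a minimal sketch of one, not the loop that was actually used here. `run_repeatedly` is a hypothetical helper, and the command at the bottom is a stand-in so the sketch runs anywhere.

```python
import subprocess
import sys


def run_repeatedly(command, iterations):
    """Run command `iterations` times; return the 1-based indexes of failing runs."""
    failures = []
    for i in range(1, iterations + 1):
        # Capture output so a long loop does not flood the terminal.
        result = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        if result.returncode != 0:
            failures.append(i)
    return failures


# Stand-in command that always exits 0; replace it with the real test invocation.
print(run_repeatedly([sys.executable, "-c", "pass"], 3))
```

On the CI host the command would be something like `["./node", "test/parallel/test-regress-GH-1531.js"]` (binary location and test path assumed from the workspace layout in the logs), with `iterations` set high enough to expect a handful of failures.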

@shigeki

shigeki commented Mar 6, 2018

Thanks. I can log in now. I will investigate the issue.

@shigeki

shigeki commented Mar 6, 2018

I built openssl-1.0.2n and node and ran the tests.
The OpenSSL tests passed, and node test-ci had some flaky errors, but none related to crypto or TLS.

I also made thousands of TLS connections between a node tls server and client, but saw no errors, so I could not reproduce the failures.

As far as I checked, the openssl assembler files in node and in openssl-1.0.2n are the same size, as shown below.

root@test-packetnet-ubuntu1604-arm64-2:/var/tmp/shigeki/openssl-1.0.2n# find crypto -mtime 0 -name  '*.S' |xargs ls -l
-rw-r--r-- 1 root root 14899 Mar  6 05:14 crypto/aes/aesv8-armx.S
-rw-r--r-- 1 root root  6496 Mar  6 05:14 crypto/modes/ghashv8-armx.S
-rw-r--r-- 1 root root 27982 Mar  6 05:13 crypto/sha/sha1-armv8.S
-rw-r--r-- 1 root root 31754 Mar  6 05:13 crypto/sha/sha256-armv8.S
-rw-r--r-- 1 root root 28126 Mar  6 05:13 crypto/sha/sha512-armv8.S
root@test-packetnet-ubuntu1604-arm64-2:/var/tmp/shigeki/node/deps/openssl/asm# find arm64-linux64-gas -name '*.S' |xargs ls -l
-rw-r--r-- 1 root root 14899 Mar  6 04:56 arm64-linux64-gas/aes/aesv8-armx.S
-rw-r--r-- 1 root root  6496 Mar  6 04:56 arm64-linux64-gas/modes/ghashv8-armx.S
-rw-r--r-- 1 root root 27982 Mar  6 04:56 arm64-linux64-gas/sha/sha1-armv8.S
-rw-r--r-- 1 root root 31754 Mar  6 04:56 arm64-linux64-gas/sha/sha256-armv8.S
-rw-r--r-- 1 root root 28126 Mar  6 04:56 arm64-linux64-gas/sha/sha512-armv8.S

There may be other causes for this issue.
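Matching sizes do not rule out differing contents, so a stricter version of the check above would compare hashes. The following is a minimal sketch under a few assumptions: files are paired by base name (which works for the five `.S` files listed, whose names are unique), and `compare_trees` and `sha256_of` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_trees(left_root, right_root, pattern="*.S"):
    """Map each file name under left_root to 'same', 'differs', or 'missing'."""
    # Assumption: base names are unique within each tree.
    right_by_name = {p.name: p for p in Path(right_root).rglob(pattern)}
    report = {}
    for left in sorted(Path(left_root).rglob(pattern)):
        right = right_by_name.get(left.name)
        if right is None:
            report[left.name] = "missing"
        elif sha256_of(left) == sha256_of(right):
            report[left.name] = "same"
        else:
            report[left.name] = "differs"
    return report
```

For the two directories in the listing, that would be `compare_trees("crypto", "arm64-linux64-gas")`, each run from its respective parent directory or given as absolute paths.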

@vielmetti

@rvagg can you confirm the IDs of these systems under test that are throwing errors? I will double check the firmware on those systems.

The output of uname -a on the affected machines would be useful too.

@vielmetti

Also @rvagg feel free to spin up an additional machine just to loop parallel/test-regress-GH-1531, the one that fails 1 in 200 times. The systems you have are 32GB machines, so grab a 128GB machine as well and see if an otherwise unloaded different system will exhibit the same failures. Would love to have a reproducer that's not on your production environment.

@neemah

neemah commented Mar 12, 2018

Hi, we are seeing the SSL3_GET_RECORD error with Node 9.5.0 and 9.8.0 on the Heroku Cedar-14 stack (Ubuntu 14.04).

It's not frequent, but it happened a couple of times during the previous week.

[Screenshot: screen shot 2018-03-12 at 15 27 31]

@vielmetti

I'll note this post which reports a similar set of issues to what @neemah reported re Heroku.

https://serverfault.com/questions/859987/im-getting-error-ssl3-get-recorddecryption-failed-or-bad-record-mac

@maclover7
Contributor

These machines look to be doing better these days, so I'm going to close out this issue. Please reopen if this requires further action.

8 participants