Conversation

@charliechen211

What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-20608

How was this patch tested?

Spark-submit configuration: spark.yarn.access.namenodes=hdfs://namenode01,hdfs://namenode02
Spark application code:
dataframe.write.parquet(getActiveNameNode(...) + hdfsPath)

Before this patch:
Exception in thread "main" java.lang.reflect.InvocationTargetException
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1691)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1322)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:7079)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:505)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getDelegationToken(AuthorizationProviderProxyClientProtocol.java:637)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:957)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1623)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy14.getDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:901)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.getDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:988)
at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1316)
at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:529)
at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:507)
at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2002)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:135)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
at scala.collection.immutable.Set$Set3.foreach(Set.scala:115)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:131)
at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:701)
at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:730)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:833)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1119)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
... 7 more

After applying this patch, the same job submits and writes successfully.

Change-Id: Id0eedfbd594b24d2a3c283a9b5febdb6042c4dd1
@AmplabJenkins

Can one of the admins verify this patch?

@jerryshao
Contributor

Why not submit a PR against the master branch?

From my understanding, your patch is trying to catch the exception and continue to get tokens from the other filesystems, right?

@charliechen211
Author

@jerryshao Why not submit a PR against the master branch?
Sorry, I didn't find the yarn module in the master branch. If this patch can be accepted, I will spend some time reading the code in the master branch. So far, I have applied this patch to Spark 2.0.1 and 2.1.0 in our compiled Spark builds.

From my understanding, your patch is trying to catch the exception and continue to get tokens from the other filesystems, right?
Yep, that's right. I don't think a StandbyException should surface as a RuntimeException here, and skipping the standby namenode has no bad effect.
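The catch-and-continue behavior described above can be sketched as follows. This is a minimal illustration, not Spark's actual code: `StandbyException` here is a local stand-in for Hadoop's org.apache.hadoop.ipc.StandbyException, and `fetchDelegationToken` is a hypothetical stub standing in for `FileSystem.addDelegationTokens`.

```scala
// Sketch: fetch a delegation token from each configured namenode; when one
// is in standby state, log and skip it instead of failing the submission.
object TokenSketch {
  // Local stand-in for org.apache.hadoop.ipc.StandbyException.
  class StandbyException(msg: String) extends RuntimeException(msg)

  // Hypothetical stub: namenode01 pretends to be standby, namenode02 is active.
  def fetchDelegationToken(namenode: String): String =
    if (namenode == "hdfs://namenode01")
      throw new StandbyException("Operation category WRITE is not supported in state standby")
    else
      s"token-for-$namenode"

  // Collect tokens from every reachable namenode, skipping standby ones.
  def obtainTokens(namenodes: Seq[String]): Seq[String] =
    namenodes.flatMap { nn =>
      try Some(fetchDelegationToken(nn))
      catch {
        case e: StandbyException =>
          println(s"Namenode $nn is in state standby, skipping: ${e.getMessage}")
          None
      }
    }
}
```

With both namenodes configured, `obtainTokens` returns only the active namenode's token instead of aborting the whole submission with an InvocationTargetException, which matches the before/after behavior shown in the test description above.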

@srowen
Member

srowen commented May 5, 2017

    logWarning(s"Namenode ${dst} is in state standby", e)
  case e: RemoteException =>
    logWarning(s"Namenode ${dst} is in state standby", e)
}

HADOOP-13372 implies that swift:// throws an UnknownHostException here; best to catch & log too, in case someone adds swift:// to the list of filesystems.
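A broader catch along the lines suggested here might look like the sketch below. `safeFetch` and its signature are illustrative, not the actual Spark code; the point is that an unresolvable host (as HADOOP-13372 describes for swift://) is logged and skipped the same way a standby namenode is.

```scala
import java.net.UnknownHostException

object BroaderCatch {
  // Wrap a token fetch so that both an unresolvable host (e.g. a bad
  // swift:// endpoint, per HADOOP-13372) and a standby-style runtime
  // failure are logged and skipped rather than propagated.
  def safeFetch(fsUri: String, fetch: String => String): Option[String] =
    try Some(fetch(fsUri))
    catch {
      case e: UnknownHostException =>
        println(s"Cannot resolve host for $fsUri, skipping: ${e.getMessage}")
        None
      case e: RuntimeException =>
        println(s"Token fetch failed for $fsUri, skipping: ${e.getMessage}")
        None
    }
}
```

Keeping the handlers narrow (specific exception types rather than a blanket `case e: Throwable`) avoids hiding genuinely fatal errors while still letting token collection continue past one bad filesystem.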

@charliechen211
Author

@srowen thanks. See PR in master branch: #17872

@srowen
Member

srowen commented May 5, 2017

You need to close this one @morenn520

@charliechen211 charliechen211 changed the title [SPARK-20608] allow standby namenodes in spark.yarn.access.namenodes [SPARK-20608] allow standby namenodes in spark.yarn.access.namenodes(@deprecated) May 5, 2017