Poor Hadoop exception handling/logging #843

Open
tanoshko opened this issue Dec 21, 2017 · 2 comments

tanoshko commented Dec 21, 2017

Issue

Various Hadoop-related errors occur periodically. Because of poor logging, it is difficult to investigate what happened and whether the failure is critical for the test result.

Please see examples below.

12:32:52 ERROR 12:32:52,095 [Thread-0] FSNamesystem | FSNamesystem initialization failed.
12:32:52 org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/macys/runned_jagger/jaggerworkspace/master/storage/hdfs/namedir is in an inconsistent state: storage directory does not exist or is not accessible.
12:32:52 at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
12:32:52 at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
12:32:52 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
12:32:52 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
12:32:52 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)


12:44:10 ERROR 12:44:10,585 [Thread-0] DFSClient | Exception closing file /user/macys/24/task-2/DURATION/aggregated.dat : java.io.IOException: Call to nfr-customer-jagger.c.ace-tranquility-749.internal/172.17.1.235:8020 failed on local exception: java.io.EOFException
12:44:10 java.io.IOException: Call to nfr-customer-jagger.c.ace-tranquility-749.internal/172.17.1.235:8020 failed on local exception: java.io.EOFException
12:44:10 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
12:44:10 at org.apache.hadoop.ipc.Client.call(Client.java:1075)
12:44:10 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
12:44:10 at com.sun.proxy.$Proxy36.complete(Unknown Source)
12:44:10 at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
12:44:10 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
12:44:10 at java.lang.reflect.Method.invoke(Method.java:497)
12:44:10 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
12:44:10 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
12:44:10 at com.sun.proxy.$Proxy36.complete(Unknown Source)
12:44:10 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3897)
12:44:10 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3812)
12:44:10 at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1345)
12:44:10 at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:275)
12:44:10 at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:328)
12:44:10 at com.griddynamics.jagger.storage.fs.hdfs.HDFSClientBean.close(HDFSClientBean.java:74)
12:44:10 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
12:44:10 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
12:44:10 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
12:44:10 at java.lang.reflect.Method.invoke(Method.java:497)


15:04:55 ERROR 15:04:55,582 [IPC Server handler 1 on 8020] UserGroupInformation | PriviledgedActionException as:macys cause:java.io.IOException: File /user/macys/493/task-2/DURATION/KERNEL--1620237522 [172.17.2.9] could only be replicated to 0 nodes, instead of 1
15:04:55 Aug 15, 2016 3:04:55 PM com.google.common.io.Closeables close
15:04:55 WARNING: IOException thrown while closing Closeable.
15:04:55 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/macys/493/task-2/DURATION/KERNEL--1620237522 [172.17.2.9] could only be replicated to 0 nodes, instead of 1
15:04:55 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
15:04:55 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
15:04:55 at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
15:04:55 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
15:04:55 at java.lang.reflect.Method.invoke(Method.java:597)
15:04:55 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
15:04:55 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
15:04:55 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
15:04:55 at java.security.AccessController.doPrivileged(Native Method)
15:04:55 at javax.security.auth.Subject.doAs(Subject.java:396)
15:04:55 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
15:04:55 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
15:04:55
15:04:55 at org.apache.hadoop.ipc.Client.call(Client.java:1070)
15:04:55 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
15:04:55 at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
15:04:55 at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
15:04:55 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
15:04:55 at java.lang.reflect.Method.invoke(Method.java:597)
15:04:55 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
15:04:55 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
15:04:55 at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
15:04:55 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3510)
15:04:55 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3373)
15:04:55 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2589)
15:04:55 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2829)
15:04:55 Aug 15, 2016 3:04:55 PM com.google.common.io.Closeables close
[8/15/16, 6:29:28 PM] Roman Kishchenko: 15:04:58 ERROR 15:04:58,411 [IPC Server handler 6 on 8020] UserGroupInformation | PriviledgedActionException as:macys cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/macys/493/task-2/DURATION/KERNEL--1620237522 [172.17.2.9] for DFSClient_1558558704 on client 172.17.2.9 because current leaseholder is trying to recreate file.
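
To illustrate what more informative handling could look like, here is a minimal sketch in plain Java. It is not Jagger's or Hadoop's actual code: the class `CloseLogging`, its methods, and the `resultsAtRisk` flag are all hypothetical names introduced for this example, and it assumes the caller knows whether the data was already persisted before the close. The idea is to log one summary line that states the operation, the root cause, and the likely impact on test results, with the full stack trace attached.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for this example; not part of Jagger or Hadoop. */
final class CloseLogging {
    private static final Logger LOG = Logger.getLogger(CloseLogging.class.getName());

    private CloseLogging() {}

    /** Walks the cause chain down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /** Builds a one-line summary: the operation, the root cause, and the assumed impact. */
    static String describe(String operation, Throwable failure, boolean resultsAtRisk) {
        Throwable root = rootCause(failure);
        String impact = resultsAtRisk
                ? "test results may be incomplete"
                : "test results should not be affected";
        return operation + " failed: " + root.getClass().getName()
                + (root.getMessage() != null ? " (" + root.getMessage() + ")" : "")
                + "; " + impact;
    }

    /**
     * Closes the resource and returns true on success. On failure, logs the
     * summary line together with the full stack trace and returns false.
     * The resultsAtRisk flag is an assumption supplied by the caller.
     */
    static boolean closeAndReport(Closeable resource, String operation, boolean resultsAtRisk) {
        try {
            resource.close();
            return true;
        } catch (IOException e) {
            Level level = resultsAtRisk ? Level.SEVERE : Level.WARNING;
            LOG.log(level, describe(operation, e, resultsAtRisk), e);
            return false;
        }
    }
}
```

A call such as `CloseLogging.closeAndReport(fileSystem, "closing HDFS client at shutdown", false)` would then produce a warning that names `java.io.EOFException` as the root cause and states the expected impact, instead of only a bare stack trace.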


jagger_master.log
Attaching master log with java.io.EOFException at 15:12:10,683


tanoshko commented Jan 4, 2018

jagger_master.log
Attaching master log with java.io.EOFException at 15:12:10,683

A newer log: jagger_eof.log

Configuration: 2 kernels; the Jagger properties file is attached below.
zeus.environment_sharable.properties.txt
