HBaseRPC errors after hbase drive crashes. #140

Open
jonbonazza opened this issue Jun 23, 2016 · 1 comment

@jonbonazza

This isn't really an issue with asynchbase so much as a request for support with a problem we are seeing in our dev environment when using this library. I apologize if this isn't the best place to put this. If it isn't, please feel free to close it and point me in the right direction. Thanks.

Anyway, we experienced a drive crash on one of our HBase nodes, and since then we started getting some expected timeout errors and such. Nothing out of the ordinary. Once we recovered the drives, however, we restarted our HBase clients, and now we see the following errors any time we try to access HBase:
ERROR [2016-06-23 17:31:56,779] org.hbase.async.HBaseRpc: Receieved a timeout handle HashedWheelTimeout(deadline: 3979557 ns ago, task: org.hbase.async.HBaseRpc$TimeoutTask@1a92f792) that doesn't match our own org.hbase.async.HBaseRpc$TimeoutTask@1a92f792
ERROR [2016-06-23 17:31:56,780] org.hbase.async.RegionClient: Removed the wrong RPC null when we meant to remove Exists(table=, key=, family=null, qualifiers=null, attempt=12, region=RegionInfo(table=, region_name=",,1456448830516.172da4965e60d3186998c6e07af2f6c0.", stop_key=""))
WARN [2016-06-23 17:31:56,791] org.hbase.async.HBaseClient: Probe Exists(table=, key=, family=null, qualifiers=null, attempt=0, region=RegionInfo(table=, region_name=",,1456448830516.172da4965e60d3186998c6e07af2f6c0.", stop_key="")) failed
! org.hbase.async.RpcTimedOutException: RPC ID [12] timed out waiting for response from HBase on region client [RegionClient@1325442575(chan=[id: 0x90e43262, / => ], #pending_rpcs=0, #batched=0, #rpcs_inflight=0) ] for over 15000ms
! at org.hbase.async.HBaseRpc$TimeoutTask.run(HBaseRpc.java:618) [metrics-drop.jar:3.0.1]
! at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:556) [metrics-drop.jar:3.0.1]
! at org.jboss.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:632) [metrics-drop.jar:3.0.1]
! at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:369) [metrics-drop.jar:3.0.1]
! at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [metrics-drop.jar:3.0.1]
! at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
...
...
...

Does this mean our data is corrupted?

Is there some way to recover from this?

Luckily this occurred in our dev env and not in production, so we have a little more liberty to play with the data, but we'd like to understand exactly what is going on in case this ever does occur in production.

Thanks in advance.

@manolama
Member

Hm, no, this particular exception means that the HBase server wasn't answering RPCs quickly enough. These should disappear once HBase is running in a healthy manner, but if you keep seeing the timeouts, check whether RPCs are sitting in the HBase queues for a long time (look at process call time, etc.).
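
To check whether the timeouts really do go away once HBase is healthy again, one option is to count them in the client log over time. Below is a minimal sketch using only the Java standard library. `TimeoutLogScan` and its methods are hypothetical helpers, not part of asynchbase, and the regular expression assumes the message format shown in the log earlier in this thread ("RPC ID [N] timed out ... for over Mms"). The sample lines in `main` are abbreviated versions of that log.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of asynchbase): scans client log lines for RPC timeouts. */
public class TimeoutLogScan {
    // Assumed message format, taken from the log in this thread:
    // "RpcTimedOutException: RPC ID [12] timed out ... for over 15000ms"
    private static final Pattern TIMEOUT = Pattern.compile(
        "RpcTimedOutException: RPC ID \\[(\\d+)\\] timed out.*for over (\\d+)ms");

    /** Counts the lines that report an RPC timeout. */
    public static long countTimeouts(List<String> lines) {
        return lines.stream().filter(l -> TIMEOUT.matcher(l).find()).count();
    }

    /** Returns the timeout in milliseconds from the first matching line, if any. */
    public static OptionalLong firstTimeoutMillis(List<String> lines) {
        for (String line : lines) {
            Matcher m = TIMEOUT.matcher(line);
            if (m.find()) {
                return OptionalLong.of(Long.parseLong(m.group(2)));
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Abbreviated sample lines; real input would come from the client log file.
        List<String> sample = Arrays.asList(
            "WARN [2016-06-23 17:31:56,791] org.hbase.async.HBaseClient: Probe Exists(...) failed",
            "! org.hbase.async.RpcTimedOutException: RPC ID [12] timed out waiting for response"
                + " from HBase on region client [...] for over 15000ms");
        System.out.println("timeouts=" + countTimeouts(sample));
        System.out.println("timeoutMs=" + firstTimeoutMillis(sample));
    }
}
```

If the count keeps growing after the cluster has recovered, that points toward the server-side queue and process call time metrics rather than the client.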
