Discussion:
DFS stability running HBase and dfs.datanode.handler.count...
Andy Sautins
2011-04-09 23:35:05 UTC
I ran across a mailing list posting from 1/4/2009 that seemed to indicate increasing dfs.datanode.handler.count could help improve DFS stability (http://mail-archives.apache.org/mod_mbox/hbase-user/200901.mbox/%3C49605FE0.9040509-+c050i++PmesTnJN9+***@public.gmane.org%3E ). The posting seems to indicate the wiki was updated, but I don't see anything in the wiki about increasing dfs.datanode.handler.count. I have seen a few other notes that show examples of raising dfs.datanode.handler.count, including an Intel article (http://software.intel.com/en-us/articles/hadoop-and-hbase-optimization-for-read-intensive-search-applications/ ) and the Pro Hadoop book, but beyond that the only other mention I see is from Cloudera, which seems lukewarm on increasing dfs.datanode.handler.count (http://www.cloudera.com/blog/2009/03/configuration-parameters-what-can-you-just-ignore/ ).

Given the post is from 2009, I thought I'd ask whether anyone has had success improving the stability of HBase/DFS by increasing dfs.datanode.handler.count. The specific error we are seeing somewhat frequently (a few hundred times per day) in the datanode logs is as follows:

2011-04-09 00:12:48,035 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.18.0.33:50010, storageID=DS-1501576934-10.18.0.33-50010-1296248656454, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_-163126943925471435_28809750 is not valid.

The above seems to correspond to ClosedChannelExceptions in the HBase regionserver logs, as well as some warnings about long writes to the HLog (some taking 50+ seconds).

The biggest end-user-facing issue we are seeing is that TaskTrackers keep getting blacklisted. It's quite possible our problem is unrelated to HBase, but I thought it was worth asking given what we've been seeing.

We are currently running 0.91 on an 18-node cluster with ~3k total regions, and each region server is running with 2G of memory.

Any insight would be appreciated.

Thanks

Andy
Stack
2011-04-10 19:15:59 UTC
Did you try upping it, Andy? Andrew Purtell's recommendation, though
old, would have come from experience. The Intel article reads like sales,
but there is probably merit to its suggestion. The Cloudera article
is more unsure about the effect of upping handlers, though it allows that
it "...could be set a bit higher."

I just looked at our prod frontend and it's still set to 3. I don't
see your exceptions in our DN log.

What version of Hadoop? You say HBase 0.91. Do you mean 0.90.1?

Are ulimit and nproc set sufficiently high for the hadoop/hbase user?

If you grep for 163126943925471435_28809750 in the namenode log, do you
see a delete occur before a later open?

St.Ack

Andy Sautins
2011-04-14 14:24:41 UTC
Thanks for the response, Stack. Yes, we tried increasing dfs.datanode.handler.count to 8. At this point I would say it didn't seem to resolve the issue we are seeing, but it also doesn't seem to be hurting anything, so for right now we're going to leave it at 8 while we continue to debug.
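
For reference, the change itself is just the following property in hdfs-site.xml on each datanode, plus a datanode restart for it to take effect (8 being the value we tried):

  <property>
    <name>dfs.datanode.handler.count</name>
    <value>8</value>
  </property>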

In regard to the original error I posted (Block 'x' is not valid), we have chased that down thanks to your suggestion of looking at the logs for the history of the block. It _looks_ like our 'is not valid' block errors are unrelated and are due to chmod'ing or deleting mapreduce output directories directly after a run. We are still isolating that, but it looks like it's not HBase related, so I'll move that to another list. Thank you very much for your debugging suggestions.

The one issue we are still seeing is that we will occasionally have a regionserver die with the following exception. I need to chase that down a little more, but it seems similar to a post from 2/13/2011 (http://www.mail-archive.com/user-50Pas4EWwPEyzMRdD/***@public.gmane.org/msg05550.html ) that I'm not sure was ever resolved. If anyone has any insight on how to debug the following error a little more, I would appreciate any thoughts you might have.

2011-04-14 06:05:13,001 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /user/hbase/.logs/hd10.dfs.returnpath.net,60020,1302555127291/hd10.dfs.returnpath.net%3A60020.1302781635921 : java.io.IOException: Error Recovery for block blk_1315316969665710488_29842654 failed because recovery from primary datanode 10.18.0.16:50010 failed 6 times. Pipeline was 10.18.0.16:50010. Aborting...
java.io.IOException: Error Recovery for block blk_1315316969665710488_29842654 failed because recovery from primary datanode 10.18.0.16:50010 failed 6 times. Pipeline was 10.18.0.16:50010. Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2841)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2305)

Other than the above exception causing a region server to die occasionally, everything seems to be working well.

Note we have now upgraded to Cloudera CDH3 Update 0 (hadoop 0.20.2+923.21 and hbase 0.90.1+15.18) and still see the above exception. We do have ulimit set (memory unlimited and open files 32k) for the user running hbase.

Thanks again for your help

Andy

Jean-Daniel Cryans
2011-04-14 17:20:07 UTC
This is probably a red herring: for example, if the region server had a
big GC pause, then the master could have already split the log and the
region server wouldn't be able to close it (that's our version of IO
fencing). So from that exception, look back in the log and see if
there's anything like:

INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have
not heard from server in some_big_number ms

J-D

Ben Aldrich
2011-04-14 17:23:17 UTC
Just to chime in here: the other thing we changed was our max_file_size,
which is now set to 2GB instead of 512MB. This could be causing long
compaction times. If a compaction takes too long, the region server won't
respond and can be marked as dead. I have had this happen on my dev cluster
a few times.
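
For reference, I believe the property in play here is hbase.hregion.max.filesize
in hbase-site.xml (values are in bytes), so the change would look roughly like:

  <property>
    <name>hbase.hregion.max.filesize</name>
    <!-- 2GB; previously 536870912 (512MB) -->
    <value>2147483648</value>
  </property>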

-Ben
Jean-Daniel Cryans
2011-04-14 17:26:21 UTC
Ben, the compaction is done in a background thread; it doesn't block
anything. Now, if you had a heap close to 2GB, you could easily run
into issues.

J-D
Ben Aldrich
2011-04-14 17:31:57 UTC
Our heap size is set to 2GB; I think my dev issue was because I was running
things off of a few VMs. Even though the compaction is in another thread, it
would still fail to respond during major compaction.

-Ben
Jean-Daniel Cryans
2011-04-14 17:34:53 UTC
Then this would be a heap limitation problem; have a look at your GC
log during the compaction.

J-D
Andy Sautins
2011-04-18 16:33:15 UTC
J-D, thanks for the response. I suspect you are right that the error message was a red herring. However, I don't see "Client session timed out" in any of our logs from when the region server dies, so I'm not sure it is the situation you are thinking of. Note that we are going to enable GC logging to get an idea of how long we pause in GC in our regionservers. We also set hbase.hregion.memstore.mslab.enabled to true in hbase-site.xml, which seems to have helped, although I don't have many data points yet to confirm. Before we enabled MSLAB we had region servers die every day or two; after enabling MSLAB we just went 4 days without a regionserver dying, which is better.
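
For completeness, the MSLAB change amounts to just this in our hbase-site.xml:

  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>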

Following up on the latest regionserver that died, it seems to have died because it couldn't close the HLog, which sounds like the situation you mentioned in your response, although I'm not sure I fully understand the IO fencing. The stack traces are below.

Are there any configuration parameters that could help keep the regionserver from dying when it encounters this situation? We generally just restart the regionserver and it recovers nicely, but it would be ideal not to have to log in and restart regionservers because of this.

At this point it sounds like my best bet is to work out whether we are still seeing big GC pauses even with MSLAB enabled, to determine whether that is our problem.

Thanks for your help

Andy



2011-04-18 02:04:12,862 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: Error Recovery for block blk_-4878958038737199123_30707596 failed because recovery from primary datanode 10.18.0.28:50010 failed 6 times. Pipeline was 10.18.0.28:50010. Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2841)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2305)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2477)
2011-04-18 02:04:12,862 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: Error Recovery for block blk_-4878958038737199123_30707596 failed because recovery from primary datanode 10.18.0.28:50010 failed 6 times. Pipeline was 10.18.0.28:50010. Aborting...

The above exception seems to correlate to the following error on the datanode.

2011-04-18 02:04:08,773 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_-4878958038737199123_30707596) from datanode (=10.18.0.28:50010)
java.io.IOException: Block blk_-4878958038737199123_30711904 length is 45855569 does not match block file length 45853290
at org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1661)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.startBlockRecovery(FSDataset.java:1921)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startBlockRecovery(DataNode.java:1658)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1772)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1930)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1411)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1409)
2011-04-18 02:04:08,774 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020, call recoverBlock(blk_-4878958038737199123_30707596, false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@2cda6e31) from 10.18.0.28:52651: error: java.io.IOException: All datanodes failed: block=blk_-4878958038737199123_30707596, datanodeids=[10.18.0.28:50010]

And the block does look like it is associated with a log file:

2011-04-18 01:37:17,661 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/hbase/.logs/hd28.dfs.returnpath.net,60020,1302812828956/hd28.dfs.returnpath.net%3A60020.1303112237535. blk_-4878958038737199123_30707596




Jean-Daniel Cryans
2011-04-18 16:44:32 UTC
You can confirm whether it's really a GC issue by taking a look at the
master log: if the log splitting started before the errors in the
region server log, then something happened to the region server. It's
usually GC, but it could be something else.

The fact that mslab helped does point to a GC issue.

Regarding configurations, you can tweak zookeeper.session.timeout to
set the timeout higher than the default of 180000 ms (3 minutes). It might
also be worth making sure that your region server and ZooKeeper agree
on the same timeout when the region server starts; depending on how
you start ZooKeeper, it might end up being different. The line looks
like this:

17:39:20,189 INFO org.apache.zookeeper.server.NIOServerCnxn:
Established session 0xsome_hex with negotiated timeout 180000 for
client blah_address
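
Raising it would be something like the following in hbase-site.xml (the value
here is only an example, and the server side can still cap what gets
negotiated, e.g. via ZooKeeper's maxSessionTimeout, which is why that log
line is worth checking):

  <property>
    <name>zookeeper.session.timeout</name>
    <!-- example: 5 minutes instead of the default 3 -->
    <value>300000</value>
  </property>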

J-D
