Discussion:
Region servers up and running, but Master reports 0
Matthew LeMieux
2010-08-23 23:22:33 UTC
I have a cluster of 3 machines running Cloudera's distribution (CDH3), with the NameNode on a separate machine from the HMaster. I had been running it successfully for a couple of weeks. As of this morning, it is completely unusable. I'm looking for some help on how to fix it. Details are below. Thank you.

This morning I found HBase to be unresponsive and tried restarting it. That didn't help. For example, running "hbase shell" and then "list" hangs.

The master and region server processes start up, but the master does not recognize that the region servers are there. I am getting the following in the master's log file:

2010-08-23 23:04:16,100 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:05:16,110 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:06:16,120 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:07:16,130 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:08:16,140 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:09:16,146 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:10:16,150 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:11:16,160 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2010-08-23 23:12:16,170 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN


Meanwhile, the region servers show this in their log files:

2010-08-23 23:05:21,006 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server zookeeper:2181
2010-08-23 23:05:21,028 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to zookeeper:2181, initiating session
2010-08-23 23:05:21,168 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server zookeeper:2181, sessionid = 0x12aa0cc2520000e, negotiated timeout = 40000
2010-08-23 23:05:21,172 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: SyncConnected, type: None, path: null
2010-08-23 23:05:21,177 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Set watcher on master address ZNode /hbase/master
2010-08-23 23:05:21,421 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got master:60000
2010-08-23 23:05:21,421 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at master:60000 that we are up
2010-08-23 23:05:22,056 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020

The Region server process is obviously waiting on something:

/tmp/hbaselog$ sudo strace -p7592
Process 7592 attached - interrupt to quit
futex(0x7f65534739e0, FUTEX_WAIT, 7602, NULL

The Master isn't idle; it appears to be doing some sort of recovery, having woken up to find 0 region servers. Here is an excerpt from its log:

2010-08-23 23:10:06,290 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12142 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435, length=1150
2010-08-23 23:10:06,290 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
2010-08-23 23:10:06,510 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=3 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12143 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451, length=448
2010-08-23 23:10:06,513 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
2010-08-23 23:10:06,721 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=2 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12144 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468, length=582
2010-08-23 23:10:06,723 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468


It looks like the Master is sequentially going through logs up to 143261, having started at 1 and is currently at 12144. At the current rate, it will take around 12 hours to complete. Do I have to wait for it to complete before the master will recognize the region servers? If it doesn't have any region servers, then what the heck is the master doing anyway?

Thank you for your help,

Matthew
Jean-Daniel Cryans
2010-08-23 23:30:55 UTC
Matthew,

Yes, the master won't start until the log splitting is done. By the
looks of it (such a high number of logs, when the maximum is normally 32
before the region server force-flushes so old hlogs can be cleared), it
seems that in a previous run your region servers weren't able to ensure
that each HLog had 3 replicas, even though dfs.replication was set to
that number. Check your previous region server logs; you should see
instances of:

LOG.warn("HDFS pipeline error detected. " +
    "Found " + numCurrentReplicas + " replicas but expecting " +
    this.initialReplication + " replicas. " +
    " Requesting close of hlog.");

And that probably produced 1 log per insert. If you're still testing and
your data isn't important, feel free to kill -9 the master and nuke
the HBase HDFS folder, or at least the .logs folder. If not, you'll
have to wait. Once it's done, make sure you have a proper
configuration before any other testing.

J-D
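
For anyone who wants to check this on their own cluster, a minimal sketch using only the plain Hadoop FileSystem API is below. It walks the WAL directory and reports any log file whose current replication is below the configured dfs.replication. The /hbase/.logs path assumes the default HBase root directory, and the class name is made up for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckLogReplication {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    short expected = (short) conf.getInt("dfs.replication", 3);
    FileSystem fs = FileSystem.get(conf);
    // /hbase/.logs holds one directory per region server, each full of HLogs.
    for (FileStatus server : fs.listStatus(new Path("/hbase/.logs"))) {
      for (FileStatus log : fs.listStatus(server.getPath())) {
        if (log.getReplication() < expected) {
          System.out.println(log.getPath() + ": " + log.getReplication()
              + " replicas, expected " + expected);
        }
      }
    }
  }
}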
Matthew LeMieux
2010-08-23 23:50:43 UTC
J-D,

Thank you for the very fast response. I'm running in the Amazon cloud and would like to be able to expand and shrink the cluster quickly to support Hadoop jobs, so I've been playing with the replication factor and the number of machines.

If I add 5 more machines for a job I'm running overnight that needs some extra power, then removing those machines the next morning becomes a tedious task, because some of the data generated may reside exclusively on the 5 new machines. Since the HDFS node decommission process takes so long, I've been setting the replication factor high and removing machines incrementally. For the most part it works great, because replication is much faster than decommissioning. Hadoop M/R jobs and HDFS seem completely content to deal with nodes coming and going as long as there is at least 1 replica to copy from. Since the Hadoop M/R jobs pull data from HBase, do some processing, and then put the data back into HBase, I assumed that hosting the Hadoop M/R and HBase on the same HDFS would be faster than copying the data to a new transient cluster.

Is there a way to tell HBase that I want the replication factor to be high, but not to panic (i.e., go crazy with logs) if it isn't actually at that number? Sort of like a soft limit and a hard limit...

Thank you,

Matthew

P.S. I do need the data; I wish there were a way to speed this up. I didn't find any evidence of a crash, just a looooong recovery.
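
There is no such soft/hard knob in HBase itself: the warning quoted above fires whenever the write pipeline reports fewer replicas than the replication the HLog was created with. One hedged workaround sketch, using only the plain FileSystem API rather than anything HBase-specific, is to leave dfs.replication at the value the WAL should see and lower the replication per path on data that is already written (the equivalent of `hadoop fs -setrep -R`); the path and target below are purely illustrative.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LowerReplication {
  // Recursively lower the replication of existing files under a directory.
  static void setrep(FileSystem fs, Path dir, short rep) throws Exception {
    for (FileStatus s : fs.listStatus(dir)) {
      if (s.isDir()) {
        setrep(fs, s.getPath(), rep);
      } else if (s.getReplication() != rep) {
        fs.setReplication(s.getPath(), rep);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path and target replication, for illustration only.
    setrep(fs, new Path("/hbase/mytable"), (short) 2);
  }
}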
Daniel Einspanjer
2010-08-24 03:30:25 UTC
Matthew,

Maybe instead of changing the replication factor, you could spin up the new nodes with a different datacenter/rack configuration, which would cause Hadoop to ensure that the replicas are not stored solely on those temporary nodes?
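
For illustration of this suggestion: rack placement in Hadoop 0.20/CDH3 is driven either by a topology script (topology.script.file.name) or by a Java class implementing DNSToSwitchMapping, registered under topology.node.switch.mapping.impl. A minimal, hypothetical sketch of the latter, pinning a fixed set of permanent nodes to one rack and everything else to a transient rack, might look like the following (the addresses and rack names are made up):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.net.DNSToSwitchMapping;

// Hypothetical rack mapper: permanent nodes map to one rack, everything else
// (the transient nodes) to another, so HDFS keeps at least one replica of
// every block on the permanent "rack".
public class PermanentTransientMapping implements DNSToSwitchMapping {
  private static final Set<String> PERMANENT = new HashSet<String>(
      Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"));  // made-up addresses

  public List<String> resolve(List<String> names) {
    List<String> racks = new ArrayList<String>(names.size());
    for (String name : names) {
      racks.add(PERMANENT.contains(name) ? "/rack-permanent" : "/rack-transient");
    }
    return racks;
  }
}

The same mapping can equally be expressed as a shell topology script; the class form is shown here only for concreteness.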

Matthew LeMieux
2010-08-24 14:25:57 UTC
That is a very good point. I had only been considering the data locality features of rack configuration, not the fault tolerance aspect. I'll give that a try. Thank you.

-Matthew
Matthew LeMieux
2010-08-25 15:05:46 UTC
For those who are curious, using rack awareness to speed up the process of adding and removing nodes did not work in my experiment.

I set up a primary rack and a transient rack. The primary rack consists of permanent machines that always stay in the cluster, while the transient rack is meant to come and go as more processing power is needed.

Once the extra rack was no longer needed, HDFS was using up time and bandwidth duplicating the data on the primary rack over to the transient rack, rather than replicating the data on the transient rack over to the primary rack. With the alternative method of putting all machines in one rack, the data on the permanent machines does not need to be copied over to the transient machines, making the replication process much faster.

At this point, I'm considering running two separate clusters: a permanent HBase cluster and a transient M/R cluster. I could use distcp to quickly copy the data between clusters. I've seen examples of bulk loading a table using HFileOutputFormat. But once I've copied the files over to the M/R cluster, is there a way to read them in, i.e., is there an HFileInputFormat equivalent?

FYI,

Matthew
Ted Yu
2010-08-25 16:34:38 UTC
See my reply under 'importing HFiles' subject.
Stack
2010-08-25 17:29:24 UTC
Post by Matthew LeMieux
For those who are curious, using rack awareness to speed up the process of adding and removing nodes did not work in my experiment.
Once the extra rack was no longer needed, HDFS was using up time and bandwidth to duplicate the data on the primary rack over to the transient rack rather than replicating the data on transient rack over to the primary rack.  With the alternative method of putting all machines in one rack, the data on the permanent machines does not need to be copied over to the transient machines, making the process of replication much faster.
Would adding a custom script for rack awareness help here?
Post by Matthew LeMieux
But, once I've copied the files over to the M/R cluster, is there a way to read in the files, i.e., is there an HFileInputFormat equivalent?
No. We don't have such a beastie at the moment. It'd be a little
tricky to write, in that it would in essence need to keep the ordering
and merge the content of all the files in the same way as is done inside
the HRegion class that floats in a running HRegionServer (maybe what's
needed is an HRegionInputFormat?).

You'll need to bring up an HBase instance on the M/R cluster if you
want to run M/R against the HBase content. Or just run the M/R across racks?

St.Ack
Matthew LeMieux
2010-08-25 19:03:09 UTC
Rack assignment isn't a problem. The problem with removing the transient rack is really an HDFS problem: how to coax it into copying at least 1 replica of all blocks onto the permanent rack so that the transient rack can be removed from the cluster without getting corrupt blocks ... and to do it "quickly" so that the machine instances can be returned to the cloud...
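
One way to see how much work remains before the transient rack can be dropped is to ask the NameNode where each block's replicas live. A rough sketch (an audit, not a fix) using FileSystem.getFileBlockLocations is below; the permanent hostnames and the /hbase root are placeholders.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FindTransientOnlyBlocks {
  // Hypothetical list of permanent (non-transient) datanode hostnames.
  private static final Set<String> PERMANENT = new HashSet<String>(
      Arrays.asList("perm-1", "perm-2", "perm-3"));

  static void check(FileSystem fs, Path p) throws Exception {
    for (FileStatus s : fs.listStatus(p)) {
      if (s.isDir()) {
        check(fs, s.getPath());
        continue;
      }
      for (BlockLocation loc : fs.getFileBlockLocations(s, 0, s.getLen())) {
        boolean onPermanent = false;
        for (String host : loc.getHosts()) {
          if (PERMANENT.contains(host)) { onPermanent = true; break; }
        }
        if (!onPermanent) {
          System.out.println(s.getPath() + " has a block only on transient nodes");
          break;
        }
      }
    }
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    check(fs, new Path("/hbase"));  // assumes the default HBase root
  }
}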

The use model is threefold (I would venture to guess I'm not the only one trying to do this sort of thing):

1) to have HBase answer queries in real-time from clients (i.e., analysts and web servers)
2) write data (i.e., from a data stream, or from the web servers)
3) run processing jobs on that data (i.e., M/R job), that pull some data out, play with it, and then send some updates (puts) back

I was hoping to accomplish 1 and 2 with "permanent" machine instances, and accomplish 3 with "transient" machine instances that are only part of the cluster for the time the processing jobs are running.

I'm learning that 3 should probably be a completely separate "transient" cluster. It looks like the fastest way to get the data out of the permanent cluster is to run a job on the transient cluster using a TableInputFormat. That way, the transient cluster doesn't need to run an HMaster or HRegionServer; it just needs the HBase libraries and would be a client only.

Is this what you were suggesting?

-Matthew
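
To make the client-only setup described above concrete, here is a rough sketch of a job on the transient cluster reading from the permanent cluster through TableInputFormat (which TableMapReduceUtil wires up under the hood). It is written against the HBase 0.20-era mapreduce API; the ZooKeeper quorum host, table name, and output path are placeholders.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteScanJob {
  // Trivial mapper: emits the row key and the number of cells in the row.
  static class RowMapper extends TableMapper<Text, Text> {
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(new Text(row.get()), new Text(String.valueOf(value.size())));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new HBaseConfiguration();
    // Point the HBase client at the permanent cluster's ZooKeeper ensemble.
    conf.set("hbase.zookeeper.quorum", "zk.permanent.example.com");
    Job job = new Job(conf, "scan-permanent-cluster");
    job.setJarByClass(RemoteScanJob.class);
    Scan scan = new Scan();
    scan.setCaching(500);  // larger scanner batches, fewer RPCs per mapper
    TableMapReduceUtil.initTableMapperJob("mytable", scan, RowMapper.class,
        Text.class, Text.class, job);
    job.setNumReduceTasks(0);  // map-only
    FileOutputFormat.setOutputPath(job, new Path("/tmp/scan-output"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}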