Post by Rong-en Fan
Post by stack
Post by Rong-en Fan
I did so. I even ran rm -rf on dfs's dir and did namenode -format
before starting my dfs. hadoop fsck reports the default replication
is 1, but the avg. block replication is 2.9x after I wrote some data into
hbase. The underlying dfs is used only by hbase; no other apps are on it.
What if you add a file using './bin/hadoop fs ....' -- i.e. don't have
hbase in the mix at all -- does the file show as replicated?
It shows a replication of 1.
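The direct check described above can be run as follows (paths are illustrative; assumes a running dfs):

```shell
# write a file straight into dfs, bypassing hbase entirely
./bin/hadoop fs -put /etc/hosts /test/hosts

# report the per-file replication that dfs actually applied
./bin/hadoop fsck /test/hosts -files
```

If fsck reports the customized replication here but hbase-written files show the default, the problem is on the hbase client side, not in dfs.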
Post by stack
If you copy your hadoop-conf.xml to $HBASE_HOME/conf, does it then do the
right thing? Maybe what's happening is that when hbase writes files, we're using the defaults.
Yes, I can verify it: by doing so, HBase respects my customized config. Shall I file
a JIRA against HBase or Hadoop itself?
When HBase was in hadoop/contrib, the hbase script added both HADOOP_CONF_DIR
and HBASE_CONF_DIR to the CLASSPATH, so that dfs's configuration could be loaded
correctly. However, after moving out of hadoop/contrib, it only adds HBASE_CONF_DIR.
I can think of several possible solutions:
1) Set HADOOP_CONF_DIR in hbase-env.sh, then add HADOOP_CONF_DIR to the
CLASSPATH as before.
2) Instruct users to create links to hadoop-*.xml if they want to
customize some dfs settings.
3) If only a small set of dfs confs are relevant to dfs's client, maybe
they can be set via hbase-site.xml, and hbase sets these for us when
creating a FileSystem obj.
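A minimal sketch of option 1, assuming the usual script layout (HADOOP_CONF_DIR and hbase-env.sh are real, but the install path is an example and the exact classpath-assembly line in bin/hbase may differ by version):

```shell
# hbase-env.sh: point HBase at the Hadoop config directory
# (example path; substitute your own Hadoop install)
export HADOOP_CONF_DIR=/usr/local/hadoop/conf

# bin/hbase: prepend the Hadoop conf dir to the classpath,
# as the old contrib script did (sketch, not the exact line)
if [ -n "$HADOOP_CONF_DIR" ]; then
  CLASSPATH="${HADOOP_CONF_DIR}:${CLASSPATH}"
fi
```

With the conf dir on the classpath, the client-side Configuration picks up hadoop-site.xml and hence the customized dfs.replication.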
Post by Rong-en Fan
Post by stack
Post by Rong-en Fan
Hmm... as far as I understand the hadoop FileSystem, you can
specify the # of replicas when creating a file. But I did not find hbase
using it, correct?
We don't do it explicitly, but as I suggested above, we're probably using the
defaults instead of your custom config.
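For reference, the FileSystem API does let a client pass a replication factor explicitly at create time; a sketch (assumes the Hadoop jars on the classpath; the path and replication value are examples, not anything hbase actually does):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        // Configuration loads hadoop-*.xml from the classpath; if
        // HADOOP_CONF_DIR is not on the classpath, dfs.replication
        // silently falls back to the compiled-in default.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Explicit per-file replication (example value of 2);
        // this overrides whatever dfs.replication says.
        short replication = 2;
        FSDataOutputStream out =
            fs.create(new Path("/tmp/example"), replication);
        out.close();
    }
}
```

Since hbase calls the create overloads without a replication argument, it inherits whatever Configuration resolved, which is why getting the conf dir onto the classpath matters.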