HBase >> mail # user >> hbase replication and dfs replication?


Re: hbase replication and dfs replication?
Jason,

HBase replication is for replicating data between two HBase clusters, as
you state.

What you are seeing is simply the expected behavior within a single
cluster.  DFS replication is not directly involved here - the shell ends up
acting like any other HBase client and constructing the scan the same way
(i.e. contacting ZK, asking the region server serving .META. which region
servers hold the table's regions, issuing the scan requests to those RSes,
etc.).  It doesn't matter where you are running the client from.

There is no "replicating HBase tables" within the same cluster - you're
just accessing the same table from different clients.
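
To make the point concrete, here is a minimal toy model in Python. It is
not the HBase client API: ToyCluster, RegionServer, locate, and the split
key "m" are all hypothetical names invented for illustration. It only
sketches the idea that each row lives in one region served by one region
server, and that a scan returns the same rows no matter which host the
client runs on.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class RegionServer:
    """Hypothetical stand-in for a region server."""
    name: str
    # Rows held by the regions this server currently serves.
    rows: dict = field(default_factory=dict)


class ToyCluster:
    """Toy model of ONE cluster: every region has exactly one serving RS."""

    def __init__(self, split_keys, servers):
        # Region i covers row keys in [split_keys[i-1], split_keys[i]).
        assert len(servers) == len(split_keys) + 1
        self.split_keys = list(split_keys)
        self.servers = list(servers)

    def locate(self, row_key):
        # Plays the role of the .META. lookup: row key -> serving RS.
        return self.servers[bisect_right(self.split_keys, row_key)]

    def put(self, row_key, value):
        # A row is written to exactly one region server.
        self.locate(row_key).rows[row_key] = value

    def scan(self, client_host):
        # client_host is deliberately unused: the region lookup does not
        # depend on where the client is running.
        result = []
        for server in self.servers:
            result.extend(sorted(server.rows.items()))
        return result


rs1, rs2 = RegionServer("rs1"), RegionServer("rs2")
cluster = ToyCluster(split_keys=["m"], servers=[rs1, rs2])
for key in ["apple", "kiwi", "pear"]:
    cluster.put(key, key.upper())

# "Running the shell" on either host goes through the same lookup.
from_rs1 = cluster.scan(client_host="rs1")
from_rs2 = cluster.scan(client_host="rs2")
```

In this sketch from_rs1 and from_rs2 are identical, yet no row is stored on
both servers. The copies that dfs.replication creates sit one layer below,
in the HDFS blocks backing the table's files, and are not modeled here.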

Hope this helps,

- Dave
On Thu, Jun 27, 2013 at 7:04 AM, Jason Huang <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I am a bit confused about how the configurations of hbase replication and
> dfs replication work together.
>
> My application is deployed on an HBase cluster (0.94.3) with two region
> servers. The two hadoop datanodes run on the same two region servers.
>
> Because we only have two datanodes, dfs.replication was set to 2.
>
> The person who configured the small cluster didn't explicitly set the hbase
> replication configs, which include:
>
> (1) in ${HBASE_HOME}/conf/hbase-site.xml, hbase.replication is not set. I
> think the default value is "false" according to
> http://hbase.apache.org/replication.html.
>
> (2) in the table, REPLICATION_SCOPE is set to 0 (by default).
>
> However, even without setting hbase.replication and replication_scope, it
> appears that the tables are duplicated in the two Region servers (as I can
> go to the shells of these two region servers and find the duplicate rows
> from a scan).
>
> My question is - does the default dfs replication take care of replicating
> hbase tables within the same cluster, so we don't need to set up the hbase
> replication configs? And should we set up the hbase replication configs
> (1) and (2) above only when we need to replicate hbase from one cluster to
> another cluster?
>
> thanks!
>
> Jason
>