HBase, mail # user - hbase replication and dfs replication?


Jason Huang 2013-06-27, 14:04
Dave Wang 2013-06-27, 14:26
Re: hbase replication and dfs replication?
Jason Huang 2013-06-27, 14:27
makes a lot of sense.

thanks Dave,

Jason

On Thu, Jun 27, 2013 at 10:26 AM, Dave Wang <[EMAIL PROTECTED]> wrote:

> Jason,
>
> HBase replication is for replicating between two HBase clusters, as you state.
>
> What you are seeing is merely the expected behavior within a single
> cluster.  DFS replication is not involved directly here - the shell ends up
> acting like any other HBase client and constructing the scan the same way
> (i.e. finding the right region servers to do the scan by contacting ZK,
> region server serving .META., issuing the scan requests to the proper RSes,
> etc.).  It doesn't matter where you are running the client from.
>
> There is no "replicating HBase tables" within the same cluster - you're
> just accessing the same table from different clients.
>
> Hope this helps,
>
> - Dave
>
>
> On Thu, Jun 27, 2013 at 7:04 AM, Jason Huang <[EMAIL PROTECTED]>
> wrote:
>
> > Hello,
> >
> > I am a bit confused about how the configurations of hbase replication and
> > dfs replication work together.
> >
> > My application deploys on an HBase cluster (0.94.3) with two Region
> > servers. The two hadoop datanodes run on the same two Region servers.
> >
> > Because we only have two datanodes, dfs.replication was set to 2.
> >
> > The person who configured the small cluster didn't explicitly set the
> > hbase replication configs, which include:
> >
> > (1) in ${HBASE_HOME}/conf/hbase-site.xml, hbase.replication is not set. I
> > think the default value is "false" according to
> > http://hbase.apache.org/replication.html.
> >
> > (2) in the table, Replication_Scope is set to 0 (by default).
> >
> > However, even without setting hbase.replication and replication_scope, it
> > appears that the tables are duplicated in the two Region servers (as I
> > can go to the shells of these two region servers and find the duplicate
> > rows from a scan).
> >
> > My question is - does the default dfs replication take care of
> > replicating hbase tables within the same cluster so we don't need to set
> > up the hbase replication configs? And only when we need to replicate hbase
> > from one cluster to another cluster should we set up the hbase replication
> > configs (1) and (2) above?
> >
> > thanks!
> >
> > Jason
> >
>
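
For anyone who finds this thread later, here is a rough toy model of Dave's point. It is not HBase code and uses no HBase API; every class, method and host name in it is made up for illustration. It assumes two regions, two region servers and dfs.replication = 2, roughly as in Jason's setup. Each region is served by exactly one region server, the lookup step stands in for the ZK/.META. lookup, and the machine the client runs on plays no part in routing, so a scan returns the same rows from either machine.

```python
import bisect

# Toy model only: all names are hypothetical and none of this is HBase API.


class ToyCluster:
    def __init__(self, region_start_keys, region_servers, datanodes, dfs_replication):
        # Each region covers [start_key, next_start_key) and is served by
        # exactly one region server at a time.
        self.start_keys = sorted(region_start_keys)
        self.assignment = {
            key: region_servers[i % len(region_servers)]
            for i, key in enumerate(self.start_keys)
        }
        self.rows = {key: {} for key in self.start_keys}
        # In this model dfs_replication only decides how many datanodes hold a
        # copy of the underlying file blocks; it creates no extra table copies.
        self.block_holders = datanodes[:dfs_replication]

    def locate(self, row_key):
        """Stand-in for the ZK/.META. lookup: find the region and its server."""
        idx = bisect.bisect_right(self.start_keys, row_key) - 1
        start = self.start_keys[idx]
        return start, self.assignment[start]

    def put(self, row_key, value):
        start, _ = self.locate(row_key)
        self.rows[start][row_key] = value

    def scan(self, client_host):
        """Scan the whole table; client_host is deliberately unused in routing."""
        result = []
        for start in self.start_keys:
            server = self.assignment[start]
            for row_key in sorted(self.rows[start]):
                result.append((row_key, self.rows[start][row_key], server))
        return result


cluster = ToyCluster(["", "m"], ["rs1", "rs2"], ["dn1", "dn2"], dfs_replication=2)
cluster.put("apple", 1)
cluster.put("zebra", 2)

# Running the "shell" on either machine reaches the same single table.
scan_from_rs1 = cluster.scan(client_host="rs1")
scan_from_rs2 = cluster.scan(client_host="rs2")
```

The two scans compare equal because there is only one table, not because anything was copied between region servers. The block_holders list is the only place dfs_replication shows up, which mirrors the idea that DFS replication sits underneath the table and is not directly involved in the scan.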