HBase >> mail # dev >> [question about replication] how to apply delta from Master to Slave after crash ?


Re: [question about replication] how to apply delta from Master to Slave after crash ?
https://issues.apache.org/jira/browse/HBASE-9047

On Fri, Jul 26, 2013 at 12:59 PM, Jean-Daniel Cryans
<[EMAIL PROTECTED]> wrote:
> I guess I didn't explain my ideas clearly.
>
> So, first, replication in HBase is master-push, so you don't want to
> reverse the process. It means that this tool needs to run on the
> master cluster.
>
> Then I don't think you need to specify a timestamp since the
> replication state is in ZK. Basically that tool we're talking about
> would be able to read the replication state of each master region
> server, finish replicating what's missing, and then clear that state
> in zookeeper.
>
> The code that handles replication does most of that already. Check
> ReplicationSourceManager and ReplicationSource. Basically when
> ReplicationSourceManager.init() is called, it will check all the
> queues in ZK and try to grab those that aren't attached to a region
> server. If the whole cluster is down, it will grab all of them.
>
> The beautiful thing here is that you could start that tool on all your
> machines and the load would be spread out. That might not be a big
> concern if replication wasn't lagging, though, since it would then take
> only a few seconds to finish replicating the missing data for each
> region server.
>
> I'll open a jira.
>
> J-D
>
> On Fri, Jul 26, 2013 at 11:50 AM, Demai Ni <[EMAIL PROTECTED]> wrote:
>> JD,
>>
>> yeah. that sounds like what I will need to do. a tool like this:
>> [slave_cluster]$tool_to_syncup master_ZKquorum table_name start_timestamp
>>
>> so two tasks for me:
>> 1) identify the start_timestamp
>> 2) write the tool_to_syncup, which will reach out to master_ZK, copy the
>> HLOGs from master, and replay the HLOGs on Slave.
>>
>> are you aware of some example code for the 2) task that I can leverage?
>> thanks
>>
>> Demai
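
As a rough illustration of the approach J-D describes above (read the replication state of each master region server, grab the queues that aren't attached to a live region server, finish replicating what's missing, then clear that state), here is a minimal in-memory sketch in Python. It is not HBase code and does not call the HBase or ZooKeeper APIs: the plain dict stands in for the replication state kept in ZK, and the names `sync_up`, `state`, `live_servers` and `ship` are hypothetical, chosen only for this example.

```python
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for the replication state stored in ZooKeeper:
# region server name -> HLogs that have not been replicated yet.
ReplicationState = Dict[str, List[str]]


def sync_up(
    state: ReplicationState,
    live_servers: Set[str],
    ship: Callable[[str, str], None],
) -> List[str]:
    """Drain every queue whose region server is not alive.

    Returns the HLog names that were shipped, in shipping order.
    """
    shipped: List[str] = []
    # sorted() copies the keys, so deleting from `state` inside the loop is safe.
    for server in sorted(state):
        if server in live_servers:
            # Queue is still attached to a running region server: leave it alone.
            continue
        queue = state[server]
        while queue:
            hlog = queue[0]
            ship(server, hlog)   # assumption: pushes the edits to the slave cluster
            queue.pop(0)         # drop the log only after it has been shipped
            shipped.append(hlog)
        del state[server]        # clear the state once the queue is empty
    return shipped


if __name__ == "__main__":
    # Whole master cluster is down, so the tool grabs every queue.
    zk_state = {"rs1": ["hlog.1", "hlog.2"], "rs2": ["hlog.3"]}
    done = sync_up(zk_state, live_servers=set(), ship=lambda rs, log: None)
    print(done)      # ['hlog.1', 'hlog.2', 'hlog.3']
    print(zk_state)  # {}
```

Note that no start timestamp is passed in: the queues themselves record what is still missing. The sketch runs in a single process; if the tool were started on several machines to spread the load, claiming a queue would have to be atomic so that two instances never drain the same one, which is outside what this example shows.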