HBase >> mail # user >> confused info about region-regionserver locality


KIM JUN YOUNG 2013-04-04, 10:01
Jean-Marc Spaggiari 2013-04-04, 15:25
lars hofhansl 2013-04-04, 18:04
Jean-Marc Spaggiari 2013-04-04, 18:24
Re: confused info about region-regionserver locality
>> When the write request returns to the client there will be a local
>> copy, a copy on another machine in the same rack, and a copy on a machine
>> in a different rack; who cares about the ordering inside the pipeline?
> Not necessarily. There might not be any additional copy on a different
> machine on the same rack. BUT.. As you said, who cares ;) As long as
> we have the local copy and some replicas.

Really? Doesn't the whole pipeline have to be successful in order to return success to the client?
(I might be confused :) )
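
The all-or-nothing reading behind the question above can be pictured with a toy model. This is only a sketch under a stated assumption: the client sees success only when every node in the pipeline has acknowledged every packet. It is not HDFS code. `pipeline_write` and `is_alive` are hypothetical names, and the model deliberately leaves out whatever HDFS does to recover when a DataNode in the pipeline fails.

```python
def pipeline_write(packets, pipeline, is_alive):
    """Toy model of a pipelined write (hypothetical, not the HDFS implementation).

    packets  -- list of packets to write
    pipeline -- ordered list of node names the packets flow through
    is_alive -- callable(node) -> bool, stands in for node health

    Returns (success, stored), where stored maps node -> packets it holds.
    """
    stored = {node: [] for node in pipeline}
    for packet in packets:
        # Forward pass: each node stores the packet, then hands it downstream.
        for node in pipeline:
            if not is_alive(node):
                # Assumption: a dead node means no ack reaches the client.
                return False, stored
            stored[node].append(packet)
        # Every node stored the packet, so the ack travels back to the client.
    return True, stored


ok, stored = pipeline_write(["p1", "p2"], ["dn1", "dn2", "dn3"], lambda n: True)
print(ok, stored)
```

In this model the order of nodes inside the pipeline does not affect whether the client sees success, which is the point made in the quoted text.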

________________________________
 From: Jean-Marc Spaggiari <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
Sent: Thursday, April 4, 2013 11:24 AM
Subject: Re: confused info about region-regionserver locality
 
>Isn't this done via pipelining anyway?
Yes, it's the way it's done.

>So there's no notion of ordering with respect to the 1st, 2nd, and 3rd block; either all writes go through the pipeline or none do.
Still correct.

> When the write request returns to the client there will be a local copy, a copy on another machine in the same rack, and a copy on a machine in a different rack; who cares about the ordering inside the pipeline?
Not necessarily. There might not be any additional copy on a different
machine on the same rack. BUT.. As you said, who cares ;) As long as
we have the local copy and some replicas.

I have updated the documentation already. I will open the JIRA and
submit. I have also added subsequent replicas in case replication
factor is > 3.

JM

2013/4/4 lars hofhansl <[EMAIL PROTECTED]>:
> Isn't this done via pipelining anyway?
> So there's no notion of ordering with respect to the 1st, 2nd, and 3rd block; either all writes go through the pipeline or none do.
>
> When the write request returns to the client there will be a local copy, a copy on another machine in the same rack, and a copy on a machine in a different rack; who cares about the ordering inside the pipeline?
>
>
> Seems it would also be inefficient to pipeline from the local rack to another one and then, in the same pipeline, back into the local rack (more load on the switch connecting the racks with no benefit).
>
> I'll double check.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Thursday, April 4, 2013 8:25 AM
> Subject: Re: confused info about region-regionserver locality
>
> Hi,
>
> I think you're right and the documentation needs to be updated.
>
> The 3rd replica is written on a random node in the same rack as the
> 2nd replica. I will double check. Can you please open a JIRA so this
> is updated?
>
> JM
>
> 2013/4/4 KIM JUN YOUNG <[EMAIL PROTECTED]>:
>> Hi All.
>>
>> There is some confusion about region-regionserver locality.
>>
>> From the current documentation:
>>
>> http://hbase.apache.org/book/regions.arch.html
>> 9.7.3. Region-RegionServer Locality
>> Over time, Region-RegionServer locality is achieved via HDFS block replication. The HDFS client does the following by default when choosing locations to write replicas:
>>
>> First replica is written to local node
>> Second replica is written to another node in same rack
>> Third replica is written to a node in another rack (if sufficient nodes)
>>
>>
>> But my understanding is different.
>> HDFS writes the replicas of a block as follows:
>>
>>         first, local node
>>         second, another node in another rack
>>         third, another random node in the same rack
>>
>> Does the documentation need to be changed, or am I missing something?
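
The ordering the thread converges on (first replica local, second on a different rack, third on the same rack as the second, further replicas when the replication factor is > 3) can be sketched as follows. This is a minimal illustration, not the HDFS placement code: `choose_replica_nodes`, the `topology` layout, and the node names are all hypothetical, and the sketch assumes the writer runs on a node that is part of the cluster. Placing the extra replicas on random remaining nodes is also an assumption of this sketch.

```python
import random


def choose_replica_nodes(writer, topology, replication=3, rng=random):
    """Pick target nodes for one block (illustrative sketch only).

    writer      -- node the writing client runs on (assumed to be in topology)
    topology    -- dict mapping rack name -> list of node names (hypothetical)
    replication -- desired number of replicas
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    targets = [writer]  # 1st replica: the local node

    def pick(preferred):
        # Take a random unused node from the preferred set,
        # falling back to any unused node if none qualifies.
        unused = [n for n in rack_of if n not in targets]
        pool = [n for n in unused if n in preferred] or unused
        if pool:
            targets.append(rng.choice(pool))

    if replication >= 2:
        # 2nd replica: a node on a different rack than the writer.
        pick({n for n in rack_of if rack_of[n] != rack_of[writer]})
    if replication >= 3 and len(targets) == 2:
        # 3rd replica: another node on the same rack as the 2nd replica.
        pick(set(topology[rack_of[targets[1]]]))
    # Further replicas: random remaining nodes (assumption of this sketch).
    while len(targets) < replication and len(targets) < len(rack_of):
        pick(set())
    return targets[:replication]


topology = {"rack1": ["a1", "a2", "a3"], "rack2": ["b1", "b2", "b3"]}
print(choose_replica_nodes("a1", topology))
```

With two racks and the writer on `a1`, the second and third targets both come from `rack2`, so a pipeline built in this order crosses the inter-rack switch only once.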
Dave Wang 2013-04-04, 18:50
Jean-Marc Spaggiari 2013-04-04, 18:30