HBase >> mail # user >> confused info about region-regionserver locality


Re: confused info about region-regionserver locality
I have created HBASE-8269 for this documentation update.

2013/4/4 Jean-Marc Spaggiari <[EMAIL PROTECTED]>:
>>Isn't this done via pipelining anyway?
> Yes, it's the way it's done.
>
>>So there's no notion of ordering with respect to the 1st, 2nd, and 3rd block; either all writes go through the pipeline or none do.
> Still correct.
>
>> When the write request returns to the client there will be a local copy, a copy on another machine in the same rack, and a copy on a machine in a different rack. Who cares about the ordering inside the pipeline?
> Not necessarily. There might not be any additional copy on a different
> machine on the same rack. BUT.. As you said, who cares ;) As long as
> we have the local copy and some replicas.
>
> I have updated the documentation already. I will open the JIRA and
> submit. I have also added subsequent replicas in case the replication
> factor is > 3.
>
> JM
>
> 2013/4/4 lars hofhansl <[EMAIL PROTECTED]>:
>> Isn't this done via pipelining anyway?
>> So there's no notion of ordering with respect to the 1st, 2nd, and 3rd block; either all writes go through the pipeline or none do.
>>
>> When the write request returns to the client there will be a local copy, a copy on another machine in the same rack, and a copy on a machine in a different rack. Who cares about the ordering inside the pipeline?
>>
>>
>> It also seems inefficient to pipeline from the local rack to another one and then, in the same pipeline, back into the local rack (more load on the switch connecting the racks with no benefit).
>>
>> I'll double check.
>>
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>>  From: Jean-Marc Spaggiari <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Sent: Thursday, April 4, 2013 8:25 AM
>> Subject: Re: confused info about region-regionserver locality
>
>
>>
>> Hi,
>>
>> I think you're right and the documentation needs to be updated.
>>
>> The 3rd replica is written on a random node in the same rack as the
>> 2nd replica. I will double check. Can you please open a JIRA so this
>> is updated?
>>
>> JM
>>
>> 2013/4/4 KIM JUN YOUNG <[EMAIL PROTECTED]>:
>>> Hi All.
>>>
>>> There is some confusion about Region-RegionServer locality.
>>>
>>> From the current documentation:
>>>
>>> http://hbase.apache.org/book/regions.arch.html
>>> 9.7.3. Region-RegionServer Locality
>>> Over time, Region-RegionServer locality is achieved via HDFS block replication. The HDFS client does the following by default when choosing locations to write replicas:
>>>
>>> First replica is written to local node
>>> Second replica is written to another node in same rack
>>> Third replica is written to a node in another rack (if sufficient nodes)
>>>
>>>
>>> But my understanding is different.
>>> HDFS writes block replicas as follows:
>>>
>>>         first, the local node
>>>         second, another node in another rack
>>>         third, another random node in the same rack
>>>
>>> Does the documentation need to be changed, or am I missing something?
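
For anyone finding this thread later, here is a minimal sketch of the ordering the thread settles on: first replica on the local node, second on a node in another rack, third on a different node in the same rack as the second, and any further replicas on random remaining nodes. This is not the actual HDFS placement code. The names choose_replica_nodes and cross_rack_hops, the topology dictionary, and the single-rack fallback are all assumptions made for illustration. The second helper counts rack crossings in a pipeline, which is the switch-load concern Lars raised.

```python
import random
from typing import Dict, List, Optional


def choose_replica_nodes(
    topology: Dict[str, List[str]],
    writer: str,
    replication: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return target nodes in pipeline order (hypothetical helper, not an HDFS API).

    topology maps a rack name to the nodes in that rack.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    if writer not in rack_of:
        raise ValueError(f"unknown writer node: {writer}")

    targets = [writer]  # 1st replica: the local node

    def pick(candidates):
        # Choose a random candidate that does not already hold a replica.
        free = [n for n in candidates if n not in targets]
        return rng.choice(free) if free else None

    if replication >= 2:
        # 2nd replica: a node in a different rack than the writer.
        remote = [n for n in rack_of if rack_of[n] != rack_of[writer]]
        # Assumed fallback: with a single rack, use any other node.
        second = pick(remote) or pick(rack_of)
        if second:
            targets.append(second)

    if replication >= 3 and len(targets) == 2:
        # 3rd replica: a different node in the same rack as the 2nd.
        third = pick(topology[rack_of[targets[1]]]) or pick(rack_of)
        if third:
            targets.append(third)

    # Replication factor > 3: remaining replicas go to random unused nodes.
    while len(targets) < replication:
        extra = pick(rack_of)
        if extra is None:
            break  # not enough nodes in the cluster
        targets.append(extra)

    return targets[:replication]


def cross_rack_hops(topology: Dict[str, List[str]], pipeline: List[str]) -> int:
    """Count how many consecutive pipeline hops cross a rack boundary."""
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    return sum(rack_of[a] != rack_of[b] for a, b in zip(pipeline, pipeline[1:]))
```

With two racks, a pipeline chosen this way crosses the rack boundary once (local node to remote rack, then a second node in that remote rack). A pipeline that went local, remote, and back to the local rack would cross it twice.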