Hadoop >> mail # user >> RE: is HDFS RAID "data locality" efficient?


Sourygna Luangsay 2012-08-09, 10:55
Michael Segel 2012-08-09, 10:34
Re: is HDFS RAID "data locality" efficient?
Nice explanation guys .. thanks

Syed Abdul kather
Sent from Samsung S3
On Aug 9, 2012 12:02 AM, "Ajit Ratnaparkhi [via Lucene]" <ml-node+[EMAIL PROTECTED]> wrote:

> Agreed with Steve.
> That is the most important use of HDFS RAID: you consume less disk space
> with the same reliability and availability guarantees, at the cost of
> processing performance. Most data in HDFS is cold data. Without HDFS RAID
> you end up maintaining 3 replicas of data that is hardly ever going to be
> processed again, but you can't remove this data or move it to a separate
> archive, because if processing is required it should happen as soon as
> possible.
>
> -Ajit
>
> On Wed, Aug 8, 2012 at 11:01 PM, Steve Loughran <[hidden email]> wrote:
>
>>
>>
>> On 8 August 2012 09:46, Sourygna Luangsay <[hidden email]> wrote:
>>
>>> Hi folks!
>>>
>>> One scenario I can think of to take advantage of HDFS RAID
>>> without suffering this penalty is:
>>>
>>> - Using normal HDFS with default replication=3 for my
>>>   "fresh data"
>>>
>>> - Using HDFS RAID for my historical data (that is barely
>>>   used by M/R)
>>>
>> Exactly: less space used on cold data, with the penalty that access
>> performance can be worse. As the majority of data on a Hadoop cluster is
>> usually "cold", it's a space- and power-efficient story for the archive data.
>>
>> --
>> Steve Loughran
>> Hortonworks Inc
>>
>>
>
>
-----
THANKS AND REGARDS,
SYED ABDUL KATHER
--
View this message in context: http://lucene.472066.n3.nabble.com/is-HDFS-RAID-data-locality-efficient-tp3999891p3999924.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
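
To put rough numbers on the trade-off discussed in this thread (3 replicas for fresh data, HDFS RAID for cold data), here is a minimal sketch in Python. Nothing in it comes from the thread itself: the function names are hypothetical, and the stripe layouts (10 data blocks plus 1 parity block kept at 2 replicas each, or 10 data blocks plus 4 parity blocks kept at 1 replica each) and the 80% cold fraction are illustrative assumptions only. Check the values your own cluster is configured with before relying on any of these figures.

```python
def effective_replication(data_blocks, parity_blocks, data_replicas, parity_replicas):
    """Raw blocks stored per logical data block for one stripe.

    All four parameters are assumptions chosen by the caller; they are not
    read from any Hadoop configuration.
    """
    stored = data_blocks * data_replicas + parity_blocks * parity_replicas
    return stored / data_blocks


def raw_storage(logical_size, cold_fraction, cold_factor, hot_factor=3.0):
    """Raw disk space needed when cold data uses a cheaper storage factor.

    logical_size  -- amount of user data, in any unit (TB, PB, ...)
    cold_fraction -- share of the data that is rarely read (0.0 to 1.0)
    cold_factor   -- raw bytes stored per logical byte of cold data
    hot_factor    -- raw bytes stored per logical byte of fresh data
    """
    hot = logical_size * (1.0 - cold_fraction) * hot_factor
    cold = logical_size * cold_fraction * cold_factor
    return hot + cold


if __name__ == "__main__":
    # Plain triple replication: every block is simply stored three times.
    plain = effective_replication(1, 0, 3, 0)

    # Assumed single-parity layout: 10 data + 1 parity block, 2 replicas each.
    single_parity = effective_replication(10, 1, 2, 2)

    # Assumed multi-parity layout: 10 data + 4 parity blocks, 1 replica each.
    multi_parity = effective_replication(10, 4, 1, 1)

    for name, factor in [("plain x3", plain),
                         ("single parity", single_parity),
                         ("multi parity", multi_parity)]:
        # Assumed workload: 100 units of data, 80% of it cold.
        total = raw_storage(100.0, 0.8, factor)
        print(f"{name}: factor {factor:.2f}, raw storage {total:.1f} units")
```

The saving grows with the share of cold data, which is why the approach only pays off when most of the cluster is archive data. The read-performance penalty mentioned above is not modelled here.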