Re: Need help regarding HDFS-RAID
Hi Dhruba,

Thanks for the pointer. I'm going to try to pull this code into our internal 0.20-ish distro. If that succeeds, would you object to my contributing the result?
Best regards,
    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

>________________________________
>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>To: Andrew Purtell <[EMAIL PROTECTED]>
>Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>Sent: Tuesday, September 20, 2011 2:18 AM
>Subject: Re: Need help regarding HDFS-RAID
>
>
>Hi Andy,
>
>
>We do run a version of HDFS RAID that is backported from Apache trunk to a 0.20-based release. Our code is at https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/raid
>But I do not have an elegant way to contribute this code to Apache 0.20.2xx.x.
>
>
>thanks,
>dhruba
>
>
>On Sat, Sep 17, 2011 at 9:16 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>
>>Hi Dhruba,
>>
>>
>>Would you consider a contribution of this to branch-0.20-security aka 0.20.2xx.x?
>>
>>
>>If I am mistaken and you do not have a 0.22-ish HDFS RAID backported to an 0.20-ish platform, please disregard.
>>
>>
>>Best regards,
>>
>>
>>    - Andy
>>
>>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>
>>
>>>________________________________
>>>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>>>To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]>
>>>Sent: Thursday, September 15, 2011 10:14 AM
>>>
>>>Subject: Re: Need help regarding HDFS-RAID
>>>
>>>
>>>
>>>That's right, Andy. 0.22+. We are running an HDFS-RAID code base that is pretty close to what is available in Apache HDFS trunk.
>>>
>>>
>>>-dhruba
>>>
>>>
>>>On Thu, Sep 15, 2011 at 10:08 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>>
>>>>But that is the HDFS RAID effectively in 0.22+, not 0.21, right Dhruba?
>>>>
>>>> 
>>>>Best regards,
>>>>
>>>>
>>>>       - Andy
>>>>
>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>>>
>>>>
>>>>>________________________________
>>>>>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>>>>>To: [EMAIL PROTECTED]
>>>>>Sent: Thursday, September 15, 2011 10:06 AM
>>>>>Subject: Re: Need help regarding HDFS-RAID
>>>>>
>>>>>
>>>>>
>>>>>We use HDFS RAID in a big way. Data older than 12 days is RAIDed using XOR encoding (effective replication of 2.5). Data older than a few months is RAIDed using Reed-Solomon (effective observed replication factor of 1.5). This has been running on our 60 PB cluster for about a year now.
>>>>>
>>>>>
>>>>>thanks
>>>>>dhruba
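
As a rough illustration of where effective replication figures like the ones above come from, here is a minimal sketch of the arithmetic for a fully populated stripe. The stripe length, parity block counts, and replication factors below are assumptions chosen for illustration, not the configuration described in this thread, and the function name is hypothetical. With these assumed values the results (2.2 and 1.4) come out lower than the 2.5 and 1.5 quoted above; files that only partly fill a stripe still carry the full parity count, so observed figures on a real cluster can be higher.

```python
from fractions import Fraction


def effective_replication(stripe_len, parity_len, data_rep, parity_rep):
    """Stored blocks per logical data block, for one full stripe.

    stripe_len  -- data blocks per stripe (assumed value)
    parity_len  -- parity blocks generated per stripe (assumed value)
    data_rep    -- replication factor kept on data blocks after raiding
    parity_rep  -- replication factor kept on parity blocks
    """
    stored = stripe_len * data_rep + parity_len * parity_rep
    return Fraction(stored, stripe_len)


# Plain 3x replication, no parity: every data block is stored three times.
plain = effective_replication(stripe_len=1, parity_len=0, data_rep=3, parity_rep=0)

# XOR: one parity block per stripe, data and parity both kept at 2 replicas
# (illustrative assumption).
xor = effective_replication(stripe_len=10, parity_len=1, data_rep=2, parity_rep=2)

# Reed-Solomon: four parity blocks per stripe, single replica of everything
# (illustrative assumption).
rs = effective_replication(stripe_len=10, parity_len=4, data_rep=1, parity_rep=1)

print(float(plain), float(xor), float(rs))
```

Using `Fraction` keeps the ratios exact, which makes it easy to compare schemes or plug in a different stripe length.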
>>>>>
>>>>>
>>>>>
>>>>>On Thu, Sep 15, 2011 at 5:31 AM, Ajit Ratnaparkhi <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>
>>>>>>We were planning to use it for archiving past data (instead of moving it to an archival store).
>>>>>>Archiving it in HDFS has the advantage of keeping it easily available for processing whenever required.
>>>>>>
>>>>>>
>>>>>>Is there any archival solution in the Hadoop ecosystem?
>>>>>>
>>>>>>
>>>>>>thanks,
>>>>>>Ajit.
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Thu, Sep 15, 2011 at 5:05 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>>Hey Ajit,
>>>>>>>
>>>>>>>HDFS-RAID was never part of the 0.20 release. It made its debut in the
>>>>>>>0.21 release [1]. I know that Facebook uses it (and also did develop
>>>>>>>it), but unsure of users beyond Facebook.
>>>>>>>
>>>>>>>While 0.21 overall is not entirely deemed as production-usable yet
>>>>>>>(and is in fact, possibly abandoned for efforts on 0.22+), you can
>>>>>>>give that release a whirl on a test cluster and see for yourself if
>>>>>>>your need beats the stability.
>>>>>>>
>>>>>>>Just curious though - why are you looking to use this specifically?
>>>>>>>
>>>>>>>[1] - http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21/mapreduce/src/contrib/raid/
>>>>>>>
>>>>>>>
>>>>>>>On Thu, Sep 15, 2011 at 4:37 PM, Ajit Ratnaparkhi
>>>>>>><[EMAIL PROTECTED]> wrote: