Ajit Ratnaparkhi 2011-09-15, 11:07
Harsh J 2011-09-15, 11:35
Ajit Ratnaparkhi 2011-09-15, 12:31
Dhruba Borthakur 2011-09-15, 17:06
Andrew Purtell 2011-09-15, 17:08
Dhruba Borthakur 2011-09-15, 17:14
Ajit Ratnaparkhi 2011-09-15, 17:54
Andrew Purtell 2011-09-15, 18:01
Ajit Ratnaparkhi 2011-09-16, 05:43
Andrew Purtell 2011-09-17, 16:16
Dhruba Borthakur 2011-09-20, 09:18
Ajit Ratnaparkhi 2011-09-20, 13:49
Andrew Purtell 2011-09-20, 16:03
Dhruba Borthakur 2011-09-20, 16:49
Re: Need help regarding HDFS-RAID
> I will be very grateful to you if you merge and contribute it to Apache Hadoop 0.20.2xx.x.

Hmm... I see what you mean. I was naive about what "branch-20-warehouse" is. I was looking for an updated HDFS RAID that incorporates R-S coding but runs against a 20-ish HDFS. I suppose it is relatively easy to keep HDFS RAID close to what is in trunk if HDFS itself has evolved in your branch. :-)
It looks like the changes to HDFS can be teased apart as:

  - BlockMissingException
  - Listing file status and block locations: LocatedFileStatus, FileSystem.listLocatedStatus
  - Corrupt file reporting
     - Changes to FSNamesystem and UnderReplicatedBlocks for tracking and reporting corrupt blocks
     - Update to the ClientProtocol for listing corrupt file blocks: listCorruptFileBlocks()
     - DFSUtil.getCorruptFiles
  - Change visibility and constructor for datanode.BlockSender so RAID can send repaired blocks without needing to be a DataNode and without reimplementing the packet protocol
  - A set of quite invasive changes to the NameNode dealing with pluggable block placement policies. RAID could possibly live without this, but the PlacementMonitor would have more work to do in that case.

I suppose the upside to any consideration for back porting all of this into a 0.20.2xx is that all of the above has already gone through trunk.
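
To make "send repaired blocks" a bit more concrete, here is a minimal sketch of how a lost block can be rebuilt from the surviving blocks of a stripe plus a parity block. It assumes plain XOR parity over in-memory byte arrays, which is simpler than the R-S coding mentioned above. The class and method names (XorParitySketch, parity, repair) are hypothetical and none of this is Hadoop code or Hadoop API.

```java
import java.util.Arrays;

/** Toy XOR-parity model. Hypothetical names; this is not Hadoop code. */
final class XorParitySketch {

    /** XOR all equal-length blocks of a stripe together into one parity block. */
    static byte[] parity(byte[][] blocks) {
        byte[] out = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= block[i];
            }
        }
        return out;
    }

    /** Rebuild the block at index 'missing' from the surviving blocks plus parity. */
    static byte[] repair(byte[][] blocks, int missing, byte[] parityBlock) {
        byte[] out = parityBlock.clone();
        for (int b = 0; b < blocks.length; b++) {
            if (b == missing) {
                continue; // the lost block is unavailable, so it is skipped
            }
            for (int i = 0; i < out.length; i++) {
                out[i] ^= blocks[b][i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] stripe = { {1, 2, 3, 4}, {9, 8, 7, 6}, {5, 5, 5, 5} };
        byte[] parityBlock = parity(stripe);

        byte[] lost = stripe[1];
        stripe[1] = null; // simulate a missing block
        byte[] rebuilt = repair(stripe, 1, parityBlock);

        System.out.println(Arrays.equals(lost, rebuilt));
    }
}
```

XOR parity can only recover one lost block per stripe, which is part of why R-S coding is attractive. The hard part in the real system is presumably everything around this step: finding the corrupt blocks and getting the repaired bytes back to a datanode, which is what the HDFS changes listed above are for.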
Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>________________________________
>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]>
>Sent: Tuesday, September 20, 2011 9:49 AM
>Subject: Re: Need help regarding HDFS-RAID
>
>
>Hi Andy,
>
>
>I will be very grateful to you if you merge and contribute it to Apache Hadoop 0.20.2xx.x.
>
>
>thanks,
>dhruba
>
>
>On Tue, Sep 20, 2011 at 9:03 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>
>Hi Dhruba,
>>
>>Thanks for the pointer. I'm going to try and pull this code into our internal 20-ish distro. Would you object if I make a contribution of that result if it is successful?
>>
>>
>>
>>Best regards,
>>
>>
>>    - Andy
>>
>>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>
>>>________________________________
>>>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>>>To: Andrew Purtell <[EMAIL PROTECTED]>
>>>Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>>Sent: Tuesday, September 20, 2011 2:18 AM
>>
>>>Subject: Re: Need help regarding HDFS-RAID
>>>
>>>
>>>Hi andy,
>>>
>>>
>>>we do run a version of HDFS RAID that is backported from Apache trunk to a 0.20 based release. Our code is in https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/raid
>>>But I do not have an elegant way to contribute this code to Apache 0.20.2xx.x. 
>>>
>>>
>>>thanks,
>>>dhruba
>>>
>>>
>>>On Sat, Sep 17, 2011 at 9:16 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>>
>>>Hi Dhruba,
>>>>
>>>>
>>>>Would you consider a contribution of this to branch-0.20-security aka 0.20.2xx.x?
>>>>
>>>>
>>>>If I am mistaken and you do not have a 0.22-ish HDFS RAID backported to an 0.20-ish platform, please disregard.
>>>>
>>>>
>>>>Best regards,
>>>>
>>>>
>>>>    - Andy
>>>>
>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>>>
>>>>
>>>>>________________________________
>>>>>From: Dhruba Borthakur <[EMAIL PROTECTED]>
>>>>>To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]>
>>>>>Sent: Thursday, September 15, 2011 10:14 AM
>>>>>
>>>>>Subject: Re: Need help regarding HDFS-RAID
>>>>>
>>>>>
>>>>>
>>>>>That's right Andy. 0.22+. We are running a HDFS-RAID code base that is pretty close to what is available in Apache hdfs trunk.
>>>>>
>>>>>
>>>>>-dhruba
>>>>>
>>>>>
>>>>>On Thu, Sep 15, 2011 at 10:08 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>But that is the HDFS RAID effectively in 0.22+, not 0.21, right Dhruba?
>>>>>>
>>>>>> 
>>>>>>Best regards,