Hadoop >> mail # user >> Best practice to migrate HDFS from 0.20.205 to CDH3u3


Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3
Honestly, that is a hassle; going from 205 to cdh3u3 is probably more
of a cross-grade than an upgrade or downgrade. I would just stick it
out. But yes, like Michael said: two clusters on the same gear and
distcp. If you are using RF=3 you could also lower your replication to
RF=2 ('hadoop dfs -setrep 2 <path>') to clear headroom as you are moving
stuff.
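The replication change suggested above would look roughly like this; the path is a placeholder, and this is a sketch of the 0.20-era shell, not a verified procedure:

```shell
# Lower the replication factor for existing data to free raw disk space.
# /user/data is a placeholder path; -R recurses, -w waits until done.
hadoop dfs -setrep -R -w 2 /user/data

# Check the reclaimed headroom.
hadoop dfsadmin -report
```

Note that new files will still be written at the configured default (dfs.replication) unless that setting is lowered too.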
On Thu, May 3, 2012 at 7:25 AM, Michel Segel <[EMAIL PROTECTED]> wrote:
> Ok... When you get your new hardware...
>
> Set up one server as your new NN, JT, SN.
> Set up the others as a DN.
> (Cloudera CDH3u3)
>
> On your existing cluster...
> Remove your old log files, temp files on HDFS anything you don't need.
> This should give you some more space.
> Start copying some of the directories/files to the new cluster.
> As you gain space, decommission a node, rebalance, add node to new cluster...
>
> It's a slow process.
>
> Should I remind you to make sure you up your bandwidth setting, and to clean up the HDFS directories when you repurpose the nodes?
>
> Does this make sense?
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
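The decommission-and-rebalance loop Mike outlines can be sketched as follows; the hostname, file path, and threshold are placeholders, and the property names are from the 0.20-era configuration:

```shell
# On the OLD cluster: drain a datanode so it can be repurposed.
# The file must be the one named by dfs.hosts.exclude in hdfs-site.xml.
echo "dn5.example.com" >> /etc/hadoop/conf/dfs.exclude   # placeholder host
hadoop dfsadmin -refreshNodes    # namenode starts moving its blocks off

# Rebalance the remaining nodes. Raising dfs.balance.bandwidthPerSec
# (bytes/sec, in hdfs-site.xml) is the "bandwidth setting" to up.
hadoop balancer -threshold 5
```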
>
> On May 3, 2012, at 5:46 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
>
>> Yeah I know :-)
>> and this is not a production cluster ;-) and yes there is more hardware
>> coming :-)
>>
>> On Thu, May 3, 2012 at 4:10 PM, Michel Segel <[EMAIL PROTECTED]>wrote:
>>
>>> Well, you've kind of painted yourself in to a corner...
>>> Not sure why you didn't get a response from the Cloudera lists, but it's a
>>> generic question...
>>>
>>> 8 out of 10 TB. Are you talking effective storage or actual disks?
>>> And please tell me you've already ordered more hardware.. Right?
>>>
>>> And please tell me this isn't your production cluster...
>>>
>>> (Strong hint to Strata and Cloudera... You really want to accept my
>>> upcoming proposal talk... ;-)
>>>
>>>
>>> Sent from a remote device. Please excuse any typos...
>>>
>>> Mike Segel
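Whether those numbers are raw disks or effective storage changes the math a lot. A back-of-envelope sketch, assuming (and it is only an assumption) that "8 out of 10 TB" is raw on-disk usage at RF=3:

```shell
# Headroom gained by dropping RF from 3 to 2, ASSUMING raw figures.
awk 'BEGIN {
  capacity  = 10; raw_used = 8         # TB, raw (assumed)
  logical   = raw_used / 3             # data actually stored
  raw_after = logical * 2              # raw usage after setrep 2
  printf "freed=%.2f free_after=%.2f\n", raw_used - raw_after, capacity - raw_after
}'
# → freed=2.67 free_after=4.67
```

If 8 TB were instead the effective (logical) figure, 24 TB of raw disk would already be needed at RF=3, which is why the question matters.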
>>>
>>> On May 3, 2012, at 5:25 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
>>>
>>>> Yes. This was first posted on the cloudera mailing list. There were no
>>>> responses.
>>>>
>>>> But this is not related to cloudera as such.
>>>>
>>>> cdh3 uses apache hadoop 0.20 as its base. My data is in apache
>>>> hadoop 0.20.205.
>>>>
>>>> There is an upgrade namenode option when we are migrating to a higher
>>>> version say from 0.20 to 0.20.205
>>>> but here I am downgrading from 0.20.205 to 0.20 (cdh3)
>>>> Is this possible?
>>>>
>>>>
>>>> On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi <[EMAIL PROTECTED]
>>>> wrote:
>>>>
>>>>> Seems like a matter of upgrade. I am not a Cloudera user so would not
>>>>> know much, but you might find some help moving this to the Cloudera
>>>>> mailing list.
>>>>>
>>>>> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>
>>>>>> There is only one cluster. I am not copying between clusters.
>>>>>>
>>>>>> Say I have a cluster running apache 0.20.205 with 10 TB storage
>>>>>> capacity and about 8 TB of data.
>>>>>> Now how can I migrate the same cluster to use cdh3 with that same
>>>>>> 8 TB of data?
>>>>>>
>>>>>> I can't copy 8 TB of data using distcp because I have only 2 TB of free
>>>>>> space
>>>>>>
>>>>>>
>>>>>> On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar <[EMAIL PROTECTED]>
>>>>>> wrote:
>>>>>>
>>>>>>> you can actually look at the distcp
>>>>>>>
>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>>>>>>>
>>>>>>> but this means that you have two separate clusters available to do
>>>>>>> the migration
>>>>>>>
>>>>>>> On Thu, May 3, 2012 at 12:51 PM, Austin Chungath <[EMAIL PROTECTED]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for the suggestions,
>>>>>>>> My concern is that I can't actually copyToLocal from the DFS
>>>>>>>> because the data is huge.
>>>>>>>>
>>>>>>>> Say if my hadoop was 0.20 and I am upgrading to 0.20.205, I can do a
>>>>>>>> namenode upgrade. I don't have to copy data out of DFS.
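The in-place upgrade path being referred to is roughly the following 0.20-era sequence; it is a sketch, and the exact steps should be checked against the target release's upgrade notes before touching real data:

```shell
# 1. Stop everything cleanly, then bring DFS up with the upgrade flag;
#    the namenode keeps the previous image until the upgrade is finalized.
stop-all.sh
start-dfs.sh -upgrade

# 2. Monitor the upgrade.
hadoop dfsadmin -upgradeProgress status

# 3. Only after validating the new version, make it permanent
#    (this discards the rollback image).
hadoop dfsadmin -finalizeUpgrade
```

Until the upgrade is finalized, `start-dfs.sh -rollback` can return the cluster to the previous version.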