Best practice to migrate HDFS from 0.20.205 to CDH3u3 (Hadoop user mailing list)


Austin Chungath 2012-05-03, 06:11
Nitin Pawar 2012-05-03, 06:53
Austin Chungath 2012-05-03, 07:21
Nitin Pawar 2012-05-03, 09:42
Austin Chungath 2012-05-03, 09:51
Prashant Kommireddi 2012-05-03, 09:55
Austin Chungath 2012-05-03, 10:25
Michel Segel 2012-05-03, 10:40
Austin Chungath 2012-05-03, 10:46
Michel Segel 2012-05-03, 11:25
Edward Capriolo 2012-05-03, 15:25
Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3
This is probably a more relevant question for the CDH mailing lists. That said,
what Edward is suggesting seems reasonable: reduce the replication factor,
decommission some of the nodes, create a new cluster with those nodes, and
distcp the data over.
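Roughly, that sequence might look like the following (hostnames, ports and paths
are placeholders, and the excludes file is whatever dfs.hosts.exclude already
points to):

    # Decommission a DataNode so it can be reinstalled in the new cluster:
    # add its hostname to the excludes file, then have the NameNode re-read it.
    echo old-dn1.example.com >> /etc/hadoop/conf/dfs.exclude
    hadoop dfsadmin -refreshNodes

    # Copy data across. For a cross-version copy, distcp is normally run on the
    # destination (CDH3) cluster, reading the source over HFTP, which is
    # read-only and version-independent (50070 = default NameNode HTTP port).
    hadoop distcp hftp://old-nn.example.com:50070/user/data \
                  hdfs://new-nn.example.com:8020/user/data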

Could you share with us the reasons you want to migrate from Apache 205?

Regards,
Suresh

On Thu, May 3, 2012 at 8:25 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> Honestly that is a hassle; going from 205 to cdh3u3 is probably more
> of a cross-grade than an upgrade or downgrade. I would just stick it
> out. But yes, like Michael said, two clusters on the same gear and
> distcp. If you are using RF=3 you could also lower your replication to
> rf=2 ('hadoop dfs -setrep 2') to clear headroom as you are moving
> stuff.
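A minimal sketch of that replication change (the path below is a placeholder;
pick whatever tree is safe to keep at RF=2):

    # Drop replication from 3 to 2 recursively to clear headroom on the old cluster.
    hadoop dfs -setrep -R 2 /user

    # Watch the freed capacity appear as excess replicas are deleted.
    hadoop dfsadmin -report | grep -i 'DFS Remaining'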
>
>
> On Thu, May 3, 2012 at 7:25 AM, Michel Segel <[EMAIL PROTECTED]>
> wrote:
> > Ok... When you get your new hardware...
> >
> > Set up one server as your new NN, JT, SN.
> > Set up the others as DNs.
> > (Cloudera CDH3u3)
> >
> > On your existing cluster...
> > Remove your old log files, temp files on HDFS, anything you don't need.
> > This should give you some more space.
> > Start copying some of the directories/files to the new cluster.
> > As you gain space, decommission a node, rebalance, add the node to the
> > new cluster...
> >
> > It's a slow process.
> >
> > Should I remind you to make sure you up your bandwidth setting, and to
> > clean up the HDFS directories when you repurpose the nodes?
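The bandwidth setting referred to here is most likely the balancer throttle; a
sketch of both reminders, assuming 0.20-era property names (the data directory
below is a placeholder, check dfs.data.dir on your nodes before wiping anything):

    # In hdfs-site.xml on each DataNode: raise the per-node balancer bandwidth
    # (bytes/sec; the 0.20-era name is dfs.balance.bandwidthPerSec, default 1 MB/s).
    #   <property>
    #     <name>dfs.balance.bandwidthPerSec</name>
    #     <value>10485760</value>   <!-- 10 MB/s -->
    #   </property>

    # Rebalance after nodes are added or removed.
    hadoop balancer -threshold 10

    # When repurposing a decommissioned node into the new cluster, wipe its old
    # HDFS block directories so the CDH3 DataNode starts clean.
    rm -rf /data/1/dfs/dn/*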
> >
> > Does this make sense?
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On May 3, 2012, at 5:46 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
> >
> >> Yeah I know :-)
> >> and this is not a production cluster ;-) and yes there is more hardware
> >> coming :-)
> >>
> >>> On Thu, May 3, 2012 at 4:10 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
> >>
> >>> Well, you've kind of painted yourself in to a corner...
> >>> Not sure why you didn't get a response from the Cloudera lists, but
> it's a
> >>> generic question...
> >>>
> >>> 8 out of 10 TB. Are you talking effective storage or actual disks?
> >>> And please tell me you've already ordered more hardware.. Right?
> >>>
> >>> And please tell me this isn't your production cluster...
> >>>
> >>> (Strong hint to Strata and Cloudera... You really want to accept my
> >>> upcoming proposal talk... ;-)
> >>>
> >>>
> >>> Sent from a remote device. Please excuse any typos...
> >>>
> >>> Mike Segel
> >>>
> >>> On May 3, 2012, at 5:25 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Yes. This was first posted on the cloudera mailing list. There were no
> >>>> responses.
> >>>>
> >>>> But this is not related to cloudera as such.
> >>>>
> >>>> cdh3 uses apache hadoop 0.20 as its base. My data is in apache
> >>>> hadoop 0.20.205.
> >>>>
> >>>> There is an upgrade namenode option when migrating to a higher
> >>>> version, say from 0.20 to 0.20.205, but here I am downgrading
> >>>> from 0.20.205 to 0.20 (cdh3). Is this possible?
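For context, the option being referred to is the NameNode's -upgrade startup
flag; an ordinary version bump looks roughly like this (the reverse direction
has no equivalent flag, only -rollback to a previously saved state, which is
what makes this case awkward):

    # Start HDFS once with -upgrade so the NameNode converts the on-disk layout,
    # then finalize once everything checks out.
    start-dfs.sh -upgrade
    hadoop dfsadmin -finalizeUpgrade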
> >>>>
> >>>>
> >>>> On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Seems like a matter of upgrade. I am not a Cloudera user so would not
> >>>>> know much, but you might find some help moving this to the Cloudera
> >>>>> mailing list.
> >>>>>
> >>>>> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath <[EMAIL PROTECTED]>
> >>>>> wrote:
> >>>>>
> >>>>>> There is only one cluster. I am not copying between clusters.
> >>>>>>
> >>>>>> Say I have a cluster running apache 0.20.205 with 10 TB storage
> >>>>>> capacity and about 8 TB of data.
> >>>>>> Now how can I migrate that same cluster to cdh3 and keep using the
> >>>>>> same 8 TB of data?
> >>>>>>
> >>>>>> I can't copy 8 TB of data using distcp because I have only 2 TB of
> >>>>>> free space.
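A back-of-envelope check of how much headroom the replication drop suggested
above would buy, assuming the 8 TB figure is raw DataNode usage at the default
replication factor of 3 (if it is the logical size, the numbers change):

    # logical data       ~ 8 TB raw / 3 replicas = ~2.7 TB
    # raw usage at RF=2  ~ 2.7 TB * 2            = ~5.3 TB
    # freed capacity     ~ 8 TB - 5.3 TB         = ~2.7 TB  (so ~4.7 TB of the 10 TB free)
    hadoop dfsadmin -report   # confirm configured capacity vs. DFS used first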
> >>>>>>
> >>>>>>
> >>>>>> On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> >>>>>>
> >>>>>>> you can actually look at the distcp
Michel Segel 2012-05-03, 23:00
Austin Chungath 2012-05-07, 10:27
Austin Chungath 2012-05-07, 11:14
Nitin Pawar 2012-05-07, 11:29
Adam Faris 2012-05-07, 14:37
Austin Chungath 2012-05-08, 05:55
Adam Faris 2012-05-08, 18:22
Austin Chungath 2012-05-09, 11:25