Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3
This is probably a more relevant question for the CDH mailing lists. That said,
what Edward is suggesting seems reasonable: reduce the replication factor,
decommission some of the nodes, create a new cluster with those nodes,
and do a distcp.
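If the RPC versions of Apache 0.20.205 and CDH3 turn out to be incompatible, the
distcp usually has to read the source cluster over HFTP. A minimal sketch, with
hostnames, ports, and paths as placeholders rather than values from this thread:

  # Run from the new (CDH3) cluster; HFTP is read-only, so the source
  # cluster is never written to. Ports assume stock defaults.
  hadoop distcp hftp://old-namenode:50070/user/data \
                hdfs://new-namenode:8020/user/data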

Could you share with us the reasons you want to migrate from Apache 205?

Regards,
Suresh

On Thu, May 3, 2012 at 8:25 AM, Edward Capriolo <[EMAIL PROTECTED]>wrote:

> Honestly that is a hassle, going from 205 to cdh3u3 is probably more
> of a cross-grade than an upgrade or downgrade. I would just stick it
> out. But yes like Michael said two clusters on the same gear and
> distcp. If you are using RF=3 you could also lower your replication to
> rf=2 'hadoop dfs -setrep 2' to clear headroom as you are moving
> stuff.
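A sketch of that replication change on the 0.20-era shell, with the path and
the arithmetic only illustrative (the thread never settles whether the 8 TB
figure is raw disk usage or effective data):

  # Drop replication from 3 to 2 across the whole namespace; -R recurses,
  # -w waits until the excess replicas have actually been removed.
  hadoop dfs -setrep -R -w 2 /

  # Rough headroom math: if ~8 TB of raw disk is occupied at RF=3,
  # going to RF=2 frees about a third of it, i.e. roughly 2.7 TB.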
>
>
> On Thu, May 3, 2012 at 7:25 AM, Michel Segel <[EMAIL PROTECTED]>
> wrote:
> > Ok... When you get your new hardware...
> >
> > Set up one server as your new NN, JT, SN.
> > Set up the others as a DN.
> > (Cloudera CDH3u3)
> >
> > On your existing cluster...
> > Remove your old log files, temp files on HDFS, anything you don't need.
> > This should give you some more space.
> > Start copying some of the directories/files to the new cluster.
> > As you gain space, decommission a node, rebalance, add node to new
> cluster...
> >
> > It's a slow process.
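A sketch of the decommission-and-rebalance loop described above, assuming
hdfs-site.xml already points dfs.hosts.exclude at an excludes file (the
hostname and file path below are placeholders):

  # 1. List the datanode in the excludes file, then have the namenode
  #    re-read its host lists.
  echo "dn-to-remove.example.com" >> /etc/hadoop/conf/excludes
  hadoop dfsadmin -refreshNodes

  # 2. Wait until the node is reported as "Decommissioned".
  hadoop dfsadmin -report

  # 3. Rebalance what is left (threshold is a percentage of disk usage).
  hadoop balancer -threshold 10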
> >
> > Should I remind you to make sure you up your bandwidth setting, and to
> clean up the hdfs directories when you repurpose the nodes?
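If the bandwidth setting meant here is the balancer's, the knob on 0.20/CDH3
is dfs.balance.bandwidthPerSec in hdfs-site.xml (bytes per second, 1 MB/s by
default, and a datanode restart is needed to pick it up). The value and the
data directory below are placeholders:

  # In hdfs-site.xml on each datanode:
  #   <property>
  #     <name>dfs.balance.bandwidthPerSec</name>
  #     <value>104857600</value>   <!-- ~100 MB/s instead of the 1 MB/s default -->
  #   </property>

  # When a decommissioned node is repurposed into the new cluster, wipe its
  # old block storage (whatever dfs.data.dir points at) before reuse:
  rm -rf /data/dfs/dn/*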
> >
> > Does this make sense?
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On May 3, 2012, at 5:46 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
> >
> >> Yeah I know :-)
> >> and this is not a production cluster ;-) and yes there is more hardware
> >> coming :-)
> >>
> >> On Thu, May 3, 2012 at 4:10 PM, Michel Segel <[EMAIL PROTECTED]
> >wrote:
> >>
> >>> Well, you've kind of painted yourself into a corner...
> >>> Not sure why you didn't get a response from the Cloudera lists, but
> it's a
> >>> generic question...
> >>>
> >>> 8 out of 10 TB. Are you talking effective storage or actual disks?
> >>> And please tell me you've already ordered more hardware.. Right?
> >>>
> >>> And please tell me this isn't your production cluster...
> >>>
> >>> (Strong hint to Strata and Cloudera... You really want to accept my
> >>> upcoming proposal talk... ;-)
> >>>
> >>>
> >>> Sent from a remote device. Please excuse any typos...
> >>>
> >>> Mike Segel
> >>>
> >>> On May 3, 2012, at 5:25 AM, Austin Chungath <[EMAIL PROTECTED]>
> wrote:
> >>>
> >>>> Yes. This was first posted on the cloudera mailing list. There were no
> >>>> responses.
> >>>>
> >>>> But this is not related to cloudera as such.
> >>>>
> >>>> cdh3 uses apache hadoop 0.20 as its base. My data is in apache
> >>>> hadoop 0.20.205
> >>>>
> >>>> There is an upgrade namenode option when we are migrating to a higher
> >>>> version, say from 0.20 to 0.20.205,
> >>>> but here I am downgrading from 0.20.205 to 0.20 (cdh3).
> >>>> Is this possible?
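For reference, the upgrade path mentioned here is driven by starting HDFS with
the -upgrade flag; a sketch of that flow when moving to a newer layout version
(it does not answer the downgrade question being asked):

  # With the cluster stopped and the newer Hadoop version installed:
  start-dfs.sh -upgrade
  # ... verify the data, then make the new layout permanent:
  hadoop dfsadmin -finalizeUpgrade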
> >>>>
> >>>>
> >>>> On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi <
> [EMAIL PROTECTED]
> >>>> wrote:
> >>>>
> >>>>> Seems like a matter of upgrade. I am not a Cloudera user so would not
> >>> know
> >>>>> much, but you might find some help moving this to the Cloudera
> >>>>> mailing list.
> >>>>>
> >>>>> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath <[EMAIL PROTECTED]>
> >>>>> wrote:
> >>>>>
> >>>>>> There is only one cluster. I am not copying between clusters.
> >>>>>>
> >>>>>> Say I have a cluster running apache 0.20.205 with 10 TB storage
> >>>>>> capacity and about 8 TB of data.
> >>>>>> Now how can I migrate the same cluster to use cdh3 and keep that
> >>>>>> same 8 TB of data?
> >>>>>>
> >>>>>> I can't copy 8 TB of data using distcp because I have only 2 TB of
> >>>>>> free space.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar <
> [EMAIL PROTECTED]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> you can actually look at the distcp