MapReduce, mail # user - Re: CDH412/Hadoop 2.0.3 Upgrade instructions


Re: CDH412/Hadoop 2.0.3 Upgrade instructions
Harsh J 2013-01-22, 17:38
Moving to [EMAIL PROTECTED] as your question is CDH related.

My answers inline:

On Tue, Jan 22, 2013 at 4:35 AM, Dheeren bebortha
<[EMAIL PROTECTED]> wrote:

> I am trying to upgrade a Hadoop Cluster with 0.20.X and MRv1 to a hadoop
> Cluster with CDH412 with HA+QJM+YARN (aka Hadoop 2.0.3) without any data
> loss and minimal down time. The documentation on cloudera site is OK, but
> very confusing. BTW I do not plan on using Cloudera manager. Has anyone
> attempted a clean upgrade using hadoop native commands?
>

The upgrade process for any 0.20/1.x/CDH3 release to CDH4 is documented at
https://ccp.cloudera.com/display/CDH4DOC/Upgrading+from+CDH3+to+CDH4. The
only difference you may see is in the packaging used (tarballs or RPMs/DEBs?)
and, therefore, in the usernames used in the guide.

The basic process is to stop the older HDFS, carefully remove the older
installation and all its traces, and start the newer HDFS with the -upgrade
flag. This takes care of the HDFS metadata upgrade. Once that is done and
you've verified that files/etc. are all perfectly readable and the state is
good, you can dfsadmin -finalizeUpgrade your cluster to commit the upgrade
permanently. QJM is documented in a separate guide, found on the same portal
mentioned above, and can be set up in a second step after the upgrade to
achieve full HA.
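
As a rough illustration of the order of operations above, here is a dry-run
sketch that only prints the steps and runs nothing. The script names assume a
tarball-style install with Hadoop's bin/ and sbin/ on the PATH; packaged
installs (RPMs/DEBs) use service scripts instead, so follow the guide for the
exact commands. The step list and the render_plan helper are made up for this
example.

```python
# Dry-run sketch: lists the HDFS upgrade steps in order, executes nothing.
# Assumption: tarball-style install with Hadoop's bin/ and sbin/ on PATH.
# Removing the old installation happens between steps 1 and 2 and depends
# on how it was installed, so it is not shown as a command here.

UPGRADE_STEPS = [
    ("stop the old HDFS (using the old install)", ["stop-dfs.sh"]),
    ("start the new NameNode with -upgrade",
     ["hadoop-daemon.sh", "start", "namenode", "-upgrade"]),
    ("start each new DataNode",
     ["hadoop-daemon.sh", "start", "datanode"]),
    ("verify that files are readable", ["hdfs", "fsck", "/"]),
    ("verify that DataNodes have reported in",
     ["hdfs", "dfsadmin", "-report"]),
    ("commit the upgrade permanently",
     ["hdfs", "dfsadmin", "-finalizeUpgrade"]),
]


def render_plan(steps=UPGRADE_STEPS):
    """Return one numbered line per step: description, then command."""
    return [
        "%d. %s: %s" % (i, desc, " ".join(cmd))
        for i, (desc, cmd) in enumerate(steps, start=1)
    ]


if __name__ == "__main__":
    print("\n".join(render_plan()))
```

Only run the last step once you are sure about the state of the cluster,
since finalizing commits the upgrade permanently.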

On the MR side, all your MR1 jobs will need to be recompiled before they can
be run on the newer YARN+MR2 cluster, due to some binary-incompatible
changes made between the versions you're upgrading across. Other than a
recompile, you will mostly not need to do anything else.

May we also know your reason for not using CM, when it's aimed at making all
this much easier to do and manage? We appreciate any form of feedback, thanks!

--
Harsh J