MapReduce >> mail # user >> Which version to choose


Re: Which version to choose
I am using Hadoop 0.20.2 for data analysis at my company. I did not upgrade
to Hadoop 0.21 because of the note at
http://hadoop.apache.org/common/releases.html#23+August%2C+2010%3A+release+0.21.0+available

On Wed, Dec 22, 2010 at 7:39 PM, Eric <[EMAIL PROTECTED]> wrote:

> This question may have been asked numerous times, and the answer will
> probably come down to the specific situation you are in, but I'm going to
> ask anyway:
>
> Which Hadoop version should I pick?
>
> I'm currently running Cloudera's CDH3 beta release, but I'm very tempted to
> install the latest Apache 0.21 version instead.
>
> Problems I encountered are:
> * Cloudera's distribution has bugs, like pid file directories that
> disappear after a reboot (because it's a memory disk).
> * I'm writing code against deprecated libraries :-( The new libraries are
> not yet complete in release 0.20.x.
>
> I'm not (yet) running a production cluster, but I'm planning on turning it
> into a production cluster in a few months. I do not feel comfortable writing
> code against deprecated libraries, but I also don't feel comfortable
> installing a Hadoop release that is not well tested and declared stable.
> I am only experimenting now, so chances are that 0.21 will stabilize over
> the coming months and be a stable release once I go into production.
>
> If I may ask, what are you running? I can imagine large companies are not
> running the latest version of Hadoop and/or HBase. Or am I wrong? Are you
> guys patching old releases or are you keeping up with new releases instead?
> Are there advantages to running Cloudera's packages instead of the Apache
> releases (besides that it is slightly easier to install)?
>
> Thank you in advance. All comments and suggestions are welcome!
> --
> Eric
>

--
Jingguo