MapReduce, mail # user - Re: Getting started recommendations


Re: Getting started recommendations
Olivier Renault 2013-01-11, 10:43
Hi,

Warning: I am a newbie myself. Please find my answers inline.

Good luck
Olivier
On 11 January 2013 10:29, John Lilley <[EMAIL PROTECTED]> wrote:

>  We are somewhat new to Hadoop and are looking to run some experiments
> with HDFS, Pig, and HBase.
>
> With that in mind, I have a few questions:
>
> What is the easiest (preferably free) Hadoop distro to get started with?
> Cloudera?
>
Cloudera is probably easy. I went with the Hortonworks distribution and
used their HMC (Hortonworks Management Console). It is a web UI that
installs the components you select on your behalf, and it also sets up
monitoring (Ganglia + Nagios). HMC is based on Ambari (an Apache
project). You can find installation instructions at:
http://hortonworks.com/hdp11-hmc-quick-start-guide/

>
> What host OS distro/release is recommended?
>
CentOS 6 / RHEL 6 seems to be a good choice.
>
> What is the easiest environment to get started with?  Amazon EC2?  Is
> there anyone offering virtual/hosted prebuilt Hadoop instances?
>
I've installed it on EC2. It worked like a charm.
>
> Where would we find some “big data” files that people have used for
> testing purposes?
>
As part of the documentation, there is a MapReduce tutorial. You can run
its wordcount example against any text files you have.
http://hadoop.apache.org/docs/r0.20.2/mapred_tutorial.html
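
To give a feel for what the wordcount example computes, here is a minimal
sketch in plain Python. It is not the Hadoop API and not the tutorial's Java
code; the function names (map_phase, shuffle, reduce_phase, word_count) are
made up for illustration, and everything runs in memory on one machine.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, roughly what happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the per-occurrence counts for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hadoop stores data in HDFS", "Pig and HBase run on Hadoop"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel across nodes and
the input comes from HDFS, but the shape of the computation is the same.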

>
> Feel free to RTFM me to the right place ;-)
>
> Thanks, john
>