

Re: Getting started recommendations
Hi,

Warning: I am a newbie myself. Please find my answers inline.

Good luck
Olivier
On 11 January 2013 10:29, John Lilley <[EMAIL PROTECTED]> wrote:

> We are somewhat new to Hadoop and are looking to run some experiments
> with HDFS, Pig, and HBase.
>
> With that in mind, I have a few questions:
>
> What is the easiest (preferably free) Hadoop distro to get started with?
> Cloudera?
>
Cloudera is probably easy, but I went with the Hortonworks distribution and
used their HMC (Hortonworks Management Console). It is a web UI that
installs the components you select on your behalf, along with monitoring
(Ganglia + Nagios). HMC is based on Ambari (an Apache project). You can
find installation instructions at:
http://hortonworks.com/hdp11-hmc-quick-start-guide/

>
> What host OS distro/release is recommended?
>
CentOS 6 / RHEL 6 seems to be a good choice.
>
> What is the easiest environment to get started with?  Amazon EC2?  Is
> there anyone offering virtual/hosted prebuilt Hadoop instances?
>
I've installed it on EC2. It worked like a charm.
>
> Where would we find some “big data” files that people have used for
> testing purposes?
>
The documentation includes a MapReduce tutorial. You can run its WordCount
example on any text files you have, so no special data set is needed to start.
http://hadoop.apache.org/docs/r0.20.2/mapred_tutorial.html
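
If it helps to see the idea before setting up a cluster, here is a minimal
local sketch of what a word count job does: map each line to (word, 1)
pairs, group the pairs by word, then sum each group. This is plain Python,
not Hadoop code. The function names (map_phase, shuffle, reduce_phase,
word_count) are my own for illustration, and splitting on whitespace is an
assumption about how you want to tokenize.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for word in line.split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in-process over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the framework handles the grouping and runs the map and
reduce steps across machines; the sketch only shows the data flow.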

>
> Feel free to RTFM me to the right place ;-)
>
> Thanks, john