MapReduce, mail # user - Re: Getting started recommendations


Re: Getting started recommendations
Nitin Pawar 2013-01-11, 11:30
http://my.safaribooksonline.com/book/databases/hadoop/9780596521974

I loved this book. It explains things very clearly.
On Fri, Jan 11, 2013 at 3:22 AM, Michael Forage <
[EMAIL PROTECTED]> wrote:

> I am still new but had similar questions and went through a lot of pain
> getting started.
>
> If you want to get programming rather than spend time learning how to
> install, configure and administer the Hadoop tools, I recommend using Amazon
> Elastic MapReduce.
>
> This will very quickly get you to a stage where you are able to submit and
> run MapReduce jobs (and Pig, Hive, etc.).
>
> It’s a very cheap option for learning the platform, especially if you use
> the Ruby command-line tool, which allows you to re-use your Hadoop instances
> for multiple jobs rather than the more expensive default of starting and
> stopping new clusters each time. It has some pretty decent tutorials,
> although (as with everything Hadoop, it seems) the area is so large that
> inevitably you’ll be googling some things or asking questions here.
>
> Also, I found the book “Hadoop in Action” very readable and informative,
> even as someone who has only sporadically used Java throughout my career.
> It takes you through different use cases based on test data
> downloadable from the web. The only issue is that it’s written based on the
> older (though fully supported Hadoop 0.20) API, and since it’s written for
> someone with a local Hadoop cluster, it takes a small effort to translate to
> the Amazon EMR way of doing things. Still very useful, though.
>
> Cheers
>
> Mike
>
> *From:* John Lilley [mailto:[EMAIL PROTECTED]]
> *Sent:* 11 January 2013 10:29
> *To:* [EMAIL PROTECTED]
> *Subject:* Getting started recommendations
>
> We are somewhat new to Hadoop and are looking to run some experiments with
> HDFS, Pig, and HBase.
>
> With that in mind, I have a few questions:
>
> What is the easiest (preferably free) Hadoop distro to get started with?
> Cloudera?
>
> What host OS distro/release is recommended?
>
> What is the easiest environment to get started with?  Amazon EC2?  Is
> there anyone offering virtual/hosted prebuilt Hadoop instances?
>
> Where would we find some “big data” files that people have used for
> testing purposes?
>
> Feel free to RTFM me to the right place ;-)
>
> Thanks, john

--
Nitin Pawar