Hi Pravin,
Studying Hadoop or MapReduce can look like a daunting task if you get your hands dirty right at the start.
One prerequisite for learning Hadoop is solid experience in Java. Good analytical skills help a lot as well, and the final secret sauce for success is motivation: you need to be willing to teach yourself a lot of things in the big data arena.
I followed this schedule:
Start with the very basics of MR with
http://code.google.com/edu/parallel/dsd-tutorial.html
http://code.google.com/edu/parallel/mapreduce-tutorial.html
Then go for the first two lectures in
http://www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm
A very good course introduction to MapReduce and Hadoop.
Read the seminal paper
http://labs.google.com/papers/mapreduce.html
and its improvements in the updated version
http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf
Then go through all the other videos at the U. Washington link given above (for more detail on distributed systems).
Try searching YouTube for the terms MapReduce and Hadoop to find videos by O'Reilly and the Google RoundTable for a good overview of the future of Hadoop and MapReduce.
Then on to the most important videos:
Cloudera Videos
http://www.cloudera.com/resources/?media=Video
and
Google MiniLecture Series
http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html
Along with all the multimedia above, we need good written material.
Documents:
Architecture diagrams at
http://hadooper.blogspot.com are good to have on your wall
Hadoop: The Definitive Guide goes more into the nuts and bolts of the whole system, whereas
Hadoop in Action is a good read with lots of teaching examples for learning the concepts of Hadoop.
Pro Hadoop is good for more advanced topics such as chaining and Spring-Hadoop.
PDFs of the documentation from the Apache Foundation
http://hadoop.apache.org/common/docs/current/
and
http://hadoop.apache.org/common/docs/stable/
will help you learn how to model your problem as an MR solution in order to gain the full advantages of Hadoop.
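To give a feel for what "modelling a problem as MR" means, here is a minimal word-count sketch. It is plain Python running in memory, not the Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. On a real cluster the framework does the shuffle step for you and you would write only the map and reduce parts (in Java, typically).

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework's job on a cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

The point of the exercise is that map_phase and reduce_phase each look at one record or one key at a time, which is what lets a framework spread them over many machines.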
The HDFS paper by Yahoo! Research is also a good read for gaining in-depth knowledge of Hadoop. It is listed at ACM at
http://dl.acm.org/citation.cfm?id=1914427
and available as a PDF at
http://storageconference.org/2010/Papers/MSST/Shvachko.pdf
Try the
http://developer.yahoo.com/hadoop/tutorial/module1.html
link for a beginner-to-expert path to Hadoop (warning: it uses Hadoop version 0.19).
Important for setting up Hadoop: here is a more recent tutorial:
http://orzota.com/blog/single-node-hadoop-setup-2/
And here is one on configuring Eclipse for Hadoop development:
http://orzota.com/blog/eclipse-setup-for-hadoop-development/
For any queries ...
Contact Apache, Google, Bing, Yahoo!
Thanks,
Varad
On 23-Aug-2012, at 10:19 PM, [EMAIL PROTECTED] wrote:
> Then perhaps try downloading the Cloudera tarballs and running some jobs in pseudo-distributed mode on your local Linux machine.
> Using Amazon EC2 machines to configure a small cluster would also be a nice experiment.
>
> Emmanuel
>
> 2012/8/23 Mohit Anchlia <[EMAIL PROTECTED]>
> Start by reading the MapReduce paper and then look at a Hadoop book.
>
> On Thu, Aug 23, 2012 at 9:19 AM, Pravin Sinha <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am new to Hadoop. What would be the best way to learn Hadoop and the ecosystem around it?
>
> Thanks,
> Pravin
>
>
>
>
>
> --
> Emmanuel de Castro Santana