Hadoop >> mail # user >> hadoop knowledge gaining


Re: hadoop knowledge gaining
Jignesh
        Don't worry about the old API at this point. Start with the samples, and try understanding and implementing them. Once you have a firm grip on MapReduce, porting your code from the old API to the new one should take only a few minutes. If you are just starting with MapReduce, I'd suggest the following steps:
-Understand Hadoop, HDFS and MapReduce.
-Understand the word count example code you already ran.
-Understand every parameter set in the driver class and in the Map and Reduce classes; get a grip on the parameters of the map and reduce methods and why they are there.
-Understand the anatomy and internal workflow of an end-to-end MapReduce job.
-Look at the input and output formats other than the default TextInputFormat, and where each is to be used.
-Look at Writables and their usage.
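
To make the end-to-end flow above a bit more concrete, here is a minimal plain-Java sketch of map -> shuffle/sort -> reduce for word count. It deliberately does not use the Hadoop API, and the class and method names (WordCountFlow, map, shuffle, reduce, run) are made up for illustration. A real job spreads these phases across many machines, but the data flow has roughly this shape.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map -> shuffle/sort -> reduce flow for word count.
// This is NOT Hadoop API code; every name here is hypothetical.
class WordCountFlow {

    // Map phase: one input line in, zero or more (word, 1) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(token.toLowerCase(), 1));
            }
        }
        return out;
    }

    // Shuffle/sort phase: group all values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: one key plus all of its values in, one total out.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // "Driver": wires the three phases together for a list of input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {dog=1, fox=1, lazy=1, quick=1, the=2}
        System.out.println(run(Arrays.asList("the quick fox", "the lazy dog")));
    }
}
```

Once this shape is clear, the Mapper, Reducer and driver classes in the word count sample should be easier to read, since each one corresponds to one of the steps here.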

With this you should be good enough to code basic MapReduce. Get used to the common exceptions and their workarounds. After you get comfortable with this, try other advanced concepts like
-Distributed Cache
-Joins
-Counters
-Reporter
-cluster admin commands
-fine-tuning MapReduce jobs, etc.
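
As a taste of one item on this list, a reduce-side join can be sketched in plain Java as below. This is only an illustration under assumptions, not Hadoop API code: the record layouts ("id,name" and "id,amount"), the "U:" and "O:" tags, and all class and method names are made up. The idea is that the map step tags each record with its source, the shuffle groups records by join key, and the reduce step pairs up the two sides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of a reduce-side (inner) join; NOT Hadoop API code.
// Record layouts and names are hypothetical; input lines are assumed well formed.
class ReduceSideJoin {

    // Map step: split "key,rest" and tag the value with its source ("U" or "O").
    static String[] mapRecord(String tag, String csvLine) {
        String[] parts = csvLine.split(",", 2);
        return new String[] { parts[0], tag + ":" + parts[1] };
    }

    // Reduce step: for one key, pair every user value with every order value.
    static List<String> reduce(String key, List<String> tagged) {
        List<String> users = new ArrayList<>();
        List<String> orders = new ArrayList<>();
        for (String v : tagged) {
            if (v.startsWith("U:")) {
                users.add(v.substring(2));
            } else if (v.startsWith("O:")) {
                orders.add(v.substring(2));
            }
        }
        List<String> joined = new ArrayList<>();
        for (String u : users) {
            for (String o : orders) {
                joined.add(key + "," + u + "," + o);
            }
        }
        return joined;
    }

    // Wires map, shuffle (group by key) and reduce together.
    static List<String> join(List<String> userLines, List<String> orderLines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : userLines) {
            String[] kv = mapRecord("U", line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String line : orderLines) {
            String[] kv = mapRecord("O", line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("1,alice", "2,bob");
        List<String> orders = Arrays.asList("1,30", "1,12", "3,99");
        // prints [1,alice,30, 1,alice,12]
        System.out.println(join(users, orders));
    }
}
```

Keys that appear on only one side (user 2 and order 3 here) produce no output, which is what makes this an inner join.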

You will learn these once you start doing real-world MapReduce programming.

If you are planning to use HBase with MapReduce, it is better to understand MapReduce first and then move on to HBase.

By the way, Hadoop: The Definitive Guide by Tom White is also an awesome book.
Also, to play around, you'll find some simple samples at
http://kickstarthadoop.blogspot.com

Hope it helps!

------Original Message------
From: Jignesh Patel
To: [EMAIL PROTECTED]
ReplyTo: [EMAIL PROTECTED]
Subject: hadoop knowledge gaining
Sent: Oct 7, 2011 19:55

Guys,
I am able to deploy the first program, word count, using Hadoop. I am interested in exploring more about Hadoop and HBase, and don't know the best way to grasp both of them.

I have Hadoop in Action, but it uses the older API. I also have HBase: The Definitive Guide, which I have not started exploring.

-Jignesh
Regards
Bejoy K S