MapReduce, mail # user - RE: MRBench Maps strange behaviour


Re: MRBench Maps strange behaviour
Gaurav Dasgupta 2012-08-29, 07:44
Hi Hemanth,

Thanks for the reply.
Can you tell me how I can calculate or verify from the counters what the
exact number of Maps should be?
Thanks,
Gaurav Dasgupta
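
One way to reason about the question above is to estimate the number of input splits, since each split becomes one map task. The sketch below is only an approximation modeled on the split logic of the old-API `FileInputFormat` as it is commonly described for Hadoop 0.20: the requested map count is turned into a goal size, the goal size is clamped between a minimum split size and the block size, and a trailing remainder gets a split of its own. The function name `estimate_splits`, the 64 MB default block size, the single-file simplification, and the file sizes in the example calls are all assumptions for illustration, not values taken from this thread.

```python
# Rough estimate of how many input splits (and therefore map tasks) a single
# input file yields. Assumption: this mirrors the old-API FileInputFormat
# behaviour; it is a sketch, not the Hadoop implementation itself.

SPLIT_SLOP = 1.1  # assumed tolerance: the last split may be up to 10% larger


def estimate_splits(file_size, requested_maps,
                    block_size=64 * 1024 * 1024, min_split_size=1):
    """Return the estimated number of splits for one file of file_size bytes."""
    # The requested number of maps is only used to derive a goal size.
    goal_size = file_size // max(requested_maps, 1)
    # Clamp the goal size between the minimum split size and the block size.
    split_size = max(min_split_size, min(goal_size, block_size))

    splits = 0
    remaining = file_size
    # Carve off full splits while clearly more than one split's worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left over becomes one final split.
    if remaining > 0:
        splits += 1
    return splits


# Hypothetical file sizes, chosen only to show the effect of the hint.
print(estimate_splits(1000, 200))  # divides evenly
print(estimate_splits(1005, 200))  # integer division leaves a remainder
print(estimate_splits(2, 10))      # tiny file, goal size rounds down to 0
```

Under these assumptions a hint of 200 can produce one extra map when the file size is not a multiple of the goal size, and a hint of 10 on a very small file can produce far fewer maps than requested. The `Launched map tasks` counter remains the authoritative figure.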
On Wed, Aug 29, 2012 at 11:26 AM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:

> Hi,
>
> The number of maps specified to any map reduce program (including
> those that are part of MRBench) is generally only a hint, and the actual number
> of maps will be influenced in typical cases by the amount of data
> being processed. You can take a look at this wiki link to understand
> more: http://wiki.apache.org/hadoop/HowManyMapsAndReduces
>
> In the examples below, since the data you've generated is different,
> the number of mappers is different. To be able to judge your
> benchmark results, you'd need to benchmark against the same data (or
> at least the same kind of data, i.e. size and type).
>
> The number of maps printed at the end is straight from the input
> specified and doesn't reflect what the job actually ran with. The
> information from the counters is the right one.
>
> Thanks
> Hemanth
>
> On Tue, Aug 28, 2012 at 4:02 PM, Gaurav Dasgupta <[EMAIL PROTECTED]>
> wrote:
> > Hi All,
> >
> > I executed the "MRBench" program from "hadoop-test.jar" in my 12 node CDH3
> > cluster. After executing, I had some strange observations regarding the
> > number of Maps it ran.
> >
> > First I ran the command:
> > hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar mrbench -numRuns 3 -maps 200 -reduces 200 -inputLines 1024 -inputType random
> > And I could see that the actual number of Maps it ran was 201 (for all 3
> > runs) instead of 200 (though the end report displays the number launched
> > as 200). Here is the console report:
> >
> > 12/08/28 04:34:35 INFO mapred.JobClient: Job complete: job_201208230144_0035
> > 12/08/28 04:34:35 INFO mapred.JobClient: Counters: 28
> > 12/08/28 04:34:35 INFO mapred.JobClient:   Job Counters
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Launched reduce tasks=200
> > 12/08/28 04:34:35 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=617209
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Rack-local map tasks=137
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Launched map tasks=201
> > 12/08/28 04:34:35 INFO mapred.JobClient:     Data-local map tasks=64
> > 12/08/28 04:34:35 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=1756882
> >
> > Again, I ran the MRBench for just 10 Maps and 10 Reduces:
> >
> > hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar mrbench -maps 10 -reduces 10
> >
> >
> >
> > This time the actual number of Maps was only 2, and again the end report
> > displays Maps Launched as 10. The console output:
> >
> >
> >
> > 12/08/28 05:05:35 INFO mapred.JobClient: Job complete: job_201208230144_0040
> > 12/08/28 05:05:35 INFO mapred.JobClient: Counters: 27
> > 12/08/28 05:05:35 INFO mapred.JobClient:   Job Counters
> > 12/08/28 05:05:35 INFO mapred.JobClient:     Launched reduce tasks=20
> > 12/08/28 05:05:35 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6648
> > 12/08/28 05:05:35 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> > 12/08/28 05:05:35 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> > 12/08/28 05:05:35 INFO mapred.JobClient:     Launched map tasks=2
> > 12/08/28 05:05:35 INFO mapred.JobClient:     Data-local map tasks=2
> > 12/08/28 05:05:35 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=163257
> > 12/08/28 05:05:35 INFO mapred.JobClient:   FileSystemCounters
> > 12/08/28 05:05:35 INFO mapred.JobClient:     FILE_BYTES_READ=407
> > 12/08/28 05:05:35 INFO mapred.JobClient:     HDFS_BYTES_READ=258
> > 12/08/28 05:05:35 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1072596
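
Since the counters in the console output above are the reliable source for the number of maps that actually ran, it can help to pull them out of a saved log rather than read them by eye. The following is a minimal sketch that assumes each counter sits on one unwrapped line of the form `... INFO mapred.JobClient:     Name=value`, as in the output above. The helper name `parse_counters` and the regular expression are illustrative, not part of Hadoop.

```python
import re

# Matches "... INFO mapred.JobClient:   <counter name>=<integer>" at end of line.
# Assumption: one counter per line, no mail-client line wrapping.
COUNTER_RE = re.compile(
    r"INFO mapred\.JobClient:\s+(?P<name>[^=]+?)=(?P<value>\d+)\s*$"
)


def parse_counters(log_text):
    """Return a dict mapping counter names to integer values."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters


# Lines copied from the second run's console output in this thread.
SAMPLE = """\
12/08/28 05:05:35 INFO mapred.JobClient: Job complete: job_201208230144_0040
12/08/28 05:05:35 INFO mapred.JobClient: Counters: 27
12/08/28 05:05:35 INFO mapred.JobClient:   Job Counters
12/08/28 05:05:35 INFO mapred.JobClient:     Launched reduce tasks=20
12/08/28 05:05:35 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6648
12/08/28 05:05:35 INFO mapred.JobClient:     Launched map tasks=2
12/08/28 05:05:35 INFO mapred.JobClient:     Data-local map tasks=2
"""

counters = parse_counters(SAMPLE)
print(counters["Launched map tasks"])
```

Comparing `Launched map tasks` across runs, rather than the `-maps` value echoed in the MRBench end report, gives the number of maps each job actually ran with.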