Marcelo Elias Del Valle 2013-01-28, 15:54
Harsh J 2013-01-28, 16:02
First of all, thanks for the answer!
2013/1/28 Harsh J <[EMAIL PROTECTED]>
> So depending on your implementation of the job here, you may or may
> not see it act in effect. Hope this helps.
Is there anything I can do in my job, my code, or my InputFormat so that
Hadoop would choose to run more mappers? My text file has 10 million lines
and each mapper task processes 1 line at a time, very quickly. I would like
to have 40 threads or even more processing those lines in parallel.
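
One option often suggested for line-oriented inputs like this is Hadoop's
NLineInputFormat, which builds each input split from a fixed number of lines
instead of from a byte range. Whether it suits this job is an assumption. The
sketch below only does the arithmetic for choosing the lines-per-split value
from the figures in this message (10 million lines, 40 parallel mappers). The
function names are hypothetical helpers, not part of any Hadoop API.

```python
import math


def lines_per_split(total_lines: int, desired_map_tasks: int) -> int:
    """Lines each split should hold to get roughly desired_map_tasks splits."""
    if total_lines <= 0 or desired_map_tasks <= 0:
        raise ValueError("both arguments must be positive")
    return math.ceil(total_lines / desired_map_tasks)


def split_count(total_lines: int, per_split: int) -> int:
    """Number of splits produced when every split holds per_split lines."""
    if total_lines <= 0 or per_split <= 0:
        raise ValueError("both arguments must be positive")
    return math.ceil(total_lines / per_split)


if __name__ == "__main__":
    TOTAL_LINES = 10_000_000   # figure quoted in the message
    DESIRED_MAP_TASKS = 40     # figure quoted in the message

    per_split = lines_per_split(TOTAL_LINES, DESIRED_MAP_TASKS)
    print("lines per split:", per_split)
    print("resulting splits:", split_count(TOTAL_LINES, per_split))
```

Each split becomes one map task, so the second number is the map-task count
the job would request. How many of those run at the same time still depends
on the map slots available in the cluster.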
> > When I run my job with just 1 instance, I see it only creates 1
> > When I run my job with 5 instances (1 master and 4 cores), I can see
> > only 2 mapper slots are used and 6 stay open.
> Perhaps the job itself launched with 2 total map tasks? You can check
> this on the JobTracker UI or whatever EMR offers as a job viewer.
I am trying to figure this out. Here is what I have from EMR:
I will ask their support to help me understand this, but I didn't understand
what you said about the job being launched with 2 total map tasks... if I
have 8 slots, shouldn't all of them always be filled?
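
For context on the point being quoted: Hadoop creates one map task per input
split, and a slot is only occupied while a task is running in it, so a job
with 2 map tasks can keep at most 2 slots busy however many are free. The
sketch below illustrates that relationship with a simplified size-based
estimate. The file size and split size are hypothetical values chosen for
illustration, not numbers from this job, and the helper names are made up.
The real split calculation has extra details (for example, a small tolerance
on the last split) that are ignored here.

```python
import math

MB = 1024 * 1024


def estimated_map_tasks(file_size_bytes: int, split_size_bytes: int) -> int:
    """Simplified estimate: one map task per split_size_bytes chunk of input."""
    if file_size_bytes <= 0 or split_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(file_size_bytes / split_size_bytes)


def busy_slots(map_tasks: int, total_slots: int) -> int:
    """Slots that can be occupied at once: never more than the task count."""
    return min(map_tasks, total_slots)


if __name__ == "__main__":
    file_size = 100 * MB    # hypothetical input size
    split_size = 64 * MB    # hypothetical split size
    total_slots = 8         # slot count mentioned in the message

    tasks = estimated_map_tasks(file_size, split_size)
    used = busy_slots(tasks, total_slots)
    print("map tasks:", tasks)
    print("busy slots:", used, "idle slots:", total_slots - used)
```

Under these assumed sizes the job has fewer map tasks than slots, so most
slots stay idle. The actual task count for a job is the one reported by the
JobTracker UI.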
> This is a typical waiting reduce task log, what are you asking here
I have no reduce tasks. My map does the job without putting anything in the
output. Is it happening because reduce tasks receive nothing as input?
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
Harsh J 2013-01-28, 16:41
Marcelo Elias Del Valle 2013-01-28, 16:55
Marcelo Elias Del Valle 2013-01-28, 20:56
Vinod Kumar Vavilapalli 2013-01-29, 02:08
Marcelo Elias Del Valle 2013-01-29, 10:52
Marcelo Elias Del Valle 2013-01-29, 12:53