Hadoop, mail # user - can't disable speculative execution?


Re: can't disable speculative execution?
Harsh J 2012-07-12, 05:14
Try passing mapred.map.tasks = 0 or setting a higher min-split size?
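
A minimal sketch of passing those from a Pig script, assuming the MR1
property names (mapred.min.split.size is taken to be the "min-split size"
knob here, and 1 is used instead of 0 as the more usual single-mapper hint):

SET mapred.map.tasks 1;                -- only a hint; the InputFormat may override it
SET mapred.min.split.size 1073741824;  -- 1 GB minimum split, so a tiny file yields one split
A = LOAD 'myinput.txt';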

On Thu, Jul 12, 2012 at 10:36 AM, Yang <[EMAIL PROTECTED]> wrote:
> Thanks Harsh
>
> I see
>
> then there seem to be some small problems with the Splitter / InputFormat.
>
> I'm just reading a 1-line text file through Pig:
>
> A = LOAD 'myinput.txt' ;
>
> Supposedly it should generate at most 1 mapper.
>
> But in reality, it seems that Pig generated 3 mappers, and basically fed
> empty input to 2 of them.
>
>
> Thanks
> Yang
>
> On Wed, Jul 11, 2012 at 10:00 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
>> Yang,
>>
>> No, those three are individual task attempts.
>>
>> This is how you may generally dissect an attempt ID when reading it:
>>
>> attempt_201207111710_0024_m_000000_0
>>
>> 1. "attempt" - indicates its an attempt ID you'll be reading
>> 2. "201207111710" - The job tracker timestamp ID, indicating which
>> instance of JT ran this job
>> 3. "0024" - The Job ID for which this was a task attempt
>> 4. "m" - Indicating this is a mapper (reducers are "r")
>> 5. "000000" - The task ID of the mapper (00000 is the first mapper,
>> 00001 is the second, etc.)
>> 6. "0" - The attempt # for the task ID. 0 means it is the first
>> attempt, 1 indicates the second attempt, etc.
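
Purely as an illustration of that layout, a Pig sketch (the input file
attempt_ids.txt is hypothetical, one attempt ID per line) that splits an
attempt ID into those parts:

ids   = LOAD 'attempt_ids.txt' AS (id:chararray);
parts = FOREACH ids GENERATE FLATTEN(
          REGEX_EXTRACT_ALL(id, 'attempt_(\\d+)_(\\d+)_([mr])_(\\d+)_(\\d+)'));
-- fields: JT timestamp, job ID, m/r flag, task ID, attempt number
DUMP parts;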
>>
>> On Thu, Jul 12, 2012 at 9:16 AM, Yang <[EMAIL PROTECTED]> wrote:
>> > I set the following params to be false in my pig script (0.10.0)
>> >
>> > SET mapred.map.tasks.speculative.execution false;
>> > SET mapred.reduce.tasks.speculative.execution false;
>> >
>> >
>> > I also verified in the job.xml via the JobTracker UI that they are
>> > indeed set correctly.
>> >
>> > When the job finished, the JobTracker UI showed that there was only one
>> > attempt for each task (in fact I have only 1 task too).
>> >
>> > But when I went to the tasktracker node and looked under the
>> > /var/log/hadoop/userlogs/job_id_here/
>> > dir, there were 3 attempt dirs:
>> > job_201207111710_0024 # ls
>> > attempt_201207111710_0024_m_000000_0  attempt_201207111710_0024_m_000001_0
>> > attempt_201207111710_0024_m_000002_0  job-acls.xml
>> >
>> > So 3 attempts were indeed fired?
>> >
>> > I have to get this controlled correctly because I'm trying to debug the
>> > mappers through Eclipse, but if more than 1 mapper process is fired,
>> > they all try to connect to the same debugger port, and the end result
>> > is that nobody is able to attach to the debugger (see the sketch after
>> > this thread).
>> >
>> >
>> > Thanks
>> > Yang
>>
>>
>>
>> --
>> Harsh J
>>

--
Harsh J
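
For the Eclipse debugging described above, a minimal sketch (the JDWP flags
and port are assumptions, not from the thread): disable speculative
execution, keep a small input to a single mapper, and make the child JVM
wait for the debugger:

SET mapred.map.tasks.speculative.execution false;
SET mapred.reduce.tasks.speculative.execution false;
SET mapred.min.split.size 1073741824;  -- keep a tiny input to one map task
-- assumed JDWP flags: suspend=y blocks the lone mapper until Eclipse attaches on port 8000
SET mapred.child.java.opts '-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000';

With more than one task JVM on a node, each would try to bind the same
port, which is exactly the collision described in the thread.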