Hadoop >> mail # user >> conf.setMaxMapAttempts, SkipBadRecords, etc.


Re: conf.setMaxMapAttempts, SkipBadRecords, etc.
Agreed, and thank you for the clarification.  It almost sounds like this is a bug on the cluster side rather than something I have done wrong in my own code or config files.
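
To make the distinction between the two features concrete, here is a minimal sketch in plain Java. It is not Hadoop code and calls no Hadoop API: `AttemptSketch`, `runTask`, `map`, and the record values are hypothetical names made up for illustration. The skipping logic is a deliberate simplification (it skips exactly the one record that failed, which is an assumption and not a description of how Hadoop locates bad records). It models only the two ideas discussed in this thread: a cap on attempts, and rerunning a task while skipping the record on which the previous attempt failed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: hypothetical names, no Hadoop classes involved.
class AttemptSketch {

    /** Outcome of one simulated task. */
    static final class Result {
        final boolean succeeded;
        final int attemptsUsed;
        final List<String> processed;

        Result(boolean succeeded, int attemptsUsed, List<String> processed) {
            this.succeeded = succeeded;
            this.attemptsUsed = attemptsUsed;
            this.processed = processed;
        }
    }

    /** Stand-in for a map function with a deterministic, data-dependent failure. */
    static void map(String record) {
        if (record.startsWith("BAD")) {
            throw new IllegalStateException("bad record: " + record);
        }
    }

    /**
     * Runs the whole task up to maxAttempts times. When skipping is enabled,
     * the record that made an attempt fail is left out of later attempts.
     */
    static Result runTask(List<String> records, int maxAttempts, boolean skipping) {
        Set<Integer> skipped = new HashSet<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<String> processed = new ArrayList<>();
            int i = 0;
            try {
                for (; i < records.size(); i++) {
                    if (skipped.contains(i)) {
                        continue; // record dropped because an earlier attempt failed on it
                    }
                    map(records.get(i));
                    processed.add(records.get(i));
                }
                return new Result(true, attempt, processed);
            } catch (IllegalStateException e) {
                // i still points at the record that failed
                if (skipping) {
                    skipped.add(i);
                }
            }
        }
        return new Result(false, maxAttempts, new ArrayList<>());
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "BAD", "b");
        int[] caps = {4, 1};
        boolean[] modes = {false, true};
        for (int cap : caps) {
            for (boolean skipping : modes) {
                Result r = runTask(records, cap, skipping);
                System.out.println("maxAttempts=" + cap + " skipping=" + skipping
                        + " succeeded=" + r.succeeded
                        + " attemptsUsed=" + r.attemptsUsed
                        + " processed=" + r.processed);
            }
        }
    }
}
```

In this toy model, a cap of 1 fails the task after a single try, the default of 4 repeats the same deterministic failure four times, and skipping only helps when at least one further attempt is allowed, because the skip takes effect on a rerun.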

On Dec 25, 2010, at 4:55 AM, Sharad Agarwal wrote:

> From the description, it looks like you are unable to set the max map
> attempts to 1. This is completely different from the skip bad records
> feature. The skip bad records feature lets you rerun the same task while
> SKIPPING the records on which the last attempt failed.
>
> If you are fine with some input records going unprocessed for a failing
> mapper, then you don't need the skip records feature. You just need to
> investigate why setMaxMapAttempts doesn't work for you.
>
>
> On Fri, Dec 24, 2010 at 3:02 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
>
>> Let's say I want to ditch an input record the very first time it fails
>> (because I know it is a deterministic data-dependent failure) instead of
>> retrying it the default four times.  I have already experimented with
>> conf.setMaxMapAttempts() with no success.  For example, consider the
>> following:
>>
>> int attemptsBefore = conf.getMaxMapAttempts();  // 4, the default
>> conf.setMaxMapAttempts(1);
>> int attemptsAfter = conf.getMaxMapAttempts();   // 1
>>
>> Before calling conf.setMaxMapAttempts(1), getMaxMapAttempts() returns the
>> default, 4, and after calling conf.setMaxMapAttempts(1), it returns 1.
>> However, despite that encouraging feedback, it doesn't work.  The Hadoop
>> job still restarts each failed map task four times.  Furthermore, I have
>> confirmed that the job.xml file on the job tracker has the following:
>>
>> mapred.map.max.attempts = 4
>>
>> ...which proves it really didn't change mapred.map.max.attempts!  I also
>> added the following to my mapred-site.xml file:
>>
>> <property>
>>   <name>mapred.map.max.attempts</name>
>>   <value>1</value>
>>   <final>true</final>
>>   <description>Max map attempts.
>>   </description>
>> </property>
>>
>> When I do that, the initial call conf.getMaxMapAttempts() returns 1, not 4,
>> just as expected...but nonetheless, the job.xml file on the job tracker
>> reports that the value has reverted to 4 once again.  I have sought a
>> solution to this problem for a long time and have decided that no one knows
>> how to fix it (if you have any ideas PLEASE let me know), so I'm moving on
>> to a different approach.  I am now trying the following:
>>
>> SkipBadRecords.setMapperMaxSkipRecords(conf, 1);
>> SkipBadRecords.setAttemptsToStartSkipping(conf, 1);
>>
>> First, can anyone confirm that this is the correct set of calls to make
>> SkipBadRecords skip a record after its first failure?
>>
>> Second, this doesn't work either!  My map tasks still restart four times.
>>
>> I'm really desperate on this and so far my research has turned up nothing.
>> I would greatly appreciate any help on this matter.
>>
>> Thank you.
>>
>>
>> ________________________________________________________________________________
>> Keith Wiley               [EMAIL PROTECTED]
>> www.keithwiley.com
>>
>> "Yet mark his perfect self-contentment, and hence learn his lesson, that to be
>> self-contented is to be vile and ignorant, and that to aspire is better than
>> to be blindly and impotently happy."
>> -- Edwin A. Abbott, Flatland
>>
>> ________________________________________________________________________________
>>
>>
>>
>>
________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
                                           --  Galileo Galilei
________________________________________________________________________________