Pig, mail # user - failed to report status for 601 seconds


Re: failed to report status for 601 seconds
Corbin Hoenes 2010-05-13, 23:07
Zaki,

Can Pig take command-line options like this to set job conf properties?

pig -Dmapred.task.timeout=0

On May 13, 2010, at 4:18 PM, zaki rahaman wrote:

> Hi Corbin,
>
> The timeout error you're seeing could also indicate that your reducer is
> trying to process a very large key/group which may be the reason for the
> timeout in the first place. At least this is a behavior I've seen in the
> past.
>
> On Thu, May 13, 2010 at 6:10 PM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>
>> Okay, so what is the Pig way to do this?
>>
>> I noticed a lot of chatter about UDFs in Pig that don't call progress, which
>> can cause your jobs to get killed. I am using only builtin UDFs like COUNT
>> and FLATTEN. Do they suffer from the same issue (no progress calls)?
>>
>> On May 12, 2010, at 2:56 AM, Andrey Stepachev wrote:
>>
>>> You should report progress at intervals shorter than the configured
>>> timeout (in your case 600 sec).
>>> Add code like the below to your reducer and call ping() wherever you
>>> process tuples.
>>>
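>>>       // Task context used to signal liveness to the framework.
>>>       final TaskAttemptContext context = <init in constructor>;
>>>       // Time of the last progress report, in milliseconds.
>>>       long lastTime = System.currentTimeMillis();
>>>
>>>       // Call once per tuple; reports progress at most every 10 seconds.
>>>       public void ping() {
>>>           final long currtime = System.currentTimeMillis();
>>>           if (currtime - lastTime > 10000) {
>>>               context.progress();
>>>               lastTime = currtime;
>>>           }
>>>       }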
>>>
>>>
>>> 2010/5/11 Corbin Hoenes <[EMAIL PROTECTED]>
>>>
>>>> Not sure I am clear on how I can debug stuff on a cluster.  I currently
>>>> have a long-running reducer that attempts to run 4 times before finally
>>>> giving up.
>>>>
>>>> I get 4 of these: Task attempt_201005101345_0052_r_000012_0 failed to
>>>> report status for 601 seconds. Killing!
>>>>
>>>> before it gives up...on the last try I noticed this in the log:
>>>> ERROR: org.apache.hadoop.hdfs.DFSClient - Exception closing file
>>>> /tmp/temp1925356068/tmp1003826561/_temporary/_attempt_201005101345_0052_r_000012_4/abs/tmp/temp1925356068/tmp-197182389/part-00012
>>>> : org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not
>>>> complete write to file
>>>> /tmp/temp1925356068/tmp1003826561/_temporary/_attempt_201005101345_0052_r_000012_4/abs/tmp/temp1925356068/tmp-197182389/part-00012
>>>> by DFSClient_attempt_201005101345_0052_r_000012_4
>>>>      at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:497)
>>>>      at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
>>>>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>      at java.lang.reflect.Method.invoke(Method.java:597)
>>>>      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>>>      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
>>>>      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
>>>>      at java.security.AccessController.doPrivileged(Native Method)
>>>>      at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
>>>>
>>>> How do I turn on log4j's DEBUG statements?  Hoping those will help me
>>>> pinpoint what is going on here--maybe it's the cluster or maybe the
>>>> script.
>>>>
>>>>
>>>>
>>
>>
>
>
> --
> Zaki Rahaman
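
For reference, the throttled ping() idea quoted above can be tried outside a cluster. The following is only a minimal sketch under stated assumptions: the class name ProgressThrottle is hypothetical, and a plain Runnable stands in for the call to context.progress(), so no Hadoop classes are needed. The clock value is passed in explicitly so the behavior can be checked without waiting.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: forwards at most one progress call per interval. */
public class ProgressThrottle {
    private final Runnable progress;    // stand-in for context::progress (assumption)
    private final long intervalMillis;  // minimum gap between reports
    private long lastTime;              // time of the last report

    public ProgressThrottle(Runnable progress, long intervalMillis, long startMillis) {
        this.progress = progress;
        this.intervalMillis = intervalMillis;
        this.lastTime = startMillis;
    }

    /** Call once per tuple; returns true only when a report was actually sent. */
    public boolean ping(long nowMillis) {
        if (nowMillis - lastTime > intervalMillis) {
            progress.run();
            lastTime = nowMillis;
            return true;
        }
        return false;
    }

    /** Convenience overload that uses the wall clock. */
    public boolean ping() {
        return ping(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        ProgressThrottle throttle =
                new ProgressThrottle(calls::incrementAndGet, 10_000, 0);

        // Simulated timestamps (ms) at which tuples are processed.
        long[] times = {0, 5_000, 10_001, 15_000, 20_002};
        for (long t : times) {
            throttle.ping(t);
        }
        System.out.println("progress calls: " + calls.get()); // prints "progress calls: 2"
    }
}
```

In a real reducer the Runnable would presumably wrap the task context's progress call, and the interval needs to stay well below the configured task timeout (600 seconds in this thread).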