Pig, mail # user - failed to report status for 601 seconds


Corbin Hoenes 2010-05-11, 18:06
Andrey Stepachev 2010-05-12, 08:56
Corbin Hoenes 2010-05-13, 22:10
zaki rahaman 2010-05-13, 22:18
Corbin Hoenes 2010-05-13, 23:07
Re: failed to report status for 601 seconds
Ashutosh Chauhan 2010-05-14, 00:39
Yes, you can set job conf properties like that, but looking at your
original stack trace, it seems that won't solve your problem. It looks
like your cluster is in trouble. If you can, try restarting it. If
not, you need to dig into the tasktracker / datanode logs.

Ashutosh

On Thu, May 13, 2010 at 16:07, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
> Zaki,
>
> Can Pig take command-line options like this to set job conf properties?
>
> pig -Dmapred.task.timeout=0
>
> On May 13, 2010, at 4:18 PM, zaki rahaman wrote:
>
>> Hi Corbin,
>>
>> The timeout error you're seeing could also indicate that your reducer is
>> trying to process a very large key/group which may be the reason for the
>> timeout in the first place. At least this is a behavior I've seen in the
>> past.
>>
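One way to check the large-key theory above is to count records per group key on a sample of the reducer input. The following is a minimal sketch in plain Java, assuming the sampled keys have already been pulled into a list of strings; the class name `KeySkewCheck` and the method `heaviestKey` are hypothetical and are not part of Pig or Hadoop.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Pig or Hadoop API: finds the most frequent key in a sample.
final class KeySkewCheck {

    /** Returns the key with the most records and its count, or null if the sample is empty. */
    static Map.Entry<String, Long> heaviestKey(List<String> sampledKeys) {
        // Count how many records fall under each key.
        Map<String, Long> counts = new HashMap<>();
        for (String key : sampledKeys) {
            counts.merge(key, 1L, Long::sum);
        }
        // Pick the entry with the largest count (ties are resolved arbitrarily).
        Map.Entry<String, Long> best = null;
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (best == null || entry.getValue() > best.getValue()) {
                best = entry;
            }
        }
        return best;
    }
}
```

If one key accounts for most of the sample, a single reducer is likely receiving most of the data, which would fit the timeout pattern described above.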
>> On Thu, May 13, 2010 at 6:10 PM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>>
>>> Okay so what is the pig way to do this?
>>>
>>> I noticed a lot of chatter saying that UDFs in Pig don't call progress, which
>>> can cause your jobs to get killed.  I am using only builtin UDFs like COUNT and
>>> FLATTEN. Do they suffer from this same issue (no progress calls)?
>>>
>>> On May 12, 2010, at 2:56 AM, Andrey Stepachev wrote:
>>>
>>>> You should report progress at intervals shorter than the configured timeout
>>>> (in your case 600 sec).
>>>> Add code like the snippet below to your reducer and call ping() wherever you
>>>> process tuples.
>>>>
>>>>       // Task context; initialize this in the reducer's constructor.
>>>>       private final TaskAttemptContext context;
>>>>       private long lastTime = System.currentTimeMillis();
>>>>
>>>>       // Report progress at most once every 10 seconds.
>>>>       public void ping() {
>>>>           final long currTime = System.currentTimeMillis();
>>>>           if (currTime - lastTime > 10000) {
>>>>               context.progress();
>>>>               lastTime = currTime;
>>>>           }
>>>>       }
>>>>
>>>>
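The throttled ping() idea quoted above can be tried outside a cluster. This is a minimal sketch, assuming the progress call is passed in as a `Runnable` (in a real reducer that would be something like `context::progress`) and the clock as a `LongSupplier` so the timing can be tested; the class name `ProgressPinger` and its constructor parameters are hypothetical, not part of Hadoop or Pig.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not a Hadoop API: invokes a progress callback at most once per interval.
final class ProgressPinger {
    private final Runnable reportProgress;   // stands in for context.progress()
    private final LongSupplier clockMillis;  // e.g. System::currentTimeMillis
    private final long intervalMillis;       // minimum time between two reports
    private long lastReport;

    ProgressPinger(Runnable reportProgress, LongSupplier clockMillis, long intervalMillis) {
        this.reportProgress = reportProgress;
        this.clockMillis = clockMillis;
        this.intervalMillis = intervalMillis;
        this.lastReport = clockMillis.getAsLong();
    }

    /** Call once per processed tuple; reports only when the interval has elapsed. */
    void ping() {
        long now = clockMillis.getAsLong();
        if (now - lastReport > intervalMillis) {
            reportProgress.run();
            lastReport = now;
        }
    }
}
```

In a reducer this might be built as `new ProgressPinger(context::progress, System::currentTimeMillis, 10000L)` and `ping()` called inside the tuple loop. Throttling keeps the per-tuple cost down to a clock read and a comparison.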
>>>> 2010/5/11 Corbin Hoenes <[EMAIL PROTECTED]>
>>>>
>>>>> I'm not sure I am clear on how I can debug stuff on a cluster.  I currently
>>>>> have a long-running reducer that attempts to run 4 times before finally
>>>>> giving up.
>>>>>
>>>>> I get 4 of these: Task attempt_201005101345_0052_r_000012_0 failed to
>>>>> report status for 601 seconds. Killing!
>>>>>
>>>>> before it gives up...on the last try I noticed this in the log:
>>>>> ERROR: org.apache.hadoop.hdfs.DFSClient - Exception closing file /tmp/temp1925356068/tmp1003826561/_temporary/_attempt_201005101345_0052_r_000012_4/abs/tmp/temp1925356068/tmp-197182389/part-00012 : org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not complete write to file /tmp/temp1925356068/tmp1003826561/_temporary/_attempt_201005101345_0052_r_000012_4/abs/tmp/temp1925356068/tmp-197182389/part-00012 by DFSClient_attempt_201005101345_0052_r_000012_4
>>>>>      at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:497)
>>>>>      at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
>>>>>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>      at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>>>>      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
>>>>>      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
>>>>>      at java.security.AccessController.doPrivileged(Native Method)
>>>>>      at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
>>>>>
>>>>> How do I turn on log4j's DEBUG statements?  Hoping those will help me
>>>>> pinpoint what is going on here--maybe it's the cluster or maybe the
>>>>> script.
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>>
>> --
>> Zaki Rahaman
>
>
Corbin Hoenes 2010-05-14, 05:40