Re: Task Attempt failed to report status..Killing !!
Hi,

It sounds like your mappers are overloaded. Could you try one of the following?

1. You can set mapred.max.split.size to a smaller value, so more mappers
can be launched.

or

2. You can set mapred.task.timeout to a larger value. The default value is
600 seconds.
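
For example, both can be set directly at the top of the Pig script. The
values below are only illustrative; tune them for your job:

-- Shrink the max split size so more mappers are launched
-- (128 MB here is just an example value, in bytes):
SET mapred.max.split.size 134217728;

-- Or raise the task timeout; the value is in milliseconds,
-- so 1200000 ms = 20 minutes (double the 600-second default):
SET mapred.task.timeout 1200000;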

Thanks,
Cheolsoo

On Mon, May 13, 2013 at 8:03 PM, Praveen Bysani <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a very weird issue with my Pig script. The following is its content:
>
> REGISTER /home/hadoopuser/Workspace/lib/piggybank.jar;
> REGISTER /home/hadoopuser/Workspace/lib/datafu.jar;
> REGISTER /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hbase/hbase-0.94.2-cdh4.2.1-security.jar;
> REGISTER /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/zookeeper/zookeeper-3.4.5-cdh4.2.1.jar;
> SET default_parallel 15;
>
> records = LOAD 'hbase://dm-re' USING
>     org.apache.pig.backend.hadoop.hbase.HBaseStorage('v:ctm v:src',
>     '-caching 5000 -gt 1366098805& -lt 1366102543&')
>     AS (time:chararray, company:chararray);
>
> records_iso = FOREACH records GENERATE
>     org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO(time, 'yyyy-MM-dd HH:mm:ss Z')
>     AS iso_time;
> records_group = GROUP records_iso ALL;
> result = FOREACH records_group GENERATE MAX(records_iso.iso_time) AS maxtime;
> DUMP result;
>
> When I try to run this script on a cluster of 5 nodes with 20 map slots,
> most of the map tasks fail with the following error after about 10 minutes
> of initialization:
> Task attempt <id> failed to report status for 600 seconds. Killing!
>
> I tried decreasing the caching size to under 100 or so (on the intuition
> that fetching and processing a larger cache takes more time), but the
> issue remains. However, if I restrict the rows loaded (using -lt and -gt)
> so that the number of map tasks is <= 2, the job finishes successfully.
> When the number of tasks is > 2, it is always the case that 2-4 tasks
> complete and the rest fail with the above error. I have attached the task
> tracker log for this attempt. I don't see any error except for some
> ZooKeeper connection warnings. I manually checked from that node, and
> running 'hbase zkcli' connects without any issue, so I assume ZooKeeper is
> configured properly.
>
> I don't really understand how to debug this problem. It would be great if
> someone could provide assistance. Here are some cluster configurations
> that I think may be relevant:
> dfs.block.size = 1 GB
> io.sort.mb = 1 GB
> HRegion size = 1 GB
>
> The HBase table itself is close to 250 GB. I have observed 100% CPU usage
> by the mapred user on the node while a task is executing. I am not sure
> what to optimize so that the job completes. It would be good if someone
> could shed some light on this.
>
> PS: All the nodes in the cluster run on EBS-backed Amazon EC2 instances.
>
>
> --
> Regards,
> Praveen Bysani
> http://www.praveenbysani.com
>