Pig user mailing list: join operation fails on big data set


Mua Ban 2013-04-12, 15:18
Re: join operation fails on big data set
Did you look at task logs to see why those tasks failed? Since it's a
back-end error, the console output doesn't tell you much. Task logs should
have a stack trace that shows why it failed, and you can go from there.

On Fri, Apr 12, 2013 at 8:18 AM, Mua Ban <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am very new to Pig/Hadoop; I started writing my first Pig script a
> couple of days ago and ran into this problem.
>
> My cluster has 9 nodes. I have to join two data sets, big and small, each
> collected over 4 weeks. I first took two subsets covering the first week
> of data; call them B1 and S1 for the big and small data sets of the first
> week. The full 4-week data sets are B4 and S4.
>
> I ran my script on the cluster to join B1 and S1 and everything was fine;
> I got my joined data. However, when I ran the script to join B4 and S4, it
> failed. B4 is 39GB, S4 is 300MB. B4 is skewed: some ids appear much more
> frequently than others. I tried both 'using skewed' and 'using replicated'
> modes for the join operation (by appending them to the end of the join
> clause in the script below), and both failed.
>
> Here is my script; I think it is very simple:
>
> big = load 'bigdir/' using PigStorage(',') as (id:chararray, data:chararray);
> small = load 'smalldir/' using PigStorage(',') as
>     (t1:double, t2:double, data:chararray, id:chararray);
> J = JOIN big by id LEFT OUTER, small by id;
> store J into 'outputdir' using PigStorage(',');
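For reference, the two join variants the poster says he tried would be written
like this (a minimal sketch using the aliases from the script above, not his
exact lines):

    -- skewed join: Pig samples the left relation and splits hot keys
    -- across multiple reducers
    J = JOIN big BY id LEFT OUTER, small BY id USING 'skewed';

    -- fragment-replicate join: the relation listed last ('small') is loaded
    -- into memory on every map task; plausible here since S4 is only ~300MB
    J = JOIN big BY id LEFT OUTER, small BY id USING 'replicated';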
>
> On the JobTracker web UI, I see that the job has 40 reducers (I guess this
> is normal, since the total data is about 40GB and Pig's default is to
> allocate one reducer per 1GB of input). If I use 'parallel 80' in the join
> operation above, then I see 80 reducers, but the join still fails.
>
> I checked mapred-default.xml and found this:
> <name>mapred.reduce.tasks</name>
>   <value>1</value>
>
> If I set PARALLEL in the join operation, it should override this setting,
> right?
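For what it's worth, PARALLEL on an individual operator does take precedence
over both Pig's default_parallel and the cluster-wide mapred.reduce.tasks
value. A sketch of the two usual ways to set reduce parallelism in Pig; the
value 80 is just the poster's example:

    -- per-operator: applies only to this join
    J = JOIN big BY id LEFT OUTER, small BY id PARALLEL 80;

    -- script-wide default for all reduce-side operators
    SET default_parallel 80;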
>
>
> On the JobTracker GUI, I see that across different runs the number of
> completed reducers varies between 4 and 10 (out of 40 total). The GUI
> shows the reason for the failed reducers: "Task
> attempt_201304081613_0046_r_000006_0 failed to report status for 600
> seconds. Killing!"
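That message comes from Hadoop's task timeout (the mapred.task.timeout
property, 600,000 ms, i.e. 600 seconds, by default): a task that neither makes
progress nor reports status within that window is killed. As a hedged sketch,
the timeout can be raised from inside a Pig script, though a genuinely stuck
reducer will still fail, just later; the 30-minute value is illustrative:

    -- illustrative: allow 30 minutes without a progress report before
    -- the framework kills the task
    SET mapred.task.timeout 1800000;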
>
> Could you please help?
> Thank you very much,
> -Mua
>
>
> --------------------------------------------------------------------------------------------------------------
> Here is the error report from the console where I ran the script:
>
> JobId                  Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias  Feature
> job_201304081613_0032  616   0        230         12          32          0              0              0              big    MAP_ONLY
> job_201304081613_0033  705   1        21          6           6           234            234            234                   SAMPLER
>
> Failed Jobs:
> JobId   Alias   Feature Message Outputs
> job_201304081613_0034   small   SKEWED_JOIN     Message: Job failed!
> Error - # of failed Reduce Tasks exceeded allowed limit. FailedCount: 1.
> LastFailedTask: task_201304081613_0034_r_000012
>
> Input(s):
> Successfully read 364285458 records (39528533645 bytes) from:
> "hdfs://d0521b01:24990/user/abc/big/"
> Failed to read data from "hdfs://d0521b01:24990/user/abc/small/"
>
> Output(s):
>
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_201304081613_0032   ->      job_201304081613_0033,
> job_201304081613_0033   ->      job_201304081613_0034,
> job_201304081613_0034   ->      null,
> null
>
>
> 2013-04-10 20:11:23,815 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning REDUCER_COUNT_LOW 1 time(s).
> 2013-04-10 20:11:23,815 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
> 2013-04-10 20:11:23,815 [main] ERROR org.apache.pig.tools.grunt.GruntParser
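A note on the SKEWED_JOIN failure in the report above: when skewed-join
reducers die on hot keys, one knob that is sometimes tuned is
pig.skewedjoin.reduce.memusage, the fraction of reducer heap the skewed-join
sampler assumes is available when deciding how widely to split a skewed key.
A hedged sketch; the 0.3 value is purely illustrative:

    -- illustrative: assume less heap per reducer, so skewed keys get
    -- partitioned across more reducers
    SET pig.skewedjoin.reduce.memusage 0.3;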

Mua Ban 2013-04-12, 17:27
Cheolsoo Park 2013-04-12, 18:25
Mua Ban 2013-04-12, 19:06
Johnny Zhang 2013-04-12, 21:01
Mua Ban 2013-04-14, 13:13
Johnny Zhang 2013-04-15, 17:48