Subhramanian, Deepak 2011-06-01, 13:00
Subhramanian, Deepak 2011-06-01, 14:28
Thejas M Nair 2011-06-01, 14:53
Re: Getting Error : java.io.IOException: Spill failed
Thejas M Nair 2011-06-01, 15:46
How big is the output of the join expected to be? (For example, if many rows share the same join key value in both files, the output could be very large.) Are you using a replicated join?
Thanks, Thejas

On 6/1/11 8:33 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

It looks like I have enough space on the EC2 server I am using. I am trying to join a 300 MB zipped file with another 2 MB zipped file. The Pig script worked when the number of columns in the output was smaller.

Filesystem  1K-blocks  Used     Available  Use%  Mounted on
/dev/sda1   10321208   7571832  2225088    78%   /
devtmpfs    3944716    52       3944664    1%    /dev
tmpfs       3944716    0        3944716    0%    /dev/shm

On 1 June 2011 15:53, Thejas M Nair <[EMAIL PROTECTED]> wrote:

Do you have enough disk space on each node? It looks like MR is having a problem writing to, or finding, a disk to write to. Are you seeing this problem for all Pig/MR jobs or just one of them?

Thanks, Thejas

On 6/1/11 6:00 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

I am getting an error while running a Pig script on a 400 MB compressed file. But the script works fine with a sample input file with 1000 lines. The error details are given below. Any thoughts?

2011-06-01 12:12:22,152 [Thread-4] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,166 [Thread-4] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-01 12:12:22,168 [Thread-4] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,275 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-06-01 12:12:22,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201105271444_0064
2011-06-01 12:12:22,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_201105271444_0064
2011-06-01 12:12:31,109 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 25% complete
2011-06-01 12:12:46,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:14:43,279 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:15:18,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:19:08,924 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:46,492 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:51,541 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:39:41,488 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201105271444_0064 has failed! Stop running all dependent jobs
2011-06-01 12:39:41,493 [main] WARN org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/impressions/imp1.log.gz
2011-06-01 12:39:41,494 [main] WARN org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/matchtables/ad1.log.gz
2011-06-01 12:39:41,494 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-06-01 12:39:41,523 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Spill failed
2011-06-01 12:39:41,524 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-06-01 12:39:41,525 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion  PigVersion    UserId  StartedAt            FinishedAt           Features
0.20.2-cdh3u0  0.8.0-cdh3u0  root    2011-06-01 12:12:17  2011-06-01 12:39:41  HASH_JOIN,FILTER

Failed!

Failed Jobs:
JobId  Alias  Feature  Message  Outputs
job_201105271444_0064  advertiser_match,joined,logs,out  HASH_JOIN  Message: Job failed! Error - NA  hdfs://localhost/user/root/pigdbck/resultimp2,

Input(s):
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/impressions/imp.log.gz"
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/matchtables/adv .log.gz"

Output(s):
Failed to produce result in "hdfs://localhost/user/root/pigdbck/resul

Backend error message
java.io.IOException: Spill failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1069)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1050)
at java.io.DataOutputStream.writeBoolean(DataOutputStream.java:122)
at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:122)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:917)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:573)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:23
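
A rough illustration of the join-size question raised in this thread: an inner equi-join emits, for each key, the product of that key's row counts on the two sides, so a few heavily repeated keys can make the output far larger than either input. The sketch below is plain Python, not Pig, and the function name and key lists are hypothetical; nothing here is taken from the poster's script or data.

```python
from collections import Counter


def estimate_join_rows(left_keys, right_keys):
    """Return the number of rows an inner equi-join would emit.

    For each key present on both sides, the join produces
    (rows with that key on the left) * (rows with that key on the right).
    """
    left = Counter(left_keys)
    right = Counter(right_keys)
    return sum(count * right[key] for key, count in left.items() if key in right)


# Hypothetical key columns, chosen only to show the effect of duplicates.
unique_case = estimate_join_rows([1, 2, 3, 4], [1, 2, 3, 4])
skewed_case = estimate_join_rows([7] * 1000, [7] * 1000)

print(unique_case)  # 4 rows: every key matches exactly once
print(skewed_case)  # 1000000 rows: one key repeated 1000 times on each side
```

Running a count like this over the real join key columns (or the equivalent GROUP and COUNT in Pig) would show whether the expected output size alone could explain the spill to local disk.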
Subhramanian, Deepak 2011-06-01, 18:05