Pig, mail # user - Getting Error : java.io.IOException: Spill failed


Re: Getting Error : java.io.IOException: Spill failed
Thejas M Nair 2011-06-01, 15:46
How big is the output of the join expected to be? (For example, if a large number of rows share the same join key value in both files, the output could be very large.) Are you using a replicated join?
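
To make the point about repeated join keys concrete, here is a minimal Python sketch (not Pig, and not from the original thread). It counts how many rows an inner equi-join would emit given the key columns of both inputs: each key contributes the product of its occurrence counts on the two sides. The function name join_output_rows and the sample keys are hypothetical.

```python
from collections import Counter


def join_output_rows(left_keys, right_keys):
    """Count the rows an inner equi-join would emit for two key columns."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each key contributes (occurrences on the left) * (occurrences on the right).
    # Counter returns 0 for keys that are missing on the right side.
    return sum(n * right_counts[key] for key, n in left_counts.items())


# Hypothetical data: 1,000 impression rows that all share one advertiser id,
# joined against a match table in which that id appears 50 times.
impressions = ["adv_1"] * 1000
match_table = ["adv_1"] * 50 + ["adv_2"]
print(join_output_rows(impressions, match_table))  # 50000
```

Even a small second input can multiply the output this way if its join keys are not unique.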

Thanks,
Thejas
On 6/1/11 8:33 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

It looks like I have enough space on the EC2 server I am using. I am trying to join a 300 MB gzipped file with another 2 MB gzipped file. The Pig script worked when the number of columns in the output was smaller.

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   7571832   2225088  78% /
devtmpfs               3944716        52   3944664   1% /dev
tmpfs                  3944716         0   3944716   0% /dev/shm
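
As a rough aid for reading the df output above, the following Python sketch converts the "Available" column (reported in 1K blocks) to GiB for a given mount point. The helper name available_gib is hypothetical, and it is an assumption that the map-side spill directory (mapred.local.dir) lives on the / filesystem; the thread does not say where it is configured.

```python
# The df output exactly as posted in the message above.
DF_OUTPUT = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   7571832   2225088  78% /
devtmpfs               3944716        52   3944664   1% /dev
tmpfs                  3944716         0   3944716   0% /dev/shm
"""


def available_gib(df_text, mount_point):
    """Return the 'Available' value for mount_point, converted from 1K blocks to GiB."""
    for line in df_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[5] == mount_point:
            return int(fields[3]) / (1024 * 1024)
    raise KeyError(mount_point)


print(f"{available_gib(DF_OUTPUT, '/'):.2f} GiB free on /")  # 2.12 GiB free on /
```

Under that assumption, roughly 2.1 GiB would be available for spill files, which may or may not be enough once the compressed input is expanded and joined.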
On 1 June 2011 15:53, Thejas M Nair <[EMAIL PROTECTED]> wrote:
Do you have enough disk space on each node? It looks like MapReduce is having a problem finding a disk to write to.
Are you seeing this problem for all Pig/MR jobs or just one of them?
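
One way to answer the disk-space question on a given node is to check the free space in each directory Hadoop may spill to. The sketch below uses Python's standard shutil.disk_usage. The helper names, the threshold, and the directory list are hypothetical placeholders (the thread does not list the configured mapred.local.dir paths), so substitute the real ones.

```python
import shutil


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 1024 ** 3


def dirs_below_threshold(paths, min_free_gib):
    """Return the directories whose filesystem has less than min_free_gib free."""
    return [p for p in paths if free_gib(p) < min_free_gib]


# Hypothetical stand-in: replace "." with the node's mapred.local.dir entries.
candidate_dirs = ["."]
for directory in candidate_dirs:
    print(f"{directory}: {free_gib(directory):.2f} GiB free")
```

Running this on every node (not only the one where the job is submitted) would show whether any task tracker is short on local disk.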

Thanks,
Thejas
On 6/1/11 6:00 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

I am getting an error while running a Pig script on a 400 MB compressed file.
But the script works fine with a sample input file with 1000 lines. The
error details are given below. Any thoughts?

2011-06-01 12:12:22,152 [Thread-4] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,166 [Thread-4] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-01 12:12:22,168 [Thread-4] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-06-01 12:12:22,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201105271444_0064
2011-06-01 12:12:22,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_201105271444_0064
2011-06-01 12:12:31,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 25% complete
2011-06-01 12:12:46,088 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:14:43,279 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:15:18,404 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:19:08,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:46,492 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:51,541 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:39:41,488 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201105271444_0064 has failed! Stop running all dependent jobs
2011-06-01 12:39:41,493 [main] WARN  org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/impressions/imp1.log.gz
2011-06-01 12:39:41,494 [main] WARN  org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/matchtables/ad1.log.gz
2011-06-01 12:39:41,494 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-06-01 12:39:41,523 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Spill failed
2011-06-01 12:39:41,524 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-06-01 12:39:41,525 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-cdh3u0 0.8.0-cdh3u0 root 2011-06-01 12:12:17 2011-06-01 12:39:41 HASH_JOIN,FILTER

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_201105271444_0064 advertiser_match,joined,logs,out HASH_JOIN Message: Job failed! Error - NA hdfs://localhost/user/root/pigdbck/resultimp2,

Input(s):
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/impressions/imp.log.gz"
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/matchtables/adv .log.gz"

Output(s):
Failed to produce result in "hdfs://localhost/user/root/pigdbck/resul

Backend error message
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1069)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1050)
    at java.io.DataOutputStream.writeBoolean(DataOutputStream.java:122)
    at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:122)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:917)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:573)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:23