Pig >> mail # user >> Getting Error : java.io.IOException: Spill failed


Re: Getting Error : java.io.IOException: Spill failed
How big is the output of the join expected to be? For example, if a large number of rows share the same join key value in both files, the output could be very large. Are you using a replicated join?
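
To make the output-size question concrete: in an inner join, each key contributes (rows with that key in the first input) times (rows with that key in the second input) output rows, so a few heavily repeated keys can dominate the result. The Python sketch below estimates the row count from the join keys alone. The function name and the sample keys are hypothetical, chosen for illustration, and are not taken from the script discussed in this thread.

```python
from collections import Counter

def estimate_join_rows(left_keys, right_keys):
    """Estimate the row count of an inner equi-join from the join keys alone."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each key present on both sides yields left_count * right_count joined rows.
    return sum(n * right_counts[k] for k, n in left_counts.items() if k in right_counts)

# Hypothetical sample: key "a" is repeated on both sides.
impressions = ["a", "a", "a", "b", "c"]
advertisers = ["a", "a", "b", "d"]
print(estimate_join_rows(impressions, advertisers))  # 3*2 + 1*1 = 7
```

Running the same count over a sample of the real join column on both inputs gives a rough idea of whether the join output is much larger than the inputs.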

Thanks,
Thejas
On 6/1/11 8:33 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

It looks like I have enough space on the EC2 server I am using. I am trying to join a 300 MB zipped file with another 2 MB zipped file. The Pig script worked when the number of columns in the output was smaller.

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   7571832   2225088  78% /
devtmpfs               3944716        52   3944664   1% /dev
tmpfs                  3944716         0   3944716   0% /dev/shm
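
As a reading aid for the df listing above, the following Python sketch converts the "Available" column from 1K blocks to GiB. It only parses the pasted text; the variable and function names are illustrative. For the root filesystem it works out to roughly 2.1 GiB free. Whether that is enough depends on how much the gzipped inputs expand and on where Hadoop writes its local spill files, neither of which is stated in this thread.

```python
DF_OUTPUT = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   7571832   2225088  78% /
devtmpfs               3944716        52   3944664   1% /dev
tmpfs                  3944716         0   3944716   0% /dev/shm
"""

def available_gib(df_text):
    """Map each mount point to its available space in GiB (df reports 1K blocks)."""
    result = {}
    for line in df_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed lines
        available_kib = int(fields[3])
        result[fields[5]] = available_kib / (1024 * 1024)
    return result

for mount, gib in available_gib(DF_OUTPUT).items():
    print(f"{mount}: {gib:.2f} GiB available")
```
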
On 1 June 2011 15:53, Thejas M Nair <[EMAIL PROTECTED]> wrote:
Do you have enough disk space on each node? It looks like MR is having a problem finding a disk to write to.
Are you seeing this problem for all Pig/MR jobs or just one of them?

Thanks,
Thejas
On 6/1/11 6:00 AM, "Subhramanian, Deepak" <[EMAIL PROTECTED]> wrote:

I am getting an error while running a Pig script on a 400 MB compressed file.
But the script works fine with a sample input file with 1000 lines. The
error details are given below. Any thoughts?

2011-06-01 12:12:22,152 [Thread-4] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,166 [Thread-4] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-01 12:12:22,168 [Thread-4] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-01 12:12:22,275 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-06-01 12:12:22,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201105271444_0064
2011-06-01 12:12:22,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_201105271444_0064
2011-06-01 12:12:31,109 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 25% complete
2011-06-01 12:12:46,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:14:43,279 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:15:18,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:19:08,924 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:46,492 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:37:51,541 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2011-06-01 12:39:41,488 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201105271444_0064 has failed! Stop running all dependent jobs
2011-06-01 12:39:41,493 [main] WARN  org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/impressions/imp1.log.gz
2011-06-01 12:39:41,494 [main] WARN  org.apache.pig.tools.pigstats.JobStats - unable to get input counter for hdfs://localhost/user/root/pigdbck/data/matchtables/ad1.log.gz
2011-06-01 12:39:41,494 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-06-01 12:39:41,523 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Spill failed
2011-06-01 12:39:41,524 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-06-01 12:39:41,525 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-cdh3u0 0.8.0-cdh3u0 root 2011-06-01 12:12:17 2011-06-01 12:39:41 HASH_JOIN,FILTER

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_201105271444_0064 advertiser_match,joined,logs,out HASH_JOIN Message: Job failed! Error - NA hdfs://localhost/user/root/pigdbck/resultimp2,

Input(s):
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/impressions/imp.log.gz"
Failed to read data from "hdfs://localhost/user/root/pigdbck/data/matchtables/adv .log.gz"

Output(s):
Failed to produce result in "hdfs://localhost/user/root/pigdbck/resul

Backend error message
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1069)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1050)
    at java.io.DataOutputStream.writeBoolean(DataOutputStream.java:122)
    at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:122)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:917)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:573)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:23