Pig >> mail # user >> Pig > 0.10 always throws invalid stream header


Pig > 0.10 always throws invalid stream header
Hi guys,

For some reason I cannot set up any version of Pig higher than 0.10 with
Hadoop 1.2.1 and Cassandra 1.2.10. For example, using Pig 0.12, when I
try a very simple dump I get this error in the JobTracker log:

2013-11-05 17:44:12,000 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201311051740_0002_m_000002_0: java.io.IOException: Deserialization error: invalid stream header: 2DD01810
     at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
     at org.apache.pig.impl.util.UDFContext.deserialize(UDFContext.java:192)
     at org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil.setupUDFContext(MapRedUtil.java:159)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:229)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:275)
     at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
     at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.StreamCorruptedException: invalid stream header: 2DD01810
     at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
     at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
     at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:52)
     ... 11 more
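
As far as I can tell, the message itself comes from the header check in the java.io.ObjectInputStream constructor, which rejects any input that does not start with the Java serialization magic number and version (AC ED 00 05). So the bytes reaching ObjectSerializer.deserialize do not look like a plain Java serialization stream. The sketch below only reproduces that check in isolation and says nothing about why Pig ends up with such bytes. The class and method names (StreamHeaderCheck, headerError, serialize) are made up for illustration, and the four bogus bytes are simply the ones printed in the log.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

// Hypothetical helper class, not part of Pig or Hadoop.
public class StreamHeaderCheck {

    // Returns the StreamCorruptedException message if the bytes fail the
    // ObjectInputStream header check, or null if the header is accepted.
    static String headerError(byte[] bytes) throws IOException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return null; // constructor succeeded, so the header was valid
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        }
    }

    // Produces a genuine Java serialization stream for comparison.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The first four bytes reported in the JobTracker log.
        byte[] bogus = {0x2D, (byte) 0xD0, 0x18, 0x10};
        System.out.println(headerError(bogus));
        System.out.println(headerError(serialize("hello")));
    }
}
```

Running main should print the same "invalid stream header: 2DD01810" text for the bogus bytes and null for the properly serialized string.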

When I switch back to Pig 0.10, everything works fine.

For the record, here is what I've tried so far:

- Compiling Pig 0.12 and 0.11 from source
- Compiling with ant clean jar-withouthadoop -Dhadoopversion=20
- Compiling with ant clean jar-withouthadoop -Dhadoopversion=23 (mega
fail, since I'm on Hadoop 1.2)
- Compiling Hadoop to get 1.2.2 instead of the default 1.2.1
- Compiling Cassandra 1.2.10 (the Pig 0.10 included in its examples dir
works fine too)

I want to use Pig 0.12, but this problem is driving me nuts. Can
someone tell me what I'm doing wrong?