Re: Need help getting started.
Mariano,

Pig 0.12.0 does work with Hadoop 2.2.0, but you need to recompile Pig first. The IncompatibleClassChangeError is the telltale sign: the prebuilt Pig jar was compiled against Hadoop 1, where org.apache.hadoop.mapreduce.JobContext was a class, whereas in Hadoop 2 it is an interface.

In your $PIG_HOME, run the following command to rebuild Pig:

`ant clean jar-withouthadoop -Dhadoopversion=23`

Then, try re-running your script.
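
For reference, here is the full sequence I would try (a sketch: the HADOOP_HOME path is taken from your output below, and "max_temp.pig" is only a placeholder for your script file):

  cd $PIG_HOME
  # rebuild Pig against the Hadoop 2 APIs (hadoopversion=23 covers 0.23.x and 2.x)
  ant clean jar-withouthadoop -Dhadoopversion=23

  # make sure Pig picks up the Hadoop 2.2.0 install
  export HADOOP_HOME=/Users/mkamp/hadoop-2.2.0

  # run in mapreduce mode against HDFS ...
  bin/pig max_temp.pig
  # ... or in local mode, which reads from the local file system
  bin/pig -x local max_temp.pig

Since you said you wanted to read from the local file system, local mode (-x local) may be what you want anyway; in mapreduce mode a relative path like micro-tab/sample.txt resolves against your HDFS home directory, which is why the job tried to read hdfs://localhost/user/mkamp/micro-tab/sample.txt.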

On 1/9/14, 5:13 PM, Mariano Kamp wrote:
> Hi,
>
> I am trying to run the first Pig sample from the book "Hadoop: The Definitive Guide".
>
> Unfortunately that doesn't work for me.
>
> I downloaded 0.12.0 and got the impression it should work with Hadoop 2.2.
>
>> http://pig.apache.org/releases.html#14+October%2C+2013%3A+release+0.12.0+available
>> 14 October, 2013: release 0.12.0 available
>> This release includes several new features such as ASSERT operator, Streaming UDF, new AvroStorage, IN/CASE operator, BigInteger/BigDecimal data type, support for Windows.
>> Note
>> This release works with Hadoop 0.20.X, 1.X, 0.23.X and 2.X
>
> I use Hadoop 2.x.
>> snow:bin mkamp$ which hadoop
>> /Users/mkamp/hadoop-2.2.0/bin//hadoop
>
>> snow:bin mkamp$ echo $HADOOP_HOME
>> /Users/mkamp/hadoop-2.2.0
>
> But whether HADOOP_HOME is set or not, I get a couple of errors when I run the script:
>
>> records = LOAD 'micro-tab/sample.txt'
>> AS (year:chararray, temperature:int, quality:int);
>> DUMP records;
>
> All hell breaks loose and there is a lot of output, but most of it seems meaningless: warnings about settings that are deprecated in Hadoop yet still shipped as defaults.
>
> Hard to say what is relevant. Here are some excerpts; the full output is attached as a file.
>
> From the logfile:
>
>> Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
>
>
>> ERROR 1066: Unable to open iterator for alias records
>
> From the console:
>
>> 2014-01-09 22:24:45,976 [main] WARN  org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
>> java.lang.NoSuchFieldException: runnerState
>> at java.lang.Class.getDeclaredField(Class.java:1918)
>
> But as a little googling indicated, this is business as usual?
>
>> 2014-01-09 22:24:49,228 [JobControl] ERROR org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl - Error while trying to run jobs.
>> java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>
>> Input(s):
>> Failed to read data from "hdfs://localhost/user/mkamp/micro-tab/sample.txt"
>
> That last one looks interesting. Maybe I am using it wrong and the reported errors are unrelated? I wanted to read from the local file system.
>
> So I also changed the script to read from HDFS, but that didn't change the error.
>
> Any ideas where to go from here?
>
> Is it possible to run the latest Hadoop binary download and the latest Pig binary download together?
>