Pig, mail # user - Need help getting started.


Re: Need help getting started.
Ruslan Al-Fakikh 2014-01-11, 15:47
Usually Hadoop is used within a distro, such as Cloudera, Hortonworks,
EMR, etc.
On 11 Jan 2014 at 2:05, "Mariano Kamp" <[EMAIL PROTECTED]>
wrote:

> Hi Josh.
>
> Ok, got it. Interesting.
>
> Downloaded ant, recompiled and now it works.
>
> Thank you.
>
>
> On Fri, Jan 10, 2014 at 10:16 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
>
> > Mariano,
> >
> > Pig 0.12.0 does work with Hadoop-2.2.0, but you need to recompile Pig
> > first.
> >
> > In your $PIG_HOME, run the following command to rebuild Pig:
> >
> > `ant clean jar-withouthadoop -Dhadoopversion=23`
> >
> > Then, try re-running your script.
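A sketch of the rebuild described above, under the assumption that `ant` and `$PIG_HOME` are set up as in the thread. The `hadoop_profile` helper is my own illustrative name; the `-Dhadoopversion` values (20 and 23) are the two build profiles implied here, matching the Hadoop lines listed in the release note (0.20.x/1.x vs. 0.23.x/2.x):

```shell
# Pick the ant -Dhadoopversion value from the Hadoop release in use.
# hadoop_profile is an illustrative helper, not part of Pig's build.
hadoop_profile() {
  case "$1" in
    0.20.*|1.*) echo 20 ;;   # classic Hadoop 0.20.x / 1.x APIs
    0.23.*|2.*) echo 23 ;;   # Hadoop 0.23.x / 2.x APIs (e.g. 2.2.0)
    *) return 1 ;;           # unknown line: fail rather than guess
  esac
}

# Then rebuild Pig against that profile (run inside $PIG_HOME):
# cd "$PIG_HOME"
# ant clean jar-withouthadoop -Dhadoopversion="$(hadoop_profile 2.2.0)"
```

For Hadoop 2.2.0 this selects `-Dhadoopversion=23`, i.e. exactly the command given above.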
> >
> >
> > On 1/9/14, 5:13 PM, Mariano Kamp wrote:
> >
> >> Hi,
> >>
> >> I am trying to run the first Pig sample from the book "Hadoop: The
> >> Definitive Guide".
> >>
> >> Unfortunately that doesn't work for me.
> >>
> >> I downloaded 0.12.0 and got the impression it should work with Hadoop
> 2.2.
> >>
> >>> http://pig.apache.org/releases.html#14+October%2C+2013%3A+release+0.12.0+available
> >>> 14 October, 2013: release 0.12.0 available
> >>> This release includes several new features such as ASSERT operator,
> >>> Streaming UDF, new AvroStorage, IN/CASE operator, BigInteger/BigDecimal
> >>> data type, support for Windows.
> >>> Note
> >>> This release works with Hadoop 0.20.X, 1.X, 0.23.X and 2.X
> >>>
> >>
> >> I use Hadoop 2.x.
> >>
> >>> snow:bin mkamp$ which hadoop
> >>> /Users/mkamp/hadoop-2.2.0/bin//hadoop
> >>>
> >>
> >>  snow:bin mkamp$ echo $HADOOP_HOME
> >>> /Users/mkamp/hadoop-2.2.0
> >>>
> >>
> >> But no matter if HADOOP_HOME is set or not I get a couple of errors and
> >> it doesn't work if I run the script:
> >>
> >>  records = LOAD 'micro-tab/sample.txt'
> >>> AS (year:chararray, temperature:int, quality:int);
> >>> DUMP records;
> >>>
> >>
> >> All hell breaks loose and there is a lot of output, but most of it seems
> >> meaningless: warnings about settings that are deprecated in Hadoop but
> >> still set this way by default.
> >>
> >> Hard to say what is relevant. Here are some excerpts, full output
> >> attached as file.
> >>
> >>  From the logfile:
> >>
> >>> Unexpected System Error Occured: java.lang.IncompatibleClassChangeError:
> >>> Found interface org.apache.hadoop.mapreduce.JobContext, but class was
> >>> expected
> >>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.
> >>> PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
> >>>
> >>
> >>
> >>> ERROR 1066: Unable to open iterator for alias records
> >>
> >> From the console:
> >>
> >>> 2014-01-09 22:24:45,976 [main] WARN
> >>> org.apache.pig.backend.hadoop20.PigJobControl
> >>> - falling back to default JobControl (not using hadoop 0.20 ?)
> >>> java.lang.NoSuchFieldException: runnerState
> >>> at java.lang.Class.getDeclaredField(Class.java:1918)
> >>>
> >>
> >> But as a little googling indicated, this is business as usual?
> >>
> >>> 2014-01-09 22:24:49,228 [JobControl] ERROR
> >>> org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl
> >>> - Error while trying to run jobs.
> >>> java.lang.IncompatibleClassChangeError: Found interface
> >>> org.apache.hadoop.mapreduce.JobContext, but class was expected
> >>>
> >>
> >>  Input(s):
> >>> Failed to read data from
> >>> "hdfs://localhost/user/mkamp/micro-tab/sample.txt"
> >>>
> >>
> >> That last one looks interesting. Maybe I am using it wrong and the
> >> reported errors are not related? I wanted to read from the local file
> >> system.
> >>
> >> So I also changed the script to read from hdfs, but that didn't change
> >> the error.
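The local-versus-HDFS confusion above can be reproduced directly: by default Pig runs in mapreduce mode, where a relative path like `micro-tab/sample.txt` resolves under the user's HDFS home directory (hence the `hdfs://localhost/user/mkamp/...` path in the error), while `pig -x local` makes the same relative path resolve against the local filesystem. A sketch, with illustrative paths and sample rows of my own rather than the book's exact data:

```shell
# Create a small tab-separated sample file locally (illustrative rows
# in the (year, temperature, quality) shape the script expects).
mkdir -p /tmp/micro-tab
printf '1950\t0\t1\n1950\t22\t1\n1949\t111\t1\n' > /tmp/micro-tab/sample.txt

# In local mode Pig reads micro-tab/sample.txt from the current directory:
# cd /tmp && pig -x local script.pig
#
# In the default mapreduce mode the same relative path resolves on HDFS,
# so the file must first be copied there:
# hadoop fs -put /tmp/micro-tab micro-tab && pig script.pig
```

Either way works; the point is that the relative path in `LOAD` is resolved against whichever filesystem the chosen exectype uses.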
> >>
> >> Any ideas where to go from here?
> >>
> >> Is it possible to run the latest Hadoop binary download and the latest
> >> Pig binary download together?
> >>
> >>
>