Pig, mail # user - Problems using Pig with Oozie


Shawn Hermans 2013-02-02, 20:46
Jonas Hartwig 2013-02-02, 20:50
Shawn Hermans 2013-02-02, 21:42
Prashant Kommireddi 2013-02-02, 22:21
Harish Krishnan 2013-02-02, 22:30
Re: Problems using Pig with Oozie
Shawn Hermans 2013-02-03, 21:51
Harish,
Thank you very much.  The job tracker logs were informative.  I think the
issue is how I am storing the data back out to HDFS.  The Pig script works
with a path like /tmp/foo when run standalone, but under Oozie it seems to
need a fully qualified path like ${nameNode}/tmp/foo.
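
For anyone who hits the same thing, here is a minimal sketch of the change,
not a definitive fix.  It assumes the workflow passes the output location
into the script through a <param> element in the pig action, e.g.
<param>OUTPUT=${nameNode}/tmp/123456</param>.  The parameter name OUTPUT is
hypothetical and is not part of the workflow quoted below.

```pig
-- OUTPUT is a hypothetical parameter supplied by the Oozie pig action.
-- It is assumed to hold a fully qualified HDFS URI rather than a bare path.
profiles = LOAD 'hbase://profile' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('e:*') AS (columns:map[]);
limited = LIMIT profiles 200;
-- Pig substitutes the parameter before the script is parsed.
STORE limited INTO '$OUTPUT' USING PigStorage();
```

Passing the path as a parameter keeps the name node address out of the
script, so the same script can still be run standalone with
-param OUTPUT=/tmp/123456.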

Thanks,
Shawn
On Sat, Feb 2, 2013 at 4:30 PM, Harish Krishnan <[EMAIL PROTECTED]> wrote:

> Did you check the job tracker logs? I'm sure that job tracker logs will
> have the appropriate error message. Can you paste it here?
>
> Thanks & Regards,
> Harish.T.K
>
>
> On Sat, Feb 2, 2013 at 2:21 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
>
> > +Oozie user list.
> >
> > I don't see HBaseStorage-related errors in the log. Maybe an Oozie
> > expert can point in the right direction.
> >
> > Sent from my iPhone
> >
> > On Feb 2, 2013, at 3:43 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:
> >
> > > Thank you for your assistance.  I tried that and it did not work.  I
> > > looked at pig-0.10.0 and it looks like HBaseStorage should be included
> > > in the main project and is no longer in Piggybank.  Any other ideas?
> > > Is there an easy way I can see the Pig error messages? Looking at
> > > previous discussions, it looks like the only way to get to the original
> > > Pig error message is to write custom Java code to launch the Pig script.
> > >
> > >
> > > On Sat, Feb 2, 2013 at 2:50 PM, Jonas Hartwig <[EMAIL PROTECTED]> wrote:
> > >
> > >> You need to supply piggybank in the workflow action:
> > >> <file>path/on/hdfs/piggybank.jar#piggybankjar</file>
> > >> And in the Pig script:
> > >> REGISTER piggybankjar
> > >>
> > >> Jonas
> > >>
> > >> Shawn Hermans <[EMAIL PROTECTED]> wrote:
> > >>
> > >>
> > >> All,
> > >> I have a Pig script that reads data from HBase using HBaseStorage, does
> > >> some manipulation with some Python UDFs and then writes it using
> > >> PigStorage.  It works fine when I run it as a standalone script, but will
> > >> not run in an Oozie workflow.  I can run normal Pig scripts using Oozie,
> > >> but run into problems when trying to run this script.  I believe I have
> > >> isolated the error to be with loading from HBaseStorage.  I stripped
> > >> everything out of my script except loading from HBaseStorage and
> > >> outputting to PigStorage.  The full script is below.
> > >>
> > >> profiles = LOAD 'hbase://profile' USING
> > >>     org.apache.pig.backend.hadoop.hbase.HBaseStorage('e:*') as (columns:map[]);
> > >> limited = LIMIT profiles 200;
> > >> STORE limited into '/tmp/123456' using PigStorage();
> > >>
> > >> The log files are not very helpful.  They give me the error
> > >> "Launcher ERROR, reason: Main class
> > >> [org.apache.oozie.action.hadoop.PigMain], exit code [2]".  I also
> > >> included the workflow and logfile below just in case.  Also, I am
> > >> running Cloudera 4.1.2.  I added all of the Oozie libraries to HDFS as
> > >> specified in the setup instructions.  I appreciate any help.
> > >>
> > >> Thanks,
> > >> Shawn
> > >>
> > >> <workflow-app xmlns="uri:oozie:workflow:0.3" name="simple-wf">
> > >>    <start to="pig-node"/>
> > >>    <action name="pig-node">
> > >>        <pig>
> > >>            <job-tracker>${jobTracker}</job-tracker>
> > >>            <name-node>${nameNode}</name-node>
> > >>            <prepare>
> > >>                <delete path="${nameNode}/user/${wf:user()}/tmp/65321"/>
> > >>            </prepare>
> > >>            <configuration>
> > >>                <property>
> > >>                    <name>mapred.job.queue.name</name>
> > >>                    <value>${queueName}</value>
> > >>                </property>
> > >>                <property>
> > >>                    <name>mapred.compress.map.output</name>
> > >>                    <value>true</value>
> > >>                </property>
> > >>            </configuration>
> > >>            <script>simple.pig</script>
> > >>        </pig>
> > >>        <ok to="end"/>
> > >>        <error to="fail"/>
> > >>    </action>
> > >>    <kill name="fail">