Pig user mailing list: Problems using Pig with Oozie


Thread:
  Shawn Hermans        2013-02-02, 20:46
  Jonas Hartwig        2013-02-02, 20:50
  Shawn Hermans        2013-02-02, 21:42
  Prashant Kommireddi  2013-02-02, 22:21
  Harish Krishnan      2013-02-02, 22:30

Re: Problems using Pig with Oozie
Harish,
Thank you very much.  The job tracker logs were informative.  I think the
issue is how I am storing the data back out to HDFS.  The Pig script will
work with a path like /tmp/foo, but Oozie seems to prefer
${namenode}/tmp/foo.

Thanks,
Shawn
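
A minimal sketch of that fix, assuming the fully-qualified output path is
handed to the script through the pig action's <param> element (the OUTPUT
parameter name is only illustrative, it is not part of the original workflow):

    In the pig action of workflow.xml:
        <script>simple.pig</script>
        <param>OUTPUT=${nameNode}/tmp/123456</param>

    In simple.pig:
        -- $OUTPUT expands to the fully-qualified path supplied by Oozie
        STORE limited INTO '$OUTPUT' USING PigStorage();
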
On Sat, Feb 2, 2013 at 4:30 PM, Harish Krishnan <[EMAIL PROTECTED]> wrote:

> Did you check the job tracker logs? I'm sure that job tracker logs will
> have the appropriate error message. Can you paste it here?
>
> Thanks & Regards,
> Harish.T.K
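
One way to track down those logs with the Oozie CLI, assuming a standard
setup (the server URL and workflow ID below are placeholders):

    # List the workflow's actions with their external (Hadoop) job IDs,
    # which can then be looked up in the JobTracker web UI
    oozie job -oozie http://oozie-host:11000/oozie -info <workflow-id>

    # Dump the Oozie server log for the workflow; the underlying Pig stack
    # trace usually ends up in the launcher map task's stdout/stderr
    oozie job -oozie http://oozie-host:11000/oozie -log <workflow-id>
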
>
>
> On Sat, Feb 2, 2013 at 2:21 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
>
> > +Oozie user list.
> >
> > I don't see HBaseStorage-related errors in the log. Maybe an Oozie
> > expert can point you in the right direction.
> >
> > Sent from my iPhone
> >
> > On Feb 2, 2013, at 3:43 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:
> >
> > > Thank you for your assistance.  I tried that and it did not work.  I
> > > looked at pig-0.10.0 and it looks like HBaseStorage should be included
> > > in the main project and is no longer in Piggybank.  Any other ideas?
> > > Is there an easy way I can see the Pig error messages?  Looking at
> > > previous discussions, it looks like the only way to get to the original
> > > Pig error message is to write custom Java code to launch the Pig script.
> > >
> > >
> > > On Sat, Feb 2, 2013 at 2:50 PM, Jonas Hartwig <[EMAIL PROTECTED]> wrote:
> > >
> > >> You need to supply piggybank
> > >> <file>path/on/hdfs/piggybank.jar#piggybankjar</file>
> > >> And in the pig script
> > >> Register piggybankjar
> > >>
> > >> Jonas
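
Spelled out a bit, assuming piggybank.jar has already been uploaded to HDFS
(the path below is a placeholder): the #piggybankjar fragment tells Oozie to
ship the jar into the task's working directory under that symlink name, which
is the name the REGISTER statement then refers to.

    In the pig action of workflow.xml:
        <file>${nameNode}/path/on/hdfs/piggybank.jar#piggybankjar</file>

    At the top of the Pig script:
        REGISTER piggybankjar;
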
> > >>
> > >> Shawn Hermans <[EMAIL PROTECTED]> wrote:
> > >>
> > >>
> > >> All,
> > >> I have a Pig script that reads data from HBase using HBaseStorage,
> > >> does some manipulation with some Python UDFs and then writes it using
> > >> PigStorage.  It works fine when I run it as a standalone script, but
> > >> will not run in an Oozie workflow.  I can run normal Pig scripts using
> > >> Oozie, but run into problems when trying to run this script.  I believe
> > >> I have isolated the error to be with loading from HBaseStorage.  I
> > >> stripped everything out of my script except loading from HBaseStorage
> > >> and outputting to PigStorage.  The full script is below.
> > >>
> > >> profiles = LOAD 'hbase://profile'
> > >>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('e:*')
> > >>     AS (columns:map[]);
> > >> limited = LIMIT profiles 200;
> > >> STORE limited INTO '/tmp/123456' USING PigStorage();
> > >>
> > >> The log files are not very helpful.  They give me the error "Launcher
> > >> ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain],
> > >> exit code [2]".  I also included the workflow and logfile below just in
> > >> case.  Also, I am running Cloudera 4.1.2.  I added all of the Oozie
> > >> libraries to HDFS as specified in the setup instructions.  I appreciate
> > >> any help.
> > >>
> > >> Thanks,
> > >> Shawn
> > >>
> > >> <workflow-app xmlns="uri:oozie:workflow:0.3" name="simple-wf">
> > >>    <start to="pig-node"/>
> > >>    <action name="pig-node">
> > >>        <pig>
> > >>            <job-tracker>${jobTracker}</job-tracker>
> > >>            <name-node>${nameNode}</name-node>
> > >>            <prepare>
> > >>                <delete path="${nameNode}/user/${wf:user()}/tmp/65321"/>
> > >>            </prepare>
> > >>            <configuration>
> > >>                <property>
> > >>                    <name>mapred.job.queue.name</name>
> > >>                    <value>${queueName}</value>
> > >>                </property>
> > >>                <property>
> > >>                    <name>mapred.compress.map.output</name>
> > >>                    <value>true</value>
> > >>                </property>
> > >>            </configuration>
> > >>            <script>simple.pig</script>
> > >>        </pig>
> > >>        <ok to="end"/>
> > >>        <error to="fail"/>
> > >>    </action>
> > >>    <kill name="fail">