Pig >> mail # user >> PigServer vs PigRunner


Re: PigServer vs PigRunner
On 3/1/12 8:15 AM, Jacob Perkins wrote:
> Hello,
>
> I find myself needing to run a pig script iteratively from within a java
> program. Since I'm writing the data to a db (Cassandra) I can't (as far
> as I can tell) use PigServer's store method.
There is a PigServer.store(String id, String filename, String func)
method where you can pass the storefunc as func.

> Instead I'm using
> registerScript to launch my script. This works swimmingly but for one
> catch; how do I get access to the status of the launched job? When an
> iteration fails I need to stop execution. How can I do this if I have no
> handle to the running job?

registerScript is not meant to be used for actually starting the Pig
job; the openIterator or store functions are expected to be used for that.

>
> It looks like it's possible to use PigRunner instead since it returns an
> ExecJob object. What I'm confused about there though is the ability to
> register jars. Is there a simple example of using PigRunner that
> demonstrates this?

You can register jars by passing "-Dpig.additional.jars=..." as one
of the args to PigRunner.run().
(Yes, you can pass -D options to set properties for the Pig job.)
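
As a rough sketch of how those args might be assembled: the class name PigRunnerArgs, the jar paths, and the script name below are made up for illustration, and the ":" separator between jars is an assumption you should check against your Pig version. The PigRunner call itself is left as a comment so the snippet compiles without Pig on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PigRunnerArgs {

    /**
     * Builds an argument array for PigRunner.run(): the -D property
     * first, then the script path. The separator is a parameter because
     * the expected delimiter is an assumption here, not a confirmed fact.
     */
    public static String[] build(List<String> jarPaths, String separator, String scriptPath) {
        List<String> args = new ArrayList<>();
        if (!jarPaths.isEmpty()) {
            args.add("-Dpig.additional.jars=" + String.join(separator, jarPaths));
        }
        args.add(scriptPath);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Hypothetical jar and script names, for illustration only.
        String[] args = build(
                Arrays.asList("/tmp/my-udfs.jar", "/tmp/cassandra-deps.jar"),
                ":",
                "iterate.pig");
        System.out.println(Arrays.toString(args));

        // With Pig on the classpath, the call would look roughly like:
        //   PigStats stats = PigRunner.run(args, null);
        //   if (!stats.isSuccessful()) { /* stop iterating */ }
    }
}
```

Checking the returned stats after each run is one way to decide whether to stop iterating when a run fails.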

Thanks,
Thejas