Re: PigServer vs PigRunner
On 3/1/12 8:15 AM, Jacob Perkins wrote:
> Hello,
> I find myself needing to run a pig script iteratively from within a java
> program. Since I'm writing the data to a db (Cassandra) I can't (as far
> as I can tell) use PigServer's store method.
There is a PigServer.store(String id, String filename, String func)
method where you can pass the store function as func.

> Instead I'm using
> registerScript to launch my script. This works swimmingly but for one
> catch; how do I get access to the status of the launched job? When an
> iteration fails I need to stop execution. How can I do this if I have no
> handle to the running job?

registerScript is not meant to be used for actually starting the Pig
job; the openIterator or store functions are expected to be used for that.

> It looks like it's possible to use PigRunner instead since it returns an
> ExecJob object. What I'm confused about there though is the ability to
> register jars. Is there a simple example of using PigRunner that
> demonstrates this?

You can register jars using the "-Dpig.additional.jars=..." option as one
of the args in PigRunner.run().
(Yes, you can pass -D... options to set properties for the Pig job.)
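
A minimal sketch of building such an argument array, in case it helps. The
class name PigRunnerArgs, the helper buildArgs, the script name and the jar
paths are all made up for illustration, and the ':' separator between jars is
an assumption worth checking against your Pig version. The -D option is placed
before the script path. The PigRunner call itself is left as a comment so the
snippet compiles without Pig on the classpath; as far as I know
PigRunner.run() returns a PigStats object, which you can check for success.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: assembles the args for PigRunner.run().
final class PigRunnerArgs {

    // Returns {"-Dpig.additional.jars=<jars joined by ':'>", scriptPath}.
    // The ':' separator is an assumption; verify it for your Pig version.
    static String[] buildArgs(String scriptPath, List<String> jars) {
        List<String> args = new ArrayList<>();
        if (!jars.isEmpty()) {
            // Property options go first, ahead of the script path.
            args.add("-Dpig.additional.jars=" + String.join(":", jars));
        }
        args.add(scriptPath);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Illustrative script name and jar paths only.
        String[] args = buildArgs(
                "iteration.pig",
                Arrays.asList("/path/to/udfs.jar", "/path/to/storage-deps.jar"));
        System.out.println(String.join(" ", args));

        // With Pig on the classpath, the launch would look roughly like:
        //   PigStats stats = PigRunner.run(args, null);
        //   if (!stats.isSuccessful()) { /* stop iterating */ }
    }
}
```

Checking the returned stats after each run is one way to stop the loop when an
iteration fails.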