Re: Pig UDF question
On 5/15/12 4:18 PM, Mohit Anchlia wrote:
> I am trying to write a UDF that indexes data in Elasticsearch after
> converting it to JSON. I had 2 questions:
> 1. If I create a static member in the UDF class, is there one instance per
> mapper task?
Yes, each mapper task runs in a single JVM, so you would see one instance
of the static member in each mapper task.
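
To illustrate the point about static members, here is a minimal sketch in
plain Java. FakeSearchClient and IndexingUdfSketch are hypothetical names
made up for this example; a real UDF would extend Pig's EvalFunc and hold a
real search client. The sketch only shows that every instance of the class
inside one JVM sees the same static field.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a search client; not a real Elasticsearch class.
final class FakeSearchClient {
    // Counts how many clients were constructed in this JVM.
    static final AtomicInteger CREATED = new AtomicInteger();

    FakeSearchClient() {
        CREATED.incrementAndGet();
    }
}

// Shaped like a UDF class (assumption: a real one would extend EvalFunc).
final class IndexingUdfSketch {
    // One field per JVM, shared by every instance of this class.
    private static FakeSearchClient client;

    // Lazily creates the client on first use; synchronized so that
    // concurrent callers cannot create two clients.
    static synchronized FakeSearchClient client() {
        if (client == null) {
            client = new FakeSearchClient();
        }
        return client;
    }

    // Stands in for the per-record method; returns the shared client.
    FakeSearchClient exec() {
        return client();
    }
}
```

Two IndexingUdfSketch instances created in the same JVM get the same
client object back from exec(), and only one client is ever constructed.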

> 2. Is there a method that gets called at the end of mapper method that I
> can use for cleanup?
> I was wondering if I should rather write a storefunc that would index the
> data. Need some help here, essentially I need some way to initialize search
> Client once and then at the end close it out.

Yes, a StoreFunc is the right way to accomplish this. The RecordWriter
can be used to open and close the client: open it when the writer is
created and close it in the writer's close() method. You might want to look
at Wonderdog, mentioned by Russel, to see if you can enhance it to meet
your needs.
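
A rough sketch of that lifecycle, in plain Java so it runs without a
cluster. SearchClient, InMemoryClient and IndexingWriterSketch are
hypothetical names for this example only. In a real StoreFunc the writer
would extend Hadoop's RecordWriter, be created in the OutputFormat's
getRecordWriter(), and be fed from putNext(); it would also throw
IOException rather than the unchecked exception used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client interface; stands in for whatever search client is used.
interface SearchClient {
    void index(String json);
    void close();
}

// In-memory client so the sketch can be tried without a search cluster.
final class InMemoryClient implements SearchClient {
    final List<String> indexed = new ArrayList<>();
    boolean closed = false;

    @Override
    public void index(String json) {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        indexed.add(json);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Mirrors the write/close lifecycle of a Hadoop RecordWriter.
final class IndexingWriterSketch {
    private final SearchClient client;

    // Called once per task: the client is set up when the writer is created.
    IndexingWriterSketch(SearchClient client) {
        this.client = client;
    }

    // Called once per record: send the JSON document to the index.
    void write(String json) {
        client.index(json);
    }

    // Called once when the task finishes: the place for cleanup.
    void close() {
        client.close();
    }
}
```

The design point is that the client lives exactly as long as the writer, so
initialization and cleanup each happen once per task without relying on a
static member.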