Pig >> mail # user >> Pig UDF question
Re: Pig UDF question
On 5/15/12 4:18 PM, Mohit Anchlia wrote:
> I am trying to write a UDF that indexes data in elasticsearch after
> converting it to JSON. I have 2 questions:
>
> 1. If I create a static member in the UDF class, is there one instance per
> mapper task?
Yes. Each mapper task runs in a single JVM, so you would see one instance
of the static member in each mapper task.
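
A minimal sketch of what that means in practice, using only the JDK.
FakeSearchClient and ClientHolder are hypothetical names made up for
illustration; they are not Pig or elasticsearch classes. The static field
is created lazily on first use and then shared by every later call in the
same JVM:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a search client; not a real elasticsearch class.
final class FakeSearchClient {
    // Counts how many clients were constructed in this JVM.
    static final AtomicInteger CREATED = new AtomicInteger();

    FakeSearchClient() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder that a UDF's exec() method could call on every tuple.
final class ClientHolder {
    // A static field exists once per JVM (more precisely, per class loader).
    private static FakeSearchClient client;

    // Synchronized so that concurrent callers cannot create two clients.
    static synchronized FakeSearchClient get() {
        if (client == null) {
            client = new FakeSearchClient();
        }
        return client;
    }
}

final class StaticMemberDemo {
    public static void main(String[] args) {
        FakeSearchClient first = ClientHolder.get();
        FakeSearchClient second = ClientHolder.get();
        System.out.println("same instance: " + (first == second));
        System.out.println("clients created: " + FakeSearchClient.CREATED.get());
    }
}
```

This covers initialization only. A static member by itself gives you no
hook for closing the client when the task finishes.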

> 2. Is there a method that gets called at the end of the mapper method that I
> can use for cleanup?
>
> I was wondering if I should rather write a storefunc that would index the
> data. I need some help here. Essentially, I need some way to initialize the
> search client once and then close it at the end.
>

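Yes, a storefunc is the right way to accomplish this. The RecordWriter
methods can be used to open and close the client: open it when the writer
is created, and close it in the writer's close() method, which is called
when the task finishes writing. You might want to look at wonderdog,
mentioned by Russel, to see if you can enhance it to meet your needs.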
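
Here is a rough sketch of that lifecycle, written against the plain JDK so
it runs without Hadoop or Pig on the classpath. IndexClient,
RecordingClient and IndexingWriter are hypothetical names for illustration
only. IndexingWriter just mirrors the shape of Hadoop's RecordWriter, which
has a write method called per record and a close method called once at the
end. In a real storefunc you would return an OutputFormat from
getOutputFormat() whose RecordWriter does the equivalent work:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client interface; a real one would wrap the search client.
interface IndexClient {
    void index(String json);
    void close();
}

// Hypothetical in-memory client so the sketch runs without a server.
final class RecordingClient implements IndexClient {
    final List<String> indexed = new ArrayList<>();
    boolean closed = false;

    @Override
    public void index(String json) {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        indexed.add(json);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Stand-in for a RecordWriter: it receives an open client when it is
// created, uses it for every record, and closes it exactly once.
final class IndexingWriter {
    private final IndexClient client;

    IndexingWriter(IndexClient client) {
        this.client = client;
    }

    // Called once per record (the storefunc's putNext would end up here).
    void write(String json) {
        client.index(json);
    }

    // Called once when the task is done writing; this is the cleanup hook.
    void close() {
        client.close();
    }
}

final class WriterLifecycleDemo {
    public static void main(String[] args) {
        RecordingClient client = new RecordingClient();
        IndexingWriter writer = new IndexingWriter(client);
        try {
            writer.write("{\"id\":1}");
            writer.write("{\"id\":2}");
        } finally {
            writer.close();
        }
        System.out.println("indexed: " + client.indexed.size());
        System.out.println("closed: " + client.closed);
    }
}
```

The point of putting the client in the writer rather than in a static
member is that the framework calls close for you, so the cleanup does not
depend on knowing which tuple is the last one.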

Thanks,
Thejas