Thanks for the reference. Yes, I am aware of it, but I can't use it as is.
For future reference, it would also be good for me to know:
1. If I create a static member in a UDF class, is there one instance per
mapper task?
2. Is there a method that gets called at the end of the mapper task that I
can use for cleanup?
On the same subject, is it better to index in a UDF or in a StoreFunc? I am
trying to work out how to decide in a case like this, where the job
interacts with an external system.
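
To make the question concrete, here is a rough sketch of the lifecycle I am
after: create the client lazily the first time a record arrives, reuse it for
every later record, and close it once at the end. Everything in it is
hypothetical. SearchClient is a stand-in and not the elasticsearch client,
and exec/finish are just names for "called per record" and "called once at
the end". It also assumes one task per JVM, which is exactly what question 1
is asking about.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexerSketch {
    /** Hypothetical stand-in for a search client; not the elasticsearch API. */
    static class SearchClient implements AutoCloseable {
        final List<String> indexed = new ArrayList<>();
        boolean closed = false;

        void index(String json) {
            if (closed) {
                throw new IllegalStateException("client already closed");
            }
            indexed.add(json);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Static, so shared by everything loaded in the same JVM.
    // Assumption: each mapper task runs in its own JVM.
    private static SearchClient client;
    static int clientsCreated = 0;

    /** Creates the client on first use and returns the same one afterwards. */
    static synchronized SearchClient client() {
        if (client == null) {
            client = new SearchClient();
            clientsCreated++;
        }
        return client;
    }

    /** What the per-record call would do. */
    static void exec(String json) {
        client().index(json);
    }

    /** What the end-of-task cleanup hook would do, if one exists. */
    static synchronized void finish() {
        if (client != null) {
            client.close();
            client = null;
        }
    }

    public static void main(String[] args) {
        exec("{\"id\":1}");
        exec("{\"id\":2}");
        SearchClient c = client();
        System.out.println(clientsCreated + " client, " + c.indexed.size() + " docs");
        finish();
        System.out.println("closed=" + c.closed);
    }
}
```

If something like finish() is never called, the client would leak until the
JVM exits, which is why I am wondering whether a StoreFunc gives a cleaner
place to open and close it.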
On Tue, May 15, 2012 at 6:03 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> Are you aware of Wonderdog, which already does this? Unfortunately,
> finding reusable pig components can be very hard, as they exist across
> many proprietary projects.
>
> https://github.com/infochimps/wonderdog
>
> A post explaining how to use it, end to end, is here:
>
> http://www.quora.com/Autocomplete/What-is-the-best-way-to-implement-an-autocomplete-search-feature-when-dealing-with-large-data-sets
>
> Russell Jurney
> http://datasyndrome.com
>
> On May 15, 2012, at 4:18 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > I am trying to write a UDF that indexes data in elasticsearch after
> > converting it to JSON. I had 2 questions:
> >
> > 1. If I create a static member in UDF class is that one instance per
> > mapper task?
> > 2. Is there a method that gets called at the end of mapper method that I
> > can use for cleanup?
> >
> > I was wondering if I should rather write a storefunc that would index the
> > data. Need some help here, essentially I need some way to initialize
> > search Client once and then at the end close it out.
>