Pig >> mail # user >> Best practice for DB connection


Re: Best practice for DB connection
Out of curiosity, is there an equivalent to .exec() for Python UDFs?  We
had the same issue recently.

Norbert

On Wed, Mar 7, 2012 at 3:27 AM, Raghu Angadi <[EMAIL PROTECTED]> wrote:

> On Tue, Mar 6, 2012 at 5:02 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I need to initialize the HBase connection, which I normally do in
> > configure() in the Mapper, and then my mapper uses it. How do I do it in
> > Pig?
> >
> > I am ready to define a UDF that will return a handle, but is that the
> > best practice?
> >
>
> Yes. You can initialize the connection inside the first call to
> UDF.exec(). The same UDF object is used for the entire mapper.
>
> Don't initialize inside the UDF constructor. AFAIK there is no way to
> tell how many times or when the constructor is called (though it is no
> more than a handful of times on the front end).
>
> Raghu.
>
> > Thank you,
> > Mark
> >
>
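
For reference, the lazy-initialization pattern Raghu describes might look roughly like the sketch below. It is deliberately self-contained, so everything in it is an assumption for illustration: LazyInitUdf and FakeConnection are hypothetical names, and FakeConnection only stands in for an expensive client such as an HBase table handle. In a real Pig UDF the class would extend org.apache.pig.EvalFunc and exec() would take a Tuple; the point here is only that the constructor stays cheap and the connection is opened on the first exec() call.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyInitUdf {

    // Counts how many connections were opened, so the behaviour can be checked.
    private static int connectionsOpened = 0;

    /** Hypothetical stand-in for an expensive client (e.g. an HBase table handle). */
    static final class FakeConnection {
        private final Map<String, String> rows = new HashMap<>();

        FakeConnection() {
            connectionsOpened++;
            rows.put("row1", "value1"); // made-up sample data
        }

        String get(String key) {
            return rows.get(key);
        }
    }

    // Stays null until the first exec() call.
    private FakeConnection connection;

    // The constructor does no I/O, because it may run several times
    // (including on the front end) before any data is processed.
    public LazyInitUdf() {
    }

    // Simplified signature: a real Pig UDF would receive a Tuple here.
    public String exec(String key) {
        if (connection == null) {
            connection = new FakeConnection(); // opened once, on first use
        }
        return connection.get(key);
    }

    public static int connectionsOpened() {
        return connectionsOpened;
    }

    public static void main(String[] args) {
        LazyInitUdf udf = new LazyInitUdf();
        System.out.println("after constructor: " + connectionsOpened());
        System.out.println(udf.exec("row1"));
        System.out.println(udf.exec("row2"));
        System.out.println("after two exec() calls: " + connectionsOpened());
    }
}
```

Because the same UDF object serves the whole mapper, the null check costs almost nothing per record, and the connection is opened at most once per UDF instance.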