

Re: Best practice for DB connection
On Tue, Mar 6, 2012 at 5:02 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I need to initialize the HBase connection, which I normally do in
> configure() in the Mapper, and then my mapper uses it. How do I do it in
> Pig?
>
> I am ready to define a UDF that will return a handle, but is it a best
> practice?
>

Yes. You can initialize inside the first call to UDF.exec(). The same UDF
object is used for the entire mapper.

Don't initialize inside the UDF constructor. AFAIK there is no way to
tell how many times or when the constructor is called (though it is no
more than a handful of times on the front end).
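
A minimal sketch of the lazy-initialization pattern described above, not a
definitive implementation. To keep it runnable without Pig or HBase on the
classpath, it uses hypothetical stand-ins: FakeConnection plays the role of
the HBase handle, and LazyLookupUdf.exec(String) plays the role of a real
UDF's exec(Tuple). In an actual Pig UDF the class would extend
org.apache.pig.EvalFunc and open the real connection at the marked spot.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an expensive resource such as an HBase table handle. */
class FakeConnection {
    private final Map<String, String> rows = new HashMap<>();

    FakeConnection() {
        rows.put("row1", "value1");
    }

    String get(String key) {
        return rows.get(key);
    }
}

/**
 * Sketch only: a real Pig UDF would extend org.apache.pig.EvalFunc<String>
 * and implement exec(Tuple input). All names here are assumptions.
 */
class LazyLookupUdf {
    // Deliberately NOT created in the constructor, which may run
    // several times on the front end.
    private FakeConnection connection;
    private int opens = 0;

    String exec(String key) {
        if (connection == null) {
            // First call only: open the real connection here.
            connection = new FakeConnection();
            opens++;
        }
        return connection.get(key);
    }

    /** Number of times the connection was opened, for demonstration. */
    int timesOpened() {
        return opens;
    }
}

class LazyInitDemo {
    public static void main(String[] args) {
        LazyLookupUdf udf = new LazyLookupUdf();
        System.out.println(udf.timesOpened()); // nothing opened at construction
        System.out.println(udf.exec("row1"));
        udf.exec("row2");
        System.out.println(udf.timesOpened()); // still opened only once
    }
}
```

Because the same UDF object is reused for the whole mapper, the null check
means the connection is opened once per task rather than once per record.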

Raghu.

> Thank you,
> Mark
>