Re: HBase get from within UDF vs. PIG FILTER
A UDF could be faster for some of the accesses. We do use a lookup UDF in some of
the scripts.

Looking up 6% of the rows might be a bit high for some tables.

On Fri, Aug 19, 2011 at 9:16 AM, Norbert Burger <[EMAIL PROTECTED]> wrote:

> I have a need within a larger Pig script to pull just a few records from an
> Hbase table.  I know the exact key, so it'd be trivial with a get() from a
> UDF.  Another alternative is to use a custom LOAD/FILTER combo, but this
> would involve filtering off all but 3 of about 50 records.
> From a performance angle, punting over to the UDF is faster, right?
> Although it seems to break the model of only using UDFs when necessary...
> How are others handling this situation?
> Norbert
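
To make the trade-off in this thread concrete, here is a rough sketch of the two access patterns, using the numbers from the question (3 wanted rows out of about 50). It is only an illustration and makes several assumptions: an in-memory TreeMap stands in for the HBase table, and the class name, method names, row keys and values are all made up. It does not call the HBase or Pig APIs, and it does not model the per-request round trip that a real get() pays, which is why keyed lookups stop winning once the fraction of rows looked up gets large.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class LookupVsFilter {
    // Hypothetical stand-in for an HBase table: row key -> value, sorted by key.
    static TreeMap<String, String> buildTable(int rows) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%02d", i), "value" + i);
        }
        return table;
    }

    // UDF-style access: one keyed lookup per wanted row, so only those rows are read.
    static List<String> pointLookups(Map<String, String> table, List<String> keys) {
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            String value = table.get(key);
            if (value != null) {
                out.add(value);
            }
        }
        return out;
    }

    // LOAD/FILTER-style access: read every row, then keep only the wanted keys.
    static List<String> scanAndFilter(Map<String, String> table, Set<String> wanted) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> entry : table.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                out.add(entry.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = buildTable(50);
        List<String> keys = Arrays.asList("row03", "row17", "row42");

        System.out.println("lookups: " + pointLookups(table, keys));
        System.out.println("filter:  " + scanAndFilter(table, new HashSet<>(keys)));
        // Rows touched by each approach: 3 keyed reads versus all 50 rows scanned.
        System.out.printf("rows read: lookup=%d, scan=%d%n", keys.size(), table.size());
    }
}
```

Both methods return the same three values; the difference is how many rows each one has to read to get there. In a real script the lookup side would also pay a request per key, so the break-even point depends on the table and on how many keys are needed.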