Pig >> mail # user >> HBase get from within UDF vs. PIG FILTER


Re: HBase get from within UDF vs. PIG FILTER
A UDF could be faster for some of the accesses. We do use a lookup UDF in
some of our scripts.

Looking up 6% of the rows (3 of about 50) might be a bit high for some tables.

Raghu.
On Fri, Aug 19, 2011 at 9:16 AM, Norbert Burger <[EMAIL PROTECTED]> wrote:

> I have a need within a larger Pig script to pull just a few records from an
> HBase table.  I know the exact key, so it'd be trivial with a get() from a
> UDF.  Another alternative is to use a custom LOAD/FILTER combo, but this
> would involve filtering off all but 3 of about 50 records.
>
> From a performance angle, punting over to the UDF is faster, right?
> Although it seems to break the model of only using UDFs when necessary...
>
> How are others handling this situation?
>
> Norbert
>
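
To make the two options in this thread concrete, here is a minimal sketch in Python. It is a toy model only: a plain dict stands in for the HBase table, and the names TABLE, WANTED, get_by_key and load_and_filter are hypothetical, not part of Pig or the HBase client. It only counts how many rows each approach touches (3 versus 50 for the numbers quoted above) and does not model per-request overhead, which is what decides whether the lookup is actually faster on a real table.

```python
# Toy model: a dict stands in for the HBase table.
# Assumption: nothing here is the real HBase or Pig API.
TABLE = {f"row{i:02d}": {"value": i * 10} for i in range(50)}

# The exact keys are known in advance, as in the original question.
WANTED = ["row03", "row17", "row42"]


def get_by_key(table, keys):
    """Point lookups, analogous to one get() per known key from a UDF."""
    rows_read = 0
    found = {}
    for key in keys:
        rows_read += 1  # one row touched per lookup
        if key in table:
            found[key] = table[key]
    return found, rows_read


def load_and_filter(table, keys):
    """Read every row, then keep the matches, analogous to LOAD + FILTER."""
    wanted = set(keys)
    rows_read = 0
    found = {}
    for key, row in table.items():
        rows_read += 1  # every row in the table is touched
        if key in wanted:
            found[key] = row
    return found, rows_read


by_get, get_reads = get_by_key(TABLE, WANTED)
by_scan, scan_reads = load_and_filter(TABLE, WANTED)
print("get reads:", get_reads, "scan reads:", scan_reads)
```

Both functions return the same three rows; the difference is only in how many rows are read to find them.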