Pig >> mail # user >> HBase get from within UDF vs. PIG FILTER

HBase get from within UDF vs. PIG FILTER
Within a larger Pig script, I need to pull just a few records from an
HBase table.  I know the exact key, so it'd be trivial with a get() from a
UDF.  Another alternative is to use a custom LOAD/FILTER combo, but this
would involve filtering out all but 3 of about 50 records.
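
To make the two options concrete, here is a rough sketch in plain Java. It is only a stand-in: a TreeMap plays the role of the HBase table, and no real HBase or Pig API is called. The class and method names (LookupSketch, byGet, byScanAndFilter) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class LookupSketch {
    // Hypothetical stand-in for the HBase table: sorted map of row key -> value.
    static TreeMap<String, String> buildTable(int rows) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%02d", i), "value" + i);
        }
        return table;
    }

    // Option 1: one point lookup per known key, analogous to a get() in a UDF.
    static List<String> byGet(Map<String, String> table, Set<String> keys) {
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            String value = table.get(key);
            if (value != null) {
                out.add(value);
            }
        }
        return out;
    }

    // Option 2: read every row and keep the matches, analogous to LOAD + FILTER.
    static List<String> byScanAndFilter(Map<String, String> table, Set<String> keys) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            if (keys.contains(row.getKey())) {
                out.add(row.getValue());
            }
        }
        return out;
    }
}
```

Both methods return the same rows; the difference is that the first touches only the requested keys while the second reads all of the roughly 50 rows before discarding most of them.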

From a performance angle, punting over to the UDF is faster, right?
Although it seems to break the model of only using UDFs when necessary...

How are others handling this situation?

Raghu Angadi 2011-08-19, 16:30