Pig >> mail # user >> HBase get from within UDF vs. PIG FILTER


HBase get from within UDF vs. PIG FILTER
Within a larger Pig script, I need to pull just a few records from an
HBase table.  I know the exact keys, so it'd be trivial with a get() from a
UDF.  The alternative is to use a custom LOAD/FILTER combo, but this
would involve filtering out all but 3 of about 50 records.

From a performance angle, punting over to the UDF is faster, right?
Although it seems to break the model of only using UDFs when necessary...

How are others handling this situation?

Norbert
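
To make the two options in the question concrete, here is a rough sketch of the access patterns being compared. It is only a toy model: the sorted in-memory list is a hypothetical stand-in for the HBase table, `get` and `load_and_filter` are illustrative names rather than Pig or HBase APIs, and the row keys are made up. It says nothing about timings on a real cluster, only about how many rows each approach has to touch.

```python
from bisect import bisect_left

# Hypothetical stand-in for the HBase table: about 50 rows kept sorted by
# row key. No real HBase or Pig calls are made anywhere in this sketch.
ROWS = sorted((f"row{i:02d}", f"value{i}") for i in range(50))
KEYS = [key for key, _ in ROWS]


def get(key):
    """Point lookup by exact key, analogous to a get() issued from a UDF."""
    i = bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return ROWS[i]
    return None


def load_and_filter(wanted):
    """Analogous to LOAD then FILTER: read every row, keep only the wanted keys."""
    rows_read = 0
    kept = []
    for key, value in ROWS:
        rows_read += 1
        if key in wanted:
            kept.append((key, value))
    return kept, rows_read


# Made-up keys standing in for the 3 records the script actually needs.
wanted = {"row03", "row17", "row42"}

via_get = [get(key) for key in sorted(wanted)]
via_filter, rows_read = load_and_filter(wanted)

print(via_get == via_filter)  # both routes return the same records
print(len(via_get), "rows fetched by key;", rows_read, "rows read by the scan")
```

Both routes return the same three records; the difference is that the keyed lookups touch only those rows, while the load-and-filter route reads all of them before discarding most.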