Pig >> mail # user >> How to load a subset of an HBase table (timestamp based) ?


Vincent Barat 2011-07-28, 10:18
Re: How to load a subset of an HBase table (timestamp based) ?
You can instruct HBaseStorage to load only a subset of the rows, selected by
row key, using its "-gt" and "-lt" options, documented here [1].
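
For example, here is a minimal sketch of a key-range load. The table name,
column family, column names and key bounds are made up for illustration, and
I'm assuming the two-argument form of HBaseStorage (column list, then option
string):

```pig
-- Hypothetical table 'events' with column family 'cf'; adjust to your schema.
-- '-loadKey true' returns the row key as the first field.
-- '-gt' / '-lt' keep only rows whose key lies strictly between the two bounds.
events = LOAD 'hbase://events'
         USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
             'cf:type cf:payload',
             '-loadKey true -gt user_1000 -lt user_2000')
         AS (rowkey:chararray, type:chararray, payload:chararray);

DUMP events;
```

As far as I know, the bounds are compared against the row key bytes in
lexicographic order, so keys that encode numbers should be zero-padded if you
want the range to behave numerically.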

I don't believe querying by timestamp is currently supported in Pig, based
on the comments on [2].  A separate JIRA issue has been created for it [3].

Norbert

[1] http://ofps.oreilly.com/titles/9781449302641/community.html#hbase_options_table
[2] https://issues.apache.org/jira/browse/PIG-1782
[3] https://issues.apache.org/jira/browse/PIG-1832

On Thu, Jul 28, 2011 at 6:18 AM, Vincent Barat <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'd like to make Pig load only a subset of an HBase table, based on the
> timestamp of the records, or on the key of the rows.
>
> As an example, I'd like to load only records that have a timestamp > N, or
> a key > "something".
>
> I know that HBase can handle scanners that are highly optimized to perform
> this kind of thing, and it would greatly reduce the time needed to load my
> data.
>
> Is there any way to do this?
> If not, is it planned to be added to the HBase loader?
> If not, is it technically possible to do it?
> If yes, can I contribute and propose a patch for that?
>
> Thanks a lot!
>
Vincent Barat 2011-07-28, 12:53
Norbert Burger 2011-07-28, 13:00
Bill Graham 2011-07-28, 17:26