Vincent Barat 2011-07-28, 10:18
Norbert Burger 2011-07-28, 11:19
Vincent Barat 2011-07-28, 12:53
Re: How to load a subset of an HBase table (timestamp based)?
Norbert Burger 2011-07-28, 13:00
 is titled with respect to storage, but if you read through the comments
of , Dmitriy mentions that it'll also include querying.
On Thu, Jul 28, 2011 at 8:53 AM, Vincent Barat <[EMAIL PROTECTED]> wrote:
> Thanks for the input,  is more related to timestamp storage, anyway I
> added my 2 cents to the issue concerning loading by timestamp.
> On 28/07/11 13:19, Norbert Burger wrote:
>> You can instruct HBaseStorage to load a subset of the rows using the "-gt"
>> and "-lt" options to HBaseStorage, documented here .
>> I don't believe querying by timestamp is currently supported in Pig, based
>> on the comments to . There is a standalone JIRA that's been created
>>  https://issues.apache.org/jira/browse/PIG-1782
>>  https://issues.apache.org/jira/browse/PIG-1832
>> On Thu, Jul 28, 2011 at 6:18 AM, Vincent Barat <[EMAIL PROTECTED]> wrote:
>>> I'd like to make PIG load only a subset of an HBase table, based on the
>>> timestamp of the records, or on the key of the rows.
>>> As an example, I'd like to load only records that have a timestamp > N, or
>>> a key > "something".
>>> I know that HBase can handle scanners that are highly optimized for
>>> this kind of thing, and it would greatly reduce the time needed to load the data.
>>> Is there any way to do this?
>>> If not, is it planned to be added to the HBase loader?
>>> If not, is it technically possible to do it?
>>> If yes, can I contribute and propose a patch for that?
>>> Thanks a lot!
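
For readers following the thread, the row-key range approach mentioned above (the "-gt" and "-lt" options to HBaseStorage) might look roughly like the sketch below. The table name, column family, column names, and key bounds are all hypothetical placeholders, and the snippet assumes a Pig release whose HBaseStorage accepts an options string as its second argument. It only restricts by row key; as discussed in the thread, it does not filter by timestamp.

```pig
-- Sketch only: 'mytable', 'cf:col1', 'cf:col2' and the key bounds are made-up names.
-- -loadKey true returns the row key as the first field of each tuple.
-- -gt / -lt keep only rows whose key is strictly greater than / less than the bound
-- (keys are compared as bytes, so the ordering is lexicographic).
raw = LOAD 'hbase://mytable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'cf:col1 cf:col2',
          '-loadKey true -gt key_0100 -lt key_0200')
      AS (rowkey:chararray, col1:chararray, col2:chararray);

DUMP raw;
```

Because the bounds are passed to the loader rather than applied in a later FILTER, the restriction can be applied when the table is scanned instead of after every row has been loaded. If the timestamp is encoded in the row key, the same options can approximate a time-based subset.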
Bill Graham 2011-07-28, 17:26