Norbert Burger 2011-08-15, 16:19
Bill Graham 2011-08-15, 16:37
Re: push down filters for HbaseStorage
Norbert Burger 2011-08-15, 17:20
Bill -- thanks for your quick response. I just tried to put together a
debug log for the -gte case to provide more info, and realized that it WAS
working as advertised (map tasks created only for overlapping regions).
Sorry for the false alarm.
Out of curiosity, is there a JIRA option to track the FILTER version of
this? PIG-1205 seems to be an umbrella ticket for all the changes.
On Mon, Aug 15, 2011 at 12:37 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I don't think the predicate push-down you're showing is currently
> supported, but the -gte param in the constructor definitely is (see
> HBaseTableInputFormat and PIG-1205). If that's not working, then it's a
> bug. Is there anything helpful in the logs?
> On Mon, Aug 15, 2011 at 9:19 AM, Norbert Burger <[EMAIL PROTECTED]> wrote:
> > Hi folks,
> > We have a ~35 GB Hbase table that's split across several hundred regions.
> > I'm using the Pig version bundled with CDH3u1, which is 0.8.1 plus a few
> > patches. In particular, it includes PIG-1680.
> > With the push down filters from PIG-1680, my thought was that a LOAD/FILTER
> > combo like the one below would only result in map tasks being created for the
> > regions that overlap the requested key space (e.g., greater than '12344323413').
> > Instead I see a map task being created for every region in the table. Is
> > my assumption off?
> > Fwiw, I see the same results if I use the -gte param to HBaseStorage.
> > Norbert
> > 
> > cvps = LOAD 'hbase://cvps' USING
> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:value','-loadKey')
> > as
> > (rowkey:chararray, datavalue:chararray);
> > A = FILTER cvps BY rowkey > '12344323413';
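
For reference, the -gte variant discussed in this thread might look roughly like the sketch below. It reuses the table, column, and key from the script above. The option string format ('-loadKey -gte <key>' in the second constructor argument) is an assumption based on how -loadKey is passed in that script, so check it against the HBaseStorage bundled with your Pig version.

```pig
-- Sketch only, not a tested script: pass the lower bound to HBaseStorage
-- through -gte so that only regions overlapping the key range are scanned.
cvps = LOAD 'hbase://cvps' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:value', '-loadKey -gte 12344323413')
    AS (rowkey:chararray, datavalue:chararray);

-- -gte is inclusive, so keep the FILTER if the strict '>' comparison matters.
A = FILTER cvps BY rowkey > '12344323413';
```

With this form the key bound is handled by the loader itself (per the thread, via HBaseTableInputFormat), while the FILTER alone is not pushed down in this Pig version.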
Bill Graham 2011-08-15, 18:13