

RE: Hbase Count Aggregate Function

Yes, scan gives the correct (filtered) number of rows, while count returns the total number of rows in the table.

Both use the same filter. I even tried it with the Java API, using the rowCount method:

rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);

I get the total number of rows, not the number of filtered rows.
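
To make the symptom concrete, here is a minimal sketch in plain Java with no HBase dependency. It only models the arithmetic of the two results, using the row counts from the original message. The class `FilterCountSketch` and its helpers are hypothetical stand-ins, not HBase API, and the sketch does not establish why the filter would be dropped on the count path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterCountSketch {

    // Builds the data set described in the thread: 50,000 "cardiac" rows
    // followed by 50,000 "renal" rows. Each string stands in for the value
    // of info:diagnosis in one row (an assumption made for illustration).
    static List<String> sampleDiagnoses() {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) rows.add("cardiac");
        for (int i = 0; i < 50_000; i++) rows.add("renal");
        return rows;
    }

    // Counts the rows accepted by the filter. A null filter accepts every
    // row, which is what a count would return if the filter were ignored.
    static long countRows(List<String> rows, Predicate<String> filter) {
        long n = 0;
        for (String value : rows) {
            if (filter == null || filter.test(value)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> rows = sampleDiagnoses();
        // Plain substring match, loosely analogous to the substring filter
        // used in the shell commands.
        Predicate<String> cardiac = v -> v.contains("cardiac");
        System.out.println("filtered count:   " + countRows(rows, cardiac)); // 50000
        System.out.println("unfiltered count: " + countRows(rows, null));    // 100000
    }
}
```

The filtered count matches what scan reports and the unfiltered count matches what count and rowCount report, which is why it looks as if the filter is not being applied on the count path.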

So, any ideas?

Thanks Ram :)

> Date: Mon, 24 Dec 2012 21:57:54 +0530
> Subject: Re: Hbase Count Aggregate Function
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> So you find that a scan with a filter and a count with the same filter are
> giving you different results?
>
> Regards
> Ram
>
> On Mon, Dec 24, 2012 at 8:33 PM, Dalia Sobhy <[EMAIL PROTECTED]> wrote:
>
> >
> > Dear all,
> >
> > I have 50,000 rows with diagnosis qualifier = "cardiac", and another 50,000
> > rows with "renal".
> >
> > When I type this in the HBase shell,
> >
> > import org.apache.hadoop.hbase.filter.CompareFilter
> > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > import org.apache.hadoop.hbase.filter.SubstringComparator
> > import org.apache.hadoop.hbase.util.Bytes
> >
> > scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> >          Bytes.toBytes('diagnosis'),
> >          CompareFilter::CompareOp.valueOf('EQUAL'),
> >          SubstringComparator.new('cardiac'))}
> >
> > Output = 50,000 rows
> >
> > import org.apache.hadoop.hbase.filter.CompareFilter
> > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > import org.apache.hadoop.hbase.filter.SubstringComparator
> > import org.apache.hadoop.hbase.util.Bytes
> >
> > count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> >          Bytes.toBytes('diagnosis'),
> >          CompareFilter::CompareOp.valueOf('EQUAL'),
> >          SubstringComparator.new('cardiac'))}
> > Output = 100,000 rows
> >
> > I also tried it using the HBase Java API with an AggregationClient instance,
> > after enabling the aggregation coprocessor for the table:
> > rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> >
> > Also, when I add more nodes to measure the performance improvement,
> > the operation takes the same amount of time.
> >
> > So any advice please?
> >
> > I have been stuck in this mess for a couple of weeks.
> >
> > Thanks,