-
Fastest way to read only the keys of a HTable?
Something Something 2011-02-03, 06:01
I want to read only the keys in a table. I tried this...

    try {
        HTable table = new HTable("myTable");
        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("Info"));
        ResultScanner scanner = table.getScanner(scan);
        Result result = scanner.next();
        while (result != null) {
            // & so on...

This was performing fairly well until I added another Family that contains lots of key/value pairs. My understanding was that adding another family wouldn't affect performance of this code because I am explicitly using "Info", but it does.

Anyway, in this particular use case, I only care about the "Key" of the row. I don't need any values from any of the families. What's the best way to do this?

Please let me know. Thanks.
-
Re: Fastest way to read only the keys of a HTable?
Stack 2011-02-03, 06:20
See http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html

St.Ack
-
Re: Fastest way to read only the keys of a HTable?
Something Something 2011-02-03, 06:47
Thanks. So I will add this...

    scan.setFilter(new FirstKeyOnlyFilter());

But after I do this...

    Result result = scanner.next();

There's no...

    result.getKey()

- so what method would give me the Key value?
-
Re: Fastest way to read only the keys of a HTable?
Stack 2011-02-03, 06:55
I don't see a getKey on Result. Use
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html#getRow()

Here is how it's used in the shell table.rb class:

    # Count rows in a table
    def count(interval = 1000, caching_rows = 10)
      # We can safely set scanner caching with the first key only filter
      scan = org.apache.hadoop.hbase.client.Scan.new
      scan.cache_blocks = false
      scan.caching = caching_rows
      scan.setFilter(org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter.new)

      # Run the scanner
      scanner = @table.getScanner(scan)
      count = 0
      iter = scanner.iterator

      # Iterate results
      while iter.hasNext
        row = iter.next
        count += 1
        next unless (block_given? && count % interval == 0)
        # Allow command modules to visualize counting process
        yield(count, String.from_java_bytes(row.getRow))
      end

      # Return the counter
      return count
    end

St.Ack
-
Re: Fastest way to read only the keys of a HTable?
Something Something 2011-02-03, 19:27
Hmm.. performance hasn't improved at all. Do you see anything wrong with the following code:

    public List<Partner> getPartners() {
        ArrayList<Partner> partners = new ArrayList<Partner>();

        try {
            HTable table = new HTable("partner");
            Scan scan = new Scan();
            scan.setFilter(new FirstKeyOnlyFilter());
            ResultScanner scanner = table.getScanner(scan);
            Result result = scanner.next();
            while (result != null) {
                Partner partner = new Partner(Bytes.toString(result.getRow()));
                partners.add(partner);
                result = scanner.next();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return partners;
    }

Maybe I shouldn't use more than one "column family" in an HTable - but the BigTable paper recommends that, doesn't it? Please advise, and thanks for your help.
-
RE: Fastest way to read only the keys of a HTable?
Jonathan Gray 2011-02-03, 20:15
If you only need to consider a single column family, use Scan.addFamily() on your scanner. Then there will be no impact of the other column families.
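Putting the two suggestions so far together (restrict the scan to one family, and return only the first KeyValue of each row), a minimal sketch could look like the following. It assumes the 0.90-era client API used in this thread, where an HTable is constructed directly from a table name (newer clients obtain tables from a Connection instead), and it reuses the "partner" table and "Info" family named in the earlier messages. The class name PartnerKeys and the choice to return plain strings rather than the poster's Partner objects are illustrative assumptions. Running it needs the HBase client jar and a reachable cluster.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper class; not part of the original thread.
public class PartnerKeys {

    // Returns the row keys of the "partner" table as strings.
    public static List<String> readKeys() throws IOException {
        List<String> keys = new ArrayList<String>();
        HTable table = new HTable("partner");
        try {
            Scan scan = new Scan();
            // Only touch the "Info" family, so the large family is never read.
            scan.addFamily(Bytes.toBytes("Info"));
            // Ship only the first KeyValue of each row back to the client.
            scan.setFilter(new FirstKeyOnlyFilter());

            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result result : scanner) {
                    keys.add(Bytes.toString(result.getRow()));
                }
            } finally {
                scanner.close(); // release the server-side scanner
            }
        } finally {
            table.close();
        }
        return keys;
    }
}
```

Unlike the code posted above, this version closes the scanner and the table in finally blocks, so the server-side scanner is released even if the loop throws.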
-
Re: Fastest way to read only the keys of a HTable?
Something Something 2011-02-03, 21:35
After adding the following line:

    scan.addFamily(Bytes.toBytes("Info"));

performance improved dramatically (Thank you both!). But now I want it to perform even faster, if possible -:) To read 43 rows, it's taking 2 seconds. Eventually, the 'partner' table may have over 500 entries. I guess I will try moving the recently added family to a different table. Do you think that might help?

Thanks again.
-
Re: Fastest way to read only the keys of a HTable?
Jean-Daniel Cryans 2011-02-03, 22:17
On the scan, you can setCaching with the number of rows you want to pre-fetch per RPC. Setting it to 2 is already 2x better than the default.

J-D
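To see why this helps even for a tiny table, here is a back-of-the-envelope model. It is plain Java, not HBase code. It assumes each batch of rows costs one round trip, it ignores the extra calls that open and close the scanner, and it takes the default caching to be 1 row, which is what the "2x" remark above implies. The class and method names are made up for illustration.

```java
// Hypothetical model class; not part of HBase.
class ScanRoundTrips {

    // Number of fetch round trips needed to stream `rows` rows
    // when the scanner pre-fetches `caching` rows per RPC.
    public static long fetchRpcs(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("need rows >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 43; // table size mentioned in the thread
        for (int caching : new int[] {1, 2, 50}) {
            System.out.println("caching=" + caching + " -> "
                    + fetchRpcs(rows, caching) + " fetch RPCs");
        }
    }
}
```

For the 43 rows mentioned earlier, the model gives 43, 22 and 1 fetch round trips for caching values of 1, 2 and 50. Real counts are slightly higher, and larger caching values hold more rows in memory per batch, so the value is a trade-off rather than "bigger is always better".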
-
Re: Fastest way to read only the keys of a HTable?
Something Something 2011-02-03, 23:09
Awesome! It's instantaneous now. Thanks a bunch. Any such tricks for code that looks like this...

    Get get = new Get(Bytes.toBytes(code));
    Result result = table.get(get);
    NavigableMap<byte[], byte[]> map = result.getFamilyMap(Bytes.toBytes("Keys"));
    if (map != null) {
        for (Map.Entry<byte[], byte[]> entry : map.entrySet()) {
            String key = Bytes.toString(entry.getValue());
            Get get1 = new Get(Bytes.toBytes(key));
            Result imp = table2.get(get1);
            // Do something with the result...
        }
    }

Basically, I am reading the first table by a key (code). The "Keys" family contains keys of some other table, so I get each key from that family and retrieve the row from the other table.

Thanks again.
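The archived thread ends without a reply to this last question. One option the HBase client offers for this pattern is to collect the lookups and send them as a batch with HTable.get(List<Get>), so the client issues one batched call instead of one call per key. The following is a minimal sketch under stated assumptions, not an answer from the thread: it assumes the 0.90 client API, reuses the "Keys" family from the snippet above, and assumes the stored values can be used as row keys as-is (the original round-trips them through a String). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper class; not part of the original thread.
public class LinkedRowReader {

    // Reads the row `code` from `table`, then fetches every row of `table2`
    // whose key is stored as a value in the "Keys" family, in one batch.
    public static Result[] readLinkedRows(HTable table, HTable table2, String code)
            throws IOException {
        Get get = new Get(Bytes.toBytes(code));
        get.addFamily(Bytes.toBytes("Keys")); // only this family is needed
        Result result = table.get(get);

        NavigableMap<byte[], byte[]> map = result.getFamilyMap(Bytes.toBytes("Keys"));
        if (map == null || map.isEmpty()) {
            return new Result[0];
        }

        // One Get per referenced row key, sent together.
        List<Get> gets = new ArrayList<Get>();
        for (byte[] rowKey : map.values()) {
            gets.add(new Get(rowKey));
        }
        return table2.get(gets);
    }
}
```

Whether this is noticeably faster depends on how many keys each row references and how they are spread across region servers, so it is worth measuring against the loop above before adopting it.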