HBase >> mail # dev >> What is the best way to get the total row count?


Re: What is the best way to get the total row count?
Count the number of rows in a table. This operation may take a LONG
           time (Run '$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount' to run a
           counting mapreduce job). Current count is shown every 1000 rows by
           default. Count interval may be optionally specified. Examples:

           hbase> count 't1'
           hbase> count 't1', 100000
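
If an estimate is enough, the approach J-D suggests in the quoted reply below (summing the per-region KV counts from the region server web UI) can be sketched roughly as follows. This is only an illustration: the region names and counts are made-up sample data, `estimate_row_count` is a hypothetical helper, and the average number of KVs per row is an assumption you would have to supply for your own schema. The reported KV counts may also include old versions and not-yet-compacted deletes, so treat the result as a rough upper-end estimate rather than an exact count.

```python
# Sketch only: the region names and KV counts are invented sample values,
# standing in for numbers copied by hand from the region server web UI.

def estimate_row_count(kvs_per_region, avg_kvs_per_row):
    """Estimate the table's row count from per-region KeyValue counts.

    kvs_per_region:  dict mapping a region name to its reported KV count.
    avg_kvs_per_row: assumed average number of KVs (cells) stored per row.
    """
    if avg_kvs_per_row <= 0:
        raise ValueError("avg_kvs_per_row must be positive")
    total_kvs = sum(kvs_per_region.values())
    return round(total_kvs / avg_kvs_per_row)


# Hypothetical counts for three regions of table 't1'.
sample_regions = {
    "t1,,1": 1_200_000,
    "t1,row5000,2": 900_000,
    "t1,row9000,3": 1_500_000,
}

# Assuming roughly 3 KVs per row, this prints 1200000.
print(estimate_row_count(sample_regions, avg_kvs_per_row=3))
```

If every row holds exactly one KV, the sum itself is the estimate; otherwise the quality of the result depends entirely on how well the assumed average matches the data.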
On Mon, Jul 9, 2012 at 11:01 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> If you need the exact count the best way is still RowCounter, maybe
> set a bigger scanner caching?
>
> Another option that works if you only need an estimate is using the
> reported number of KVs per region and then summing them up. Look at
> any of your region servers' web ui and on the right you'll see the
> count per region.
>
> J-D
>
> On Mon, Jul 9, 2012 at 3:41 AM, Gopinathan A <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> >
> >
> >
> > What is the best way to get the total row count?
> >
> >
> >
> > I tried the following:
> >
> > a>     count 'tablename' at the shell prompt: helpful only for a very small
> > number of records.
> >
> > b>     Running the RowCounter job: it took almost 8 hours to get the row count
> > of 2 TB of data on a 3-node cluster (16-core system, 48 GB RAM).
> >
> > c>   Using AggregationClient: disk I/O is very high (system wait is 65-70%,
> > load factor is almost 110), which makes the server unresponsive and makes
> > the clients go down (due to RPCTimeOut exceptions).
> >
> > Thanks & Regards,
> >
> > Gopinathan A

--

Shashwat Shriparv