Andreas Reiter 2011-06-06, 08:48
Re: full table scan
How many regions does your table have?
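
The region count matters because TableInputFormat creates one input split, and therefore one map task, per region. A rough sketch of the consequence follows. It is an idealized model, not a measurement: the class and method names are hypothetical, the 6 map slots (3 tasktrackers x 2 slots) are an assumption, rows are assumed to be spread evenly across regions, and job startup overhead is ignored.

```java
public class RegionParallelism {
    // TableInputFormat yields one input split per region, so one map task per region.
    static int mapTasks(int regions) {
        return regions;
    }

    // Idealized wall-clock estimate: tasks run in waves across the available map slots.
    // Assumes rows are evenly distributed over regions and ignores job startup cost.
    static double estimateSeconds(long rows, int regions, int mapSlots, double secondsPerRow) {
        int tasks = mapTasks(regions);
        int concurrent = Math.min(tasks, mapSlots);
        int waves = (tasks + concurrent - 1) / concurrent; // ceiling division
        double rowsPerTask = (double) rows / tasks;
        return waves * rowsPerTask * secondsPerRow;
    }

    public static void main(String[] args) {
        long rows = 300_000;                 // table size from the original post
        double secondsPerRow = 60.0 / rows;  // derived from the reported 60 s single-threaded scan
        int mapSlots = 6;                    // assumption: 3 tasktrackers x 2 map slots each
        for (int regions : new int[] {1, 3, 6, 12}) {
            System.out.printf("regions=%d -> ~%.0f s%n",
                    regions, estimateSeconds(rows, regions, mapSlots, secondsPerRow));
        }
    }
}
```

Under this model a table with a single region gives a single map task, so the MapReduce scan takes about as long as the single-threaded client scan, which matches the behaviour described below. The estimate only drops once the table has several regions to split the work across.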

On Mon, Jun 6, 2011 at 4:48 AM, Andreas Reiter <[EMAIL PROTECTED]> wrote:
> hello everybody
>
> i'm trying to scan my hbase table for reporting purposes
> the cluster has 4 servers:
>  - server1: namenode, secondary namenode, jobtracker, hbase master, zookeeper1
>  - server2: datanode, tasktracker, hbase regionserver, zookeeper2
>  - server3: datanode, tasktracker, hbase regionserver, zookeeper3
>  - server4: datanode, tasktracker, hbase regionserver
> everything seems to work properly
> versions:
>  - hadoop-0.20.2-CDH3B4
>  - hbase-0.90.1-CDH3B4
>  - zookeeper-3.3.2-CDH3B4
>
>
> at the moment our hbase table has 300000 entries
>
> if i do a table scan via the hbase api (at the moment without a filter):
> ResultScanner scanner = table.getScanner(...);
>
> it takes about 60 seconds to process, which is actually okay, because all
> records are processed sequentially by only one thread
> BUT it takes approximately the same time if i do the scan in a Map&Reduce job
> using TableInputFormat
>
> i'm definitely doing something wrong, because the processing time goes
> up in direct proportion to the number of rows.
> in my understanding, the big advantage of hadoop/hbase is that huge numbers
> of entries can be processed in parallel and very fast
>
> 300k entries are not much, we are expecting this number to be added hourly to
> our cluster, but the processing time is increasing, which is actually not
> acceptable
>
> does anyone have an idea what i'm doing wrong?
>
> best regards
> andre
>
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Andre Reiter 2011-06-06, 21:27
Doug Meil 2011-06-06, 21:30
Andre Reiter 2011-06-06, 22:07
Ted Yu 2011-06-06, 22:20
Christopher Tarnas 2011-06-06, 14:59
Himanshu Vashishtha 2011-06-06, 19:41
Andre Reiter 2011-06-07, 08:08
Stack 2011-06-07, 17:28
Andre Reiter 2011-06-08, 04:43
Jean-Daniel Cryans 2011-06-10, 18:46
Andre Reiter 2011-06-11, 08:36
Stack 2011-06-11, 16:41
Ted Dunning 2011-06-12, 09:31
Stack 2011-06-12, 19:07
Andre Reiter 2011-06-21, 05:13
Stack 2011-06-21, 05:28
Andre Reiter 2011-06-21, 07:02
Stack 2011-06-21, 15:01