HBase >> mail # user >> Loading data, hbase slower than Hive?


Austin Chungath 2013-01-17, 16:44
Re: Loading data, hbase slower than Hive?
The writes take longer in HBase.

Just how much longer may depend on how well you tuned HBase.

Now, having said that... suppose you want to find a single record in either HBase or Hive.
Which do you think will be faster? ;-)
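
To make that point concrete: HBase keeps rows sorted by row key, so a single-row read can seek close to the row instead of reading everything, whereas a Hive query over plain files generally has to scan the data (or at least one partition). Below is a toy Python sketch of that difference. It is not HBase or Hive code; the key format, the data, and the function names are made up purely for illustration.

```python
def scan_lookup(records, key):
    """Linear scan: check rows one by one until the key turns up."""
    steps = 0
    for k, v in records:
        steps += 1
        if k == key:
            return v, steps
    return None, steps


def sorted_lookup(keys, values, key):
    """Binary search over keys that are already sorted."""
    lo, hi, steps = 0, len(keys), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(keys) and keys[lo] == key:
        return values[lo], steps
    return None, steps


# Hypothetical data set: 100,000 zero-padded row keys, so that
# lexicographic order matches numeric order.
keys = [f"row{i:07d}" for i in range(100_000)]
values = [i * 2 for i in range(100_000)]
records = list(zip(keys, values))

target = keys[-1]  # worst case for the linear scan
scan_value, scan_steps = scan_lookup(records, target)
sorted_value, sorted_steps = sorted_lookup(keys, values, target)

print(f"scan:   value={scan_value}, rows examined={scan_steps}")
print(f"sorted: value={sorted_value}, comparisons={sorted_steps}")
```

Both lookups return the same value, but the scan touches every row while the sorted lookup needs only a handful of comparisons. Keeping data sorted and indexed like this is part of what the slower write path pays for.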
On Jan 17, 2013, at 10:44 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:

> Hi,
> Problem: Hive took 6 mins to load a data set; HBase took 1 hr 14 mins.
> It's a 20 GB data set, approx. 230 million records. The data is in HDFS, as a
> single text file. The cluster is 11 nodes, 8 cores.
>
> I loaded this into Hive, partitioned by date, bucketed into 32 and sorted.
> Time taken is 6 mins.
>
> I loaded the same data into HBase, in the same cluster, by writing a
> MapReduce job. It took 1 hr 14 mins. The cluster wasn't running anything else,
> and assuming that the code that I wrote is good enough, what is it that
> makes HBase slower than Hive in loading the data?
>
> Thanks,
> Austin
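
For a sense of scale, here is a rough back-of-the-envelope calculation in Python using only the figures from the question. It assumes the record count and timings are exact, that "20 GB" means binary gigabytes, and that the load was spread evenly over all 11 nodes; none of that is confirmed in the thread, so treat the results as ballpark numbers.

```python
# Figures taken from the question above (all approximate).
RECORDS = 230_000_000        # approx. 230 million records
DATA_BYTES = 20 * 1024**3    # 20 GB, assuming binary gigabytes
HIVE_SECONDS = 6 * 60        # 6 mins
HBASE_SECONDS = 74 * 60      # 1 hr 14 mins
NODES = 11                   # assumes every node took part in the load


def records_per_second(records, seconds):
    """Average load throughput over the whole run."""
    return records / seconds


hive_rps = records_per_second(RECORDS, HIVE_SECONDS)
hbase_rps = records_per_second(RECORDS, HBASE_SECONDS)
slowdown = HBASE_SECONDS / HIVE_SECONDS
hbase_rps_per_node = hbase_rps / NODES
avg_record_bytes = DATA_BYTES / RECORDS

print(f"Hive:  {hive_rps:,.0f} records/s")
print(f"HBase: {hbase_rps:,.0f} records/s ({hbase_rps_per_node:,.0f} per node)")
print(f"HBase load took {slowdown:.1f}x as long as the Hive load")
print(f"Average record size: {avg_record_bytes:.0f} bytes")
```

That works out to roughly 640,000 records/s for Hive against roughly 52,000 records/s for HBase, a gap of about 12x, with records averaging a bit under 100 bytes each.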
Anoop John 2013-01-17, 17:00
ramkrishna vasudevan 2013-01-17, 17:09
Mohammad Tariq 2013-01-17, 17:46
praveenesh kumar 2013-01-18, 17:57
Doug Meil 2013-01-18, 18:00
Asaf Mesika 2013-01-19, 19:50
Mohammad Tariq 2013-01-19, 21:12
Doug Meil 2013-01-20, 15:13
Vikas Jadhav 2013-01-20, 18:04
Austin Chungath 2013-01-21, 05:45
Anoop Sam John 2013-01-21, 05:54
Austin Chungath 2013-01-21, 06:16
Mohammad Tariq 2013-01-21, 06:31
Anoop Sam John 2013-01-21, 06:36
Mohammad Tariq 2013-01-21, 06:39