HBase >> mail # user >> Loading data, hbase slower than Hive?


Austin Chungath 2013-01-17, 16:44
Michael Segel 2013-01-17, 16:48
Anoop John 2013-01-17, 17:00
Re: Loading data, hbase slower than Hive?
Hive is geared more toward batch processing, while HBase is geared more toward real-time data access.

Regards
Ram

On Thu, Jan 17, 2013 at 10:30 PM, Anoop John <[EMAIL PROTECTED]> wrote:

> In the case of Hive, data insertion means placing the file under the table
> path in HDFS. HBase needs to read the data and convert it into its own
> format (HFiles). MR does this work, so it is clear that HBase will be
> slower. :)  As Michael said the read operation...
>
>
>
> -Anoop-
>
> On Thu, Jan 17, 2013 at 10:14 PM, Austin Chungath <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > Problem: Hive took 6 mins to load a data set; HBase took 1 hr 14 mins.
> > It's a 20 GB data set, approx. 230 million records. The data is in HDFS,
> > in a single text file. The cluster is 11 nodes, 8 cores.
> >
> > I loaded this into Hive, partitioned by date, bucketed into 32, and
> > sorted. Time taken was 6 mins.
> >
> > I loaded the same data into HBase, on the same cluster, by writing
> > MapReduce code. It took 1 hr 14 mins. The cluster wasn't running anything
> > else, and assuming that the code I wrote is good enough, what is it that
> > makes HBase slower than Hive in loading the data?
> >
> > Thanks,
> > Austin
> >
>
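
To put the figures quoted above in perspective, here is a rough back-of-the-envelope calculation of the load throughput in each case. It uses only the numbers from the original question (about 230 million records, about 20 GB, 6 mins vs. 1 hr 14 mins); the constant and function names are made up for illustration, and 1 GB is assumed to be 1024 MB.

```python
# Figures quoted in the thread (all approximate):
# ~230 million records, ~20 GB, Hive load 6 min, HBase MapReduce load 1 hr 14 min.
RECORDS = 230_000_000
SIZE_GB = 20
HIVE_SECONDS = 6 * 60
HBASE_SECONDS = (60 + 14) * 60


def throughput(seconds):
    """Return (records per second, MB per second) for a load of the given duration."""
    records_per_sec = RECORDS / seconds
    mb_per_sec = SIZE_GB * 1024 / seconds  # assumption: 1 GB = 1024 MB
    return records_per_sec, mb_per_sec


hive_rps, hive_mbps = throughput(HIVE_SECONDS)
hbase_rps, hbase_mbps = throughput(HBASE_SECONDS)
slowdown = HBASE_SECONDS / HIVE_SECONDS  # how many times longer the HBase load took

print(f"Hive : {hive_rps:,.0f} records/s, {hive_mbps:.1f} MB/s")
print(f"HBase: {hbase_rps:,.0f} records/s, {hbase_mbps:.1f} MB/s")
print(f"HBase load took {slowdown:.1f}x as long")
```

With these inputs the Hive load works out to roughly 639,000 records/s and the HBase load to roughly 52,000 records/s, so the HBase load took about 12.3 times as long. These are cluster-wide averages and say nothing about where the time went.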
Mohammad Tariq 2013-01-17, 17:46
praveenesh kumar 2013-01-18, 17:57
Doug Meil 2013-01-18, 18:00
Asaf Mesika 2013-01-19, 19:50
Mohammad Tariq 2013-01-19, 21:12
Doug Meil 2013-01-20, 15:13
Vikas Jadhav 2013-01-20, 18:04
Austin Chungath 2013-01-21, 05:45
Anoop Sam John 2013-01-21, 05:54
Austin Chungath 2013-01-21, 06:16
Mohammad Tariq 2013-01-21, 06:31
Anoop Sam John 2013-01-21, 06:36
Mohammad Tariq 2013-01-21, 06:39