HBase >> mail # user >> Hbase scans taking a lot of time

Vibhav Mundra 2013-01-25, 09:10
Luke Lu 2013-01-25, 17:31
Vibhav Mundra 2013-01-25, 17:59
Adrien Mogenet 2013-01-25, 18:04
Jean-Marc Spaggiari 2013-01-25, 18:06
Vibhav Mundra 2013-01-25, 18:14
Jean-Marc Spaggiari 2013-01-25, 18:23
lars hofhansl 2013-01-25, 22:00
Re: Hbase scans taking a lot of time
Sorry, I meant scan caching (not batching).

 From: lars hofhansl <[EMAIL PROTECTED]>
Sent: Friday, January 25, 2013 2:00 PM
Subject: Re: Hbase scans taking a lot of time
Enable scan batching in Hive.
You're probably performing 300 million RPC requests, i.e. you're mostly measuring network latency.

-- Lars
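
To illustrate the point about RPC counts, here is a rough back-of-the-envelope sketch. It assumes each scanner round trip returns up to `caching` rows and ignores per-region scanner open/close calls; the function name `scan_rpc_count` and the sample caching values are made up for illustration. In the HBase Java client the corresponding knob is `Scan.setCaching(int)`; how Hive passes that setting through depends on the Hive version and is not covered here.

```python
def scan_rpc_count(total_rows: int, caching: int) -> int:
    """Approximate number of scanner round trips for a full table scan.

    Assumption: every round trip returns up to `caching` rows, and the
    extra calls to open/close a scanner per region are ignored.
    """
    if caching < 1:
        raise ValueError("caching must be >= 1")
    # Integer ceiling division: rows that do not fill a batch still cost a trip.
    return (total_rows + caching - 1) // caching


TOTAL_ROWS = 300_000_000  # row count from the original post

# Illustrative caching values only; not tuned recommendations.
for caching in (1, 100, 1_000, 10_000):
    rpcs = scan_rpc_count(TOTAL_ROWS, caching)
    print(f"caching={caching:>6}: ~{rpcs:,} round trips")
```

With one row per round trip the scan needs about 300 million calls; fetching 1,000 rows per call cuts that to about 300,000. Larger values trade client and region server memory for fewer round trips.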

From: Vibhav Mundra <[EMAIL PROTECTED]>
Sent: Friday, January 25, 2013 1:10 AM
Subject: Hbase scans taking a lot of time

I am facing a very strange problem with HBase.

This is what I did:
a) Created a table using pre-partitioned splits.
b) The column families are compressed with LZO.
c) With this configuration I am able to populate 2 million rows per
minute into HBase.
d) I created a table with 300 million-odd rows, which took roughly 3
hours to populate; the data size is 25 GB.

e) But when I query the data, the performance is very bad.
   Basically this is what I am seeing: high CPU, no disk I/O, and network
I/O at a rate of 6~7 MB/s.
Because of this, scanning the entries of the table using Hive takes
around 24 hours. Any idea how to debug this?
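
For scale, the figures in this post can be turned into comparable rates. This is only arithmetic on the numbers quoted above (300 million rows, 2 million rows per minute written, about 24 hours to scan); the variable names are made up for illustration, and the sketch says nothing about where the time is actually spent.

```python
ROWS = 300_000_000               # rows in the table, from the post
WRITE_ROWS_PER_MIN = 2_000_000   # reported load rate
SCAN_HOURS = 24                  # reported full-scan time via Hive

# Convert both figures to rows per second so they can be compared.
write_rate = WRITE_ROWS_PER_MIN / 60
scan_rate = ROWS / (SCAN_HOURS * 3600)

# How much slower the scan is than the load, and the time spent per scanned row.
slowdown = write_rate / scan_rate
ms_per_row = 1000 / scan_rate

# Time the load would take at the reported rate, as a sanity check.
load_hours = ROWS / WRITE_ROWS_PER_MIN / 60

print(f"write rate : {write_rate:,.0f} rows/s")
print(f"scan rate  : {scan_rate:,.0f} rows/s")
print(f"slowdown   : {slowdown:.1f}x")
print(f"per row    : {ms_per_row:.3f} ms")
print(f"load time  : {load_hours:.1f} h")
```

By these numbers the scan reads rows roughly ten times more slowly than they were written, which is unusual for a sequential read and is consistent with some per-row overhead dominating the scan.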
Alok Kumar 2013-01-26, 06:07
Shashwat Shriparv 2013-01-25, 19:13
Vibhav Mundra 2013-01-25, 19:25
Shashwat Shriparv 2013-01-25, 19:31
Vibhav Mundra 2013-01-25, 19:37