HBase >> mail # user >> scan is slower after bulk load


Amit Sela 2012-11-12, 14:39
Marcos Ortiz 2012-11-12, 14:44
Michael Segel 2012-11-12, 15:45
Mohammad Tariq 2012-11-12, 17:04
Bijieshan 2012-11-13, 00:29
Re: scan is slower after bulk load
Did you end up finding the answer?
How fast is this method of insertion relative to a simple insert of a List<Put>?
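
One way to put a number on that comparison, or on the roughly 20% scan slowdown reported in the original message, is to time each operation several times and compare medians. The following is only a minimal sketch under stated assumptions: the class `OperationTiming`, its methods, and the placeholder workload are hypothetical and not part of HBase. The real List<Put> insert, bulk load, or scan would be wrapped in the `Runnable`.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HBase: times an arbitrary operation.
final class OperationTiming {

    /** Runs the task `runs` times and returns the median wall-clock time in nanoseconds. */
    static long medianNanos(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Relative slowdown of `after` versus `before`; 0.20 means 20% slower. */
    static double slowdown(long beforeNanos, long afterNanos) {
        return (afterNanos - beforeNanos) / (double) beforeNanos;
    }

    /** Placeholder workload standing in for a real scan or insert (assumption). */
    static long busyWork(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += i % 7;
        }
        return acc;
    }

    public static void main(String[] args) {
        // Replace these placeholders with the operations being compared.
        Runnable before = () -> busyWork(200_000);
        Runnable after = () -> busyWork(240_000);
        long b = medianNanos(before, 11);
        long a = medianNanos(after, 11);
        System.out.printf("relative slowdown: %.1f%%%n", 100 * slowdown(b, a));
    }
}
```

Using the median rather than a single run reduces the influence of JVM warm-up and of caches that happen to be hot or cold on one particular pass.
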
On 13 Nov 2012, at 02:29, Bijieshan <[EMAIL PROTECTED]> wrote:

> I think one possible reason is block caching. Have you turned the block caching off during scanning?
>
> Regards,
>  Jieshan
> ________________________________________
> From: Mohammad Tariq [[EMAIL PROTECTED]]
> Sent: Tuesday, November 13, 2012 1:04
> To: [EMAIL PROTECTED]
> Subject: Re: scan is slower after bulk load
>
> Maybe because the bulk load writes to the same region, thus putting the entire
> load on a single region server.
>
> Regards,
>    Mohammad Tariq
>
>
>
> On Mon, Nov 12, 2012 at 9:15 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
>> Just a guess... have you done any compactions on the table post bulk load?
>>
>> On Nov 12, 2012, at 8:44 AM, Marcos Ortiz <[EMAIL PROTECTED]> wrote:
>>
>>> Regards, Amit.
>>> Did you tune the RegionServer where that data range is hosted?
>>> Why do you say that scans are slower after a bulk load?
>>> Did you test it before bulk load?
>>>
>>> HBase version?
>>>
>>> On 11/12/2012 09:39 AM, Amit Sela wrote:
>>>> Hi all,
>>>>
>>>> Does anyone have any idea why scanning over a specific range in a table is about
>>>> 20% slower if that data (that specific range) was just inserted into HBase
>>>> using bulk load?
>>>>
>>>> I do the bulk load programmatically with LoadIncrementalHFiles.
>>>>
>>>> Thanks.
>>>>
>>>
>>> --
>>>
>>> Marcos Luis Ortíz Valmaseda
>>> about.me/marcosortiz <http://about.me/marcosortiz>
>>> @marcosluis2186 <http://twitter.com/marcosluis2186>
>>>
>>>
>>>
>>> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
>>> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>>>
>>> http://www.uci.cu
>>> http://www.facebook.com/universidad.uci
>>> http://www.flickr.com/photos/universidad_uci
>>
>>
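
The single-region hypothesis raised in the thread can be checked by seeing which regions the bulk-loaded row keys fall into. A rough sketch, assuming the table's region start keys are already known and that row keys are plain ASCII strings (HBase itself compares raw bytes); `RegionSpread` and `countPerRegion` are hypothetical names, not HBase API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase: maps row keys to regions by start key.
final class RegionSpread {

    /**
     * Counts how many row keys land in each region. A region is identified by
     * its start key; the first region always has the empty start key "".
     */
    static Map<String, Integer> countPerRegion(List<String> regionStartKeys,
                                               List<String> rowKeys) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        counts.put("", 0); // first region covers everything below the next start key
        for (String start : regionStartKeys) {
            counts.put(start, 0);
        }
        for (String row : rowKeys) {
            // The owning region is the one with the greatest start key <= row key.
            String region = counts.floorKey(row);
            counts.merge(region, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative keys only; real start keys would come from the table's regions.
        List<String> starts = Arrays.asList("", "g", "p");
        List<String> rows = Arrays.asList("a1", "a2", "b7", "h3");
        System.out.println(countPerRegion(starts, rows));
    }
}
```

If nearly all keys in the scanned range map to one start key, the range is served by a single region, which would be consistent with the concern about load concentrating on one region server.
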

Amit Sela 2012-11-23, 17:22