HBase >> mail # user >> Enabling caching increasing the time of retrieval


Re: Enabling caching increasing the time of retrieval
What is the data set size? Can you manually flush the tables and then
run the tests?

Also, can you tell us a bit about your configs: the heap allocated to
the region servers and the percentage of it that's dedicated to the
block cache?

Lastly, how are you enabling and disabling the cache?
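
To put the timings quoted below in perspective, here is a minimal Java sketch that computes the relative slowdown for each of the three measurements. The class name RetrievalTimings and the helper slowdownPercent are hypothetical names made up for this illustration; only the millisecond figures come from the original post. It uses no HBase calls, just arithmetic on the reported numbers.

```java
// Hypothetical helper for illustration only; not part of any HBase API.
final class RetrievalTimings {

    // Relative slowdown in percent when going from beforeMs to afterMs.
    static double slowdownPercent(long beforeMs, long afterMs) {
        return (afterMs - beforeMs) * 100.0 / beforeMs;
    }

    public static void main(String[] args) {
        // Timings reported in the original post: {cache disabled, cache enabled}.
        long[][] timingsMs = {
            {677, 713},        // 175347 columns of one row key
            {48806, 57649},    // 99 rows, 14888573 columns
            {96469, 111056},   // 396 rows, 32611576 columns
        };
        for (long[] t : timingsMs) {
            System.out.println(String.format(
                "%d ms -> %d ms: %.1f%% slower",
                t[0], t[1], slowdownPercent(t[0], t[1])));
        }
    }
}
```

The slowdown comes out at roughly 5%, 18% and 15% respectively. Whether a difference of that size is meaningful depends on run-to-run variance, which is why repeating each test after a manual flush would help.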

On Jun 24, 2012, at 11:37 PM, Prakrati Agrawal
<[EMAIL PROTECTED]> wrote:

> No, the DB is on a fully distributed setup.
>
> I wrote the data completely and only then started retrieving it, and the two tests were run one after the other. So I don't think the data could still have been in the memstore.
> Please help me.
>
> Thanks and Regards
> Prakrati
>
>
> -----Original Message-----
> From: Amandeep Khurana [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 25, 2012 11:51 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Enabling caching increasing the time of retrieval
>
> Is this on a standalone instance or do you have a fully distributed setup deployed? Do you have any kind of monitoring in place?
>
> From the numbers you are giving, it looks like the data is on the order of a few tens of MB, assuming this is a single-threaded read. Did you write more data between the first run (with cache disabled) and the second run (with cache enabled)? It is possible that the data was in the memstore and not yet flushed to HFiles when you did the first test. If the memstore was then flushed, the reads in the second run would have had to go to disk, since the cache was not yet warmed up with that data. Subsequent reads of the same rows should be faster in that case.
>
>
> On Sunday, June 24, 2012 at 11:11 PM, Prakrati Agrawal wrote:
>
>> Dear all
>>
>> I am trying to optimize the retrieval code in Java for HBase. The following are the timings without cache enabled:
>> The time taken to get 175347 columns of a row key is 677 ms
>> The time taken to get rows : 99 and columns: 14888573 is 48806 ms
>> The time taken to get all data (rows: 396 and columns: 32611576) is 96469 ms
>>
>> The time taken after caching is enabled (both block caching and setCaching):
>> The time taken to get 175347 columns of a row key is 713 ms
>> The time taken to get rows : 99 and columns: 14888573 is 57649 ms
>> The time taken to get all data (rows: 396 and columns: 32611576) is 111056 ms
>>
>> As you can all see, the time increases after I enable caching. I do not understand what I am doing wrong. Please help me.
>>
>> Thanks and Regards
>> Prakrati
>>
>>
>> ________________________________
>> This email message may contain proprietary, private and confidential information. The information transmitted is intended only for the person(s) or entities to which it is addressed. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited and may be illegal. If you received this in error, please contact the sender and delete the message from your system.
>>
>> Mu Sigma takes all reasonable steps to ensure that its electronic communications are free from viruses. However, given Internet accessibility, the Company cannot accept liability for any virus introduced by this e-mail or any attachment and you are advised to use up-to-date virus checking software.