HBase >> mail # dev >> Regarding memstoreTS in bulkloaded HFiles


Anoop Sam John 2012-05-29, 12:55
Matt Corgan 2012-05-29, 20:27
Anoop Sam John 2012-05-30, 03:57
Anoop Sam John 2012-05-30, 12:50
Stack 2012-05-30, 18:56

RE: Regarding memstoreTS in bulkloaded HFiles
@Stack
> As of now we are not able to set any Block encoder algo as part of bulk loading.
HBASE-6040 will address this issue  :)

>How would we Anoop?  If we create KVs during an upload, the KV
>instance will have a memstorets data member?
Yes Stack. KV instances have a long member, memstoreTS, which defaults to 0L. In the bulk load case this value stays as it is.
When we use HFile V2, the writer includes the memstoreTS in the bytes written as part of the KVs.
As I said in my last mail, this won't use 8 bytes per KV, but 1 byte per KV when the value is 0. We write it as a VLong.
(Initially I thought it would be 8 bytes, which would be a huge waste of space.) Do we need to consider even this one byte? When the number of records and the KVs per record are high, maybe this will take up space...
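
To put rough numbers on the 1 byte vs 8 bytes point, here is a small standalone sketch. It mirrors the size rule of Hadoop's variable-length long encoding as I understand it (one byte for values from -112 to 127, otherwise one length byte plus the minimal data bytes). The class name, method names and the KV count are made up for illustration and are not HBase code.

```java
// Sketch only: mirrors the size rule of Hadoop's variable-length long
// encoding as I understand it. Class and method names are hypothetical.
class VLongSizeSketch {

    // Bytes needed to store v as a VLong: 1 byte for values in [-112, 127],
    // otherwise 1 length byte plus the minimal number of data bytes.
    static int vlongSize(long v) {
        if (v >= -112 && v <= 127) {
            return 1;
        }
        if (v < 0) {
            v = ~v; // negative values are stored as their one's complement
        }
        int dataBits = Long.SIZE - Long.numberOfLeadingZeros(v);
        return (dataBits + 7) / 8 + 1;
    }

    // Total bytes spent on memstoreTS across kvCount KVs that all carry the same value.
    static long memstoreTsOverhead(long kvCount, long memstoreTs) {
        return kvCount * vlongSize(memstoreTs);
    }

    public static void main(String[] args) {
        long kvCount = 1_000_000_000L; // assumed bulk load size, for illustration only
        System.out.println("VLong, memstoreTS=0: " + memstoreTsOverhead(kvCount, 0L) + " bytes");
        System.out.println("fixed 8-byte long:   " + kvCount * Long.BYTES + " bytes");
    }
}
```

So with memstoreTS left at 0 the cost is one byte per KV, which only becomes noticeable when the KV count is very large.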

-Anoop-
________________________________________
From: [EMAIL PROTECTED] [[EMAIL PROTECTED]] on behalf of Stack [[EMAIL PROTECTED]]
Sent: Thursday, May 31, 2012 12:26 AM
To: [EMAIL PROTECTED]
Subject: Re: Regarding memstoreTS in bulkloaded HFiles

On Tue, May 29, 2012 at 5:55 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:
> In HFile V2 we now write the memstore TS into the HFiles. In the bulk load case too, we write a long value as part of every KV. I think the memstore TS has no meaning for bulk loading. Can we avoid this?
>

How would we Anoop?  If we create KVs during an upload, the KV
instance will have a memstorets data member?

> As of now we are not able to set any Block encoder algo as part of bulk loading.

This would be a nice feature.  Some of the encodings are intensive so
doing it offline would be sweet for read-heavy deploys.

> Do we need to think about making the memstoreTS write into the HFile (in version 2) configurable in some way? In case of bulk loading we can turn it OFF. Please correct me if my understanding is wrong.
>

What are you thinking?  This could be a nice addition.
St.Ack