HBase, mail # dev - Regarding memstoreTS in bulkloaded HFiles


Re: Regarding memstoreTS in bulkloaded HFiles
Matt Corgan 2012-05-29, 20:27
Hi Anoop,

I'm working on the Trie encoding you mentioned.  Just to confirm - it does
support encoding the memstore timestamp, and in the case that they are all
0, it will not take up any space.

I think the other DataBlockEncoders also write it to disk.  See
PrefixTrieDataBlockEncoder.afterEncodingKeyValue(..)
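
For anyone not familiar with that hook, here is a minimal, self-contained
sketch (plain Java; not the real DataBlockEncoder/afterEncodingKeyValue code,
and the class and method names below are made up for illustration) of the two
approaches under discussion: appending each KV's memstore timestamp as a
variable-length long, which still costs one byte per KV when the timestamp is
0, versus a per-block "all zero" marker, which drops the per-KV cost entirely
when every timestamp in the block is 0.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Illustrative only -- not the real encoder API. */
public class MemstoreTsEncodingSketch {

  /** Write a non-negative long in a simple varint form (7 bits per byte). */
  static void writeVLong(DataOutputStream out, long value) throws IOException {
    while ((value & ~0x7FL) != 0) {
      out.writeByte((int) ((value & 0x7F) | 0x80));
      value >>>= 7;
    }
    out.writeByte((int) value);
  }

  /** Per-KV approach: every KV carries its own varint-encoded memstoreTS. */
  static byte[] encodePerKv(long[] memstoreTimestamps) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    for (long ts : memstoreTimestamps) {
      // ... the encoded key/value bytes would be written here ...
      writeVLong(out, ts);                 // 1 byte when ts == 0
    }
    return buf.toByteArray();
  }

  /** Per-block approach: a single marker when every timestamp is 0. */
  static byte[] encodePerBlock(long[] memstoreTimestamps) throws IOException {
    boolean allZero = true;
    for (long ts : memstoreTimestamps) {
      if (ts != 0) { allZero = false; break; }
    }
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeBoolean(allZero);             // one byte for the whole block
    if (!allZero) {
      for (long ts : memstoreTimestamps) {
        writeVLong(out, ts);               // fall back to per-KV timestamps
      }
    }
    return buf.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    long[] allZero = new long[] {0, 0, 0, 0};
    System.out.println("per-KV bytes:    " + encodePerKv(allZero).length);    // 4
    System.out.println("per-block bytes: " + encodePerBlock(allZero).length); // 1
  }
}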

As for whether it's ever needed in a bulk load, I unfortunately don't know.
 My guess would be no, or that it's too exotic of a use case to worry
about.  Maybe someone else can confirm.  But, I'd say you might as well
support the option to include it, since it will not take up any space once
encoded.

Matt

On Tue, May 29, 2012 at 5:55 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:

> Hi Devs
>
>            In HFile V2 we have introduced the memstore TS, which now gets
> written to the HFiles. In the case of bulk load also, we are now writing a
> long value as part of every KV. I think in the case of bulk loading the
> memstore TS has no meaning. Can we avoid this?
>
>
>
> As of now we are not able to set any block encoder algorithm as part of
> bulk loading, but I have created HBASE-6040 which solves this. I have
> checked the currently available encoder algorithms, but none of them
> handles the memstoreTS as such. There is an open issue for a new Trie type
> of encoder, and it seems it will handle this kind of scenario: only one
> long value will get stored as the memstoreTS for one block. Still, all of
> this makes it mandatory that some block encoding scheme be used.
>
>
>
> Do we need to think about making the memstoreTS write into the HFile (in
> version 2) configurable in some way? In the case of bulk loading we could
> turn it OFF (a rough sketch of such a switch follows below this mail).
> Please correct me if my understanding is wrong.
>
>
>
> -Anoop-
>
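
For illustration of the configurable switch Anoop describes above, here is a
rough, hypothetical sketch in plain Java. The writer class below and the idea
of driving it from a config key such as "hfile.writer.include.memstore.ts"
are made up for this sketch and are not an existing HBase API; the point is
simply that a bulk-load writer constructed with the flag turned off would not
append any per-KV memstore timestamp at all.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical writer sketch -- not the real HFile V2 writer. */
public class HFileV2WriterSketch {

  private final DataOutputStream out;
  private final boolean includeMemstoreTS;

  HFileV2WriterSketch(DataOutputStream out, boolean includeMemstoreTS) {
    this.out = out;
    this.includeMemstoreTS = includeMemstoreTS;
  }

  /** Append one KV; the memstoreTS is written only when the flag is on. */
  void append(byte[] keyValueBytes, long memstoreTS) throws IOException {
    out.write(keyValueBytes);
    if (includeMemstoreTS) {
      out.writeLong(memstoreTS);   // skipped entirely for bulk-loaded files
    }
  }

  public static void main(String[] args) throws IOException {
    DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
    // A bulk-load path would turn the flag off, e.g. from the hypothetical
    // config key above, so no timestamp bytes are added to any KV.
    HFileV2WriterSketch bulkLoadWriter = new HFileV2WriterSketch(out, false);
    bulkLoadWriter.append(new byte[] {1, 2, 3}, 0L);
  }
}

Presumably a real change would also have to record the choice somewhere in
the file's metadata so readers know whether to expect the extra field after
each KV.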