HBase >> mail # dev >> DISCUSS : HFile V3 proposal for tags in 0.96


Re: DISCUSS : HFile V3 proposal for tags in 0.96
>>Any consideration that the tags are serialized before the memstoreTS
instead of after ?
The reasoning is simple: memstoreTS is optional, and it exists only in the
HFile and not in the KV.  In the current design the tags come after the
Value in the KV structure, so it would be better to apply the same ordering
in HFiles as well.
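
To make the ordering concrete, here is a rough sketch of the layout described above: key, value, then the optional tags, then memstoreTS.  This is only an illustration and not the actual KeyValue or HFile writer code.  The class name, the method names, the 2-byte tags length and the fixed 8-byte memstoreTS are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; names and field widths are assumptions, not HBase code. */
final class TaggedCellLayoutSketch {

    /**
     * Serializes one cell as:
     *   keyLen(int) | valueLen(int) | key | value | [tagsLen(short) | tags] | memstoreTS(long)
     * The bracketed tags section is written only when includeTags is true.
     * Assumes tags.length fits in a short.
     */
    static byte[] serialize(byte[] key, byte[] value, byte[] tags,
                            long memstoreTs, boolean includeTags) {
        int size = 4 + 4 + key.length + value.length + 8;
        if (includeTags) {
            size += 2 + tags.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(key.length);
        buf.putInt(value.length);
        buf.put(key);
        buf.put(value);
        if (includeTags) {
            // Tags sit right after the Value, mirroring the KV structure.
            buf.putShort((short) tags.length);
            buf.put(tags);
        }
        // memstoreTS always comes last.
        buf.putLong(memstoreTs);
        return buf.array();
    }

    /** Reads the tags back; returns an empty array for the tag-less layout. */
    static byte[] readTags(byte[] cell, boolean includeTags) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        int keyLen = buf.getInt();
        int valueLen = buf.getInt();
        buf.position(buf.position() + keyLen + valueLen); // skip key and value
        if (!includeTags) {
            return new byte[0];
        }
        byte[] tags = new byte[buf.getShort()];
        buf.get(tags);
        return tags;
    }

    public static void main(String[] args) {
        byte[] key = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] value = "v".getBytes(StandardCharsets.UTF_8);
        byte[] tags = "acl".getBytes(StandardCharsets.UTF_8);
        System.out.println("without tags: " + serialize(key, value, tags, 7L, false).length + " bytes");
        System.out.println("with tags:    " + serialize(key, value, tags, 7L, true).length + " bytes");
    }
}
```

Because the tags section sits between the Value and memstoreTS, a reader that knows the includeTags flag can skip it or parse it without touching the rest of the cell.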
>>When would PrefixTree be able to handle tags ?
Maybe my statement confused you.  Please see the point on PrefixTreeEncoders
in the previous mail.  I meant that, as per the current design, PrefixKey,
DiffKey and FastDiff extend BufferedDataEncoders, and hence
BufferedDataEncoders are made tag aware.

PrefixTreecodec has been handled separately to make it work with tags.
>> Put in another way, after this feature goes in, would
HFile V3 always be written ?
By default the code will use V2.  A user who needs V3 would have to set
hfile.format.version to 3, which ensures that the system uses V3.
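
As a rough sketch of that selection logic (not the real HBase code path): the property name hfile.format.version comes from the discussion above, while the class, the method names and the use of java.util.Properties in place of the HBase configuration object are assumptions made for the example.

```java
import java.util.Properties;

/** Illustrative sketch only; HBase itself reads this key from its own configuration. */
final class HFileVersionSketch {
    static final String FORMAT_VERSION_KEY = "hfile.format.version";
    static final int DEFAULT_VERSION = 2; // V2 unless the user opts in to V3

    /** Returns the configured format version, falling back to V2 when the key is unset. */
    static int formatVersion(Properties conf) {
        return Integer.parseInt(
            conf.getProperty(FORMAT_VERSION_KEY, String.valueOf(DEFAULT_VERSION)));
    }

    /** V2 never carries tags; V3 may, depending on what the file actually contains. */
    static boolean mayIncludeTags(int version) {
        return version >= 3;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("default version: " + formatVersion(conf));
        conf.setProperty(FORMAT_VERSION_KEY, "3");
        System.out.println("after opt-in:    " + formatVersion(conf));
    }
}
```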

Thanks Ted.

Regards
Ram
On Fri, Jul 19, 2013 at 10:10 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> bq. V3 would now serialize the tags also after the Value part before the
> memstoreTS
>
> Any consideration that the tags are serialized before the memstoreTS
> instead of after ?
>
> bq. The BuffereddataEncoder, being the base class for all encoders other
> than PrefixTree would now be tag aware.
>
> When would PrefixTree be able to handle tags ?
>
> When a new HFile is opened, would user be able to specify that there is no
> tagging involved ? Put in another way, after this feature goes in, would
> HFile V3 always be written ?
>
> Thanks
>
> On Thu, Jul 18, 2013 at 9:29 PM, ramkrishna vasudevan <
> [EMAIL PROTECTED]> wrote:
>
> > The changes/differences that we would introduce in the V3 format are
> > as follows (I will describe them in words under each subcategory).
> >
> > To reduce code duplication we would subclass ReaderV3 and WriterV3 from
> > ReaderV2 and WriterV2 respectively.
> > *HFileBlockFormat*
> > *=============*
> > No change in V2 and V3.
> >
> > *KV serialization*
> > *============*
> > V2 no change
> > V3 would now serialize the tags also after the Value part before the
> > memstoreTS
> >
> > *FixedFileTrailer*
> > *===========*
> > Introduces new information into the trailer which can be used in V3 to
> > make tags optional.  Take the case where the user selects V3 but one CF
> > has no tags.  We would still write the tag bytes while flushing, but
> > during compaction we would use this info to avoid writing tags in the
> > compacted files.  This would mean no impact on read performance after
> > the compaction has completed.
> > The V2 code would also try to get this trailer info, but since it would
> > be null there is no impact on any of the existing code.
> >
> > *WriterV3 and ReaderV3*
> > *=================*
> > Tries to handle the tags based on the metadata from the trailer info.
> > All the APIs like seekTo, next(), getKeyValue() are now able to handle
> > tags based on the flag passed during the construction of the Readers and
> > Writers.  We can be sure that for any instance of V2 the includeTags
> > flag would always be false.
> >
> > *DataBlockEncoders*
> > *==============*
> > Additional arguments are added to the APIs in the interfaces related to
> > HFileDataBlockEncoders, BufferedDataBlockEncoders,
> > HFileDataBlockEncodingContext etc.  Again, for V2 the new APIs would still
> > behave the same way and there would be no impact on V2-based use cases.
> > The BuffereddataEncoder, being the base class for all encoders other than
> > PrefixTree, would now be tag aware.
> >
> > *PrefixTreeEncoders*
> > *==============*
> > Trying to keep changes minimal here, but would ensure that there are no
> > behavioural changes while using PrefixTree with V2.
> >
> > *KeyValue class*
> > *===========*
> > Will include changes to have a Tag class inside this.  APIs to identify
> > tags in a KV would be needed.  Util method changes would also be there.
> >