HBase >> mail # dev >> DISCUSS : HFile V3 proposal for tags in 0.96


Re: DISCUSS : HFile V3 proposal for tags in 0.96
bq. By default code will go with V2.

Good.

Looking forward to the patch.

On Thu, Jul 18, 2013 at 9:57 PM, ramkrishna vasudevan <
[EMAIL PROTECTED]> wrote:

> >> Any consideration that the tags are serialized before the memstoreTS
> instead of after?
> The argument is simple: memstoreTS is optional and appears only in the
> HFile, not in the KV.  In the current design the tags come after the Value
> in the KV structure, so it is better to apply the same order in HFiles.
> >> When would PrefixTree be able to handle tags?
> Maybe my statement confused you.  Please see the point on PrefixTreeEncoders
> in the previous mail.  I meant that, as per the current design, PrefixKey,
> DiffKey and FastDiff extend BufferedDataEncoders, and hence
> BufferedDataEncoders is made tag aware.
>
> The PrefixTree codec has been handled separately to make it work with tags.
> >> Put in another way, after this feature goes in, would
> HFile V3 always be written?
> By default code will go with V2.  A user who needs V3 would set
> hfile.format.version to 3, which ensures that the system uses V3.
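
A minimal Java sketch of the opt-in described above. It uses java.util.Properties as a stand-in for the real Hadoop/HBase configuration object. Only the key name hfile.format.version and the V2 default come from this thread; the class HFileVersionConfig, its helper methods, and the rejection of versions other than 2 and 3 are assumptions made for illustration.

```java
import java.util.Properties;

/** Sketch only: resolves the HFile format version from a configuration. */
final class HFileVersionConfig {
    // Key name as given in the thread.
    static final String FORMAT_VERSION_KEY = "hfile.format.version";
    // The thread states that V2 remains the default.
    static final int DEFAULT_VERSION = 2;

    private HFileVersionConfig() {}

    /** Hypothetical helper: returns 2 unless the user explicitly asks for 3. */
    static int formatVersion(Properties conf) {
        String raw = conf.getProperty(FORMAT_VERSION_KEY);
        if (raw == null) {
            return DEFAULT_VERSION;
        }
        int version = Integer.parseInt(raw.trim());
        // Assumption: only V2 and V3 are accepted in this sketch.
        if (version != 2 && version != 3) {
            throw new IllegalArgumentException("Unsupported HFile version: " + version);
        }
        return version;
    }

    /** Tags can only be written by V3; V2 never includes them. */
    static boolean mayIncludeTags(Properties conf) {
        return formatVersion(conf) == 3;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(formatVersion(conf));   // default, no key set
        conf.setProperty(FORMAT_VERSION_KEY, "3");
        System.out.println(formatVersion(conf));   // explicit opt-in to V3
    }
}
```

With no key set the helper falls back to V2, so existing deployments would see no change unless they opt in.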
>
> Thanks Ted.
>
> Regards
> Ram
>
>
> On Fri, Jul 19, 2013 at 10:10 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > bq. V3 would now serialize the tags also after the Value part before the
> > memstoreTS
> >
> > Any consideration that the tags are serialized before the memstoreTS
> > instead of after ?
> >
> > bq. The BufferedDataEncoder, being the base class for all encoders other
> > than PrefixTree would now be tag aware.
> >
> > When would PrefixTree be able to handle tags ?
> >
> > When a new HFile is opened, would user be able to specify that there is
> > no tagging involved ? Put in another way, after this feature goes in,
> > would HFile V3 always be written ?
> >
> > Thanks
> >
> > On Thu, Jul 18, 2013 at 9:29 PM, ramkrishna vasudevan <
> > [EMAIL PROTECTED]> wrote:
> >
> > > The changes/differences that we would be introducing in the V3 format
> > > are listed below, by subcategory.
> > >
> > > To reduce code duplication we would subclass ReaderV3 and WriterV3 from
> > > ReaderV2 and WriterV2 respectively.
> > >
> > > *HFileBlockFormat*
> > > *=============*
> > > No change between V2 and V3.
> > >
> > > *KV serialization*
> > > *============*
> > > V2: no change.
> > > V3 would now also serialize the tags, after the Value part and before
> > > the memstoreTS.
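
The ordering can be illustrated with a small Java sketch. Only the order (tags after the Value, before the memstoreTS) comes from the proposal. The 4-byte key and value lengths, the 2-byte tags length, the fixed 8-byte memstoreTS, and the names KvLayoutSketch and serialize are assumptions for illustration, not the actual HBase encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch only: a simplified cell layout with optional tags. */
final class KvLayoutSketch {
    private KvLayoutSketch() {}

    /**
     * Layout (big-endian, all field widths are assumptions):
     *   keyLen(4) valueLen(4) key value [tagsLen(2) tags] memstoreTS(8)
     * The bracketed part is written only when includeTags is true (V3).
     */
    static byte[] serialize(byte[] key, byte[] value, byte[] tags,
                            long memstoreTs, boolean includeTags) {
        if (includeTags && tags.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("tags too long for a 2-byte length");
        }
        int size = 4 + 4 + key.length + value.length
                 + (includeTags ? 2 + tags.length : 0)
                 + 8;
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(key.length).putInt(value.length).put(key).put(value);
        if (includeTags) {
            // Tags go right after the Value part ...
            buf.putShort((short) tags.length).put(tags);
        }
        // ... and before the memstoreTS.
        buf.putLong(memstoreTs);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] key = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] value = "v".getBytes(StandardCharsets.UTF_8);
        byte[] tags = {1, 2, 3};
        System.out.println(serialize(key, value, tags, 42L, false).length); // 21
        System.out.println(serialize(key, value, tags, 42L, true).length);  // 26
    }
}
```

In this sketch a V2-style cell and a V3-style cell differ only by the tags length field and the tag bytes, which sit between the value and the memstoreTS.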
> > >
> > > *FixedFileTrailer*
> > > *===========*
> > > Introduces a new piece of information in the trailer, which V3 can use
> > > to make tags optional.  Take the case where the user selects V3 but one
> > > CF has no tags.  We would then write the tag bytes while flushing, but
> > > during compaction we would use this trailer info to avoid writing tags
> > > in the compacted files.  This would mean no impact on read performance
> > > once the compaction has completed.
> > > The V2 code would also try to read this trailer info, but since it is
> > > null there, there is no impact on any of the existing code.
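
A hedged Java sketch of how such a compaction decision could look. The proposal does not say what the new trailer entry contains, so the maxTagsLength field, the TrailerInfo class, and the includeTagsOnCompaction method are all hypothetical. A null value models a V2 file that has no such entry.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: decide whether a compacted file needs tag bytes. */
final class TrailerTagSketch {
    private TrailerTagSketch() {}

    /** Hypothetical stand-in for the extra trailer information. */
    static final class TrailerInfo {
        // Assumption: the trailer records the longest tag block seen in the file.
        // null models a V2 file, which carries no such entry.
        final Integer maxTagsLength;

        TrailerInfo(Integer maxTagsLength) {
            this.maxTagsLength = maxTagsLength;
        }

        boolean hasTags() {
            return maxTagsLength != null && maxTagsLength > 0;
        }
    }

    /** Tags are written to the compacted file only if some input file has them. */
    static boolean includeTagsOnCompaction(List<TrailerInfo> inputs) {
        for (TrailerInfo trailer : inputs) {
            if (trailer.hasTags()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<TrailerInfo> noTags = Arrays.asList(new TrailerInfo(0), new TrailerInfo(null));
        List<TrailerInfo> someTags = Arrays.asList(new TrailerInfo(0), new TrailerInfo(5));
        System.out.println(includeTagsOnCompaction(noTags));   // false
        System.out.println(includeTagsOnCompaction(someTags)); // true
    }
}
```

The point of the sketch is that the per-file metadata lets compaction drop the tag bytes for a CF that never used tags, while a missing entry (the V2 case) is simply treated as "no tags".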
> > >
> > > *WriterV3 and ReaderV3*
> > > *=================*
> > > These handle tags based on the metadata from the trailer info.  All
> > > the APIs such as seekTo(), next() and getKeyValue() can now handle tags
> > > based on the flag passed when constructing the Readers and Writers.
> > > For any V2 instance the includeTags flag would always be false.
> > >
> > > *DataBlockEncoders*
> > > *==============*
> > > Additional arguments are added to the APIs in the interfaces related to
> > > HFileDataBlockEncoders, BufferedDataBlockEncoders,
> > > HFileDataBlockEncodingContext, etc.  Again, for V2 the new APIs would
> > > still behave the same way, with no impact on V2-based use cases.
> > > The BufferedDataEncoder, being the base class for all encoders other
> > > than PrefixTree, would now be tag aware.
> > >
> > > *PrefixTreeEncoders*