HBase >> mail # dev >> DISCUSS : HFile V3 proposal for tags in 0.96


Re: DISCUSS : HFile V3 proposal for tags in 0.96
Thanks Ram and Anoop for those details again. I don't think there is a need
to be able to revert from V3 to V2. And 1 byte overhead on an HFile is not
really an overhead. As Anoop proposed, if there is a way to deactivate the
tags feature when every KV in a file has a tag length of zero, then
it's all good!

Looking forward to testing that!

JM
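
To make the idea discussed in this thread concrete, here is a minimal sketch of a per-file "no tags" flag combined with a one-byte tag length. It is not the HFile V3 code or any HBase API: the class TagLengthSketch, the Cell type, the byte layout, and the simplified 7-bit varint are all assumptions made for illustration, and the flag is written at the start of the buffer rather than in the FileInfo or trailer to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: shows a per-file "any tags" flag that
// lets a reader skip the 1-byte tag length instead of decoding it.
class TagLengthSketch {

    /** Hypothetical cell: a value plus a (possibly empty) tags byte array. */
    static final class Cell {
        final byte[] value;
        final byte[] tags;
        Cell(byte[] value, byte[] tags) { this.value = value; this.tags = tags; }
    }

    // Simplified unsigned varint (7 bits per byte), assumed for illustration.
    // Lengths 0..127 take exactly one byte.
    static void writeVarInt(ByteArrayOutputStream out, int v) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    static int readVarInt(ByteBuffer in) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = in.get();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Assumed layout: [anyTags flag][cell count], then per cell
     *  [valueLen][value][tagLen][tags]. */
    static byte[] write(List<Cell> cells) {
        boolean anyTags = false;
        for (Cell c : cells) {
            anyTags |= c.tags.length > 0;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(anyTags ? 1 : 0); // stand-in for a FileInfo/trailer entry
        writeVarInt(out, cells.size());
        for (Cell c : cells) {
            writeVarInt(out, c.value.length);
            out.write(c.value, 0, c.value.length);
            writeVarInt(out, c.tags.length); // 1 byte when there are no tags
            out.write(c.tags, 0, c.tags.length);
        }
        return out.toByteArray();
    }

    static List<Cell> read(byte[] file) {
        ByteBuffer in = ByteBuffer.wrap(file);
        boolean anyTags = in.get() == 1;
        int count = readVarInt(in);
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] value = new byte[readVarInt(in)];
            in.get(value);
            byte[] tags;
            if (anyTags) {
                tags = new byte[readVarInt(in)];
                in.get(tags);
            } else {
                // Flag says every tag length is zero: skip the byte, no decode.
                in.position(in.position() + 1);
                tags = new byte[0];
            }
            cells.add(new Cell(value, tags));
        }
        return cells;
    }
}
```

In this sketch the on-disk cost of an untagged cell is the single tag length byte, and the flag only saves the decode work on the read path, which matches the trade-off described in the quoted messages below.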

2013/7/19 ramkrishna vasudevan <[EMAIL PROTECTED]>

> But I am afraid that once the user switches to V3 with tags, he cannot go
> back to V2.  If this scenario is possible, do we need a workaround for it?
> In particular, if the user has written tags and then tries to read the
> file back with V2, it will not work.
>
> If the user switches to V3 but does not write any tags, and we go with the
> option of making tags optional using the FileInfo, then at least after
> compaction is done the HFile could also be read with the V2 reader.  But I
> don't think the user would intend to do this, given that he needs
> tags for his use case.
>
> Regards
> Ram
>
>
> On Fri, Jul 19, 2013 at 5:21 PM, Anoop John <[EMAIL PROTECTED]> wrote:
>
> > Jean
> >         When V2 is used there won't be any extra bytes, and so no
> > overhead in the write or read paths.
> > When V3 is used and there are no tags present at all, we will have extra
> > bytes for writing the tag length.  I am trying to write the tag length as
> > a VInt so that this will be 1 byte only.  Then, using the FileInfo, we can
> > avoid the overhead.
> >
> > Say all the KVs in a file have a tag length of zero (a file trailer
> > indicates this); during read we can avoid reading and decoding the tag
> > length and just skip the one byte of tag length.
> >
> > Regarding avoiding the tag length entirely (even the 1 byte), maybe it
> > should be possible during compaction. But I am wondering whether it is
> > really needed. The user can select V3 when there is a need for tags.
> >
> > -Anoop-
> >
> > On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Thanks Ram.
> > >
> > > One last thing, space-wise. If I understand correctly, between V2 and
> > > V3, when tags are deactivated, there will be only a 1 bit difference,
> > > so the same storage space is used. If tags are activated but empty, is
> > > it going to be the same thing? Or are we going to have all the tags
> > > overhead? For example, can we have a byte to say "no tags in that file"
> > > in addition to "tags are activated for that file"?
> > >
> > > So 2 questions.
> > >
> > > 1) What is the overhead on disk space from the tags?
> > > 2) Should we have a flag (bit) per file to say "no tags" even if
> > > activated, to limit this overhead and let people activate it for future
> > > uses?
> > >
> > > JMS
> > > On 2013-07-19 07:11, "ramkrishna vasudevan" <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > >>Based on your details, I think it will be, but very minimal, or
> > > > >>almost invisible, correct?
> > > > Yes, of course.
> > > > Regarding migration, any file written with V2 will still be read with
> > > > HFileReaderV2, and new files will be written with V3.  So there
> > > > should not be any problem here.  We are testing these things anyway
> > > > to make sure we don't break anything.  Thanks, Jean, for the interest.
> > > >
> > > > @Stack
> > > > I will write up the changes foreseen for the Codec to support RPC and
> > > > HFileV3.
> > > > From discussing with Anoop, we get some benefits when the tags are
> > > > written as a byte array and when tags are in memory.  Anyway, I will
> > > > write that up in a separate thread, also considering the input on the
> > > > way the current patch has been made.
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > >
> > > > On Fri, Jul 19, 2013 at 4:32 PM, Jean-Marc Spaggiari <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > Like Ted and St.Ack, I read all of this with great interest and
> > > > > everything looked good to me.