HBase >> mail # dev >> DISCUSS : HFile V3 proposal for tags in 0.96


Re: DISCUSS : HFile V3 proposal for tags in 0.96
Would tags be visible to the methods of BaseRegionObserver implementations
other than AccessController?

Meaning, would other (non-security) components of HBase be able to use cell
tagging to store certain information?

Please clarify.

Thanks

On Fri, Jul 19, 2013 at 6:09 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Thanks Ram and Anoop for those details again. I don't think there is a need
> to be able to revert from V3 to V2. And a 1-byte overhead on an HFile is not
> really an overhead. As Anoop proposed, if there is a way to de-activate the
> tags feature when every KV in a file has a tag length of zero, then
> it's all good!
>
> Looking forward to testing that!
>
> JM
>
> 2013/7/19 ramkrishna vasudevan <[EMAIL PROTECTED]>
>
> > But I am afraid that once the user switches to V3 with tags, he cannot come
> > back to V2. If this scenario is possible, do we need to find a workaround
> > for it?
> > In particular, if the user has written tags and then tries to read the
> > file back with V2, it will not work.
> >
> > If the user switches to V3 but does not write any tags, and we go with the
> > option of making tags optional using the FileInfo, then at least after the
> > compaction is done the HFile could also be read with the V2 reader. But I
> > don't think the user would intend to do this, given that he needs
> > tags for his use case.
> >
> > Regards
> > Ram
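The compaction case Ram describes can be sketched roughly as follows. This is a minimal illustration in Python, not HBase code: the `compact` and `readable_by_v2` helpers and the `max_tags_len` / `has_tags` file-info entries are hypothetical names, and the sketch assumes the writer leaves out the tag-length field entirely when no cell in the output carries tags.

```python
def compact(store_files):
    """Merge cells from several files into one sorted output.

    Each input file is modelled as a list of (key, value, tags) tuples of bytes.
    Returns the merged cells plus a file-info dict describing the output.
    """
    merged = sorted((cell for f in store_files for cell in f), key=lambda c: c[0])
    # Track the longest tag block seen while rewriting the cells.
    max_tags_len = max((len(tags) for _, _, tags in merged), default=0)
    # Hypothetical file-info entries; the names are assumptions for this sketch.
    file_info = {"max_tags_len": max_tags_len, "has_tags": max_tags_len > 0}
    return merged, file_info


def readable_by_v2(file_info):
    """A V2-style reader can only cope with a file that contains no tag data."""
    return not file_info["has_tags"]
```

Under these assumptions, a file compacted from untagged cells is marked as tag-free and could be handed to a V2 reader, while a single tagged cell is enough to keep the output V3-only.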
> >
> >
> > On Fri, Jul 19, 2013 at 5:21 PM, Anoop John <[EMAIL PROTECTED]> wrote:
> >
> > > Jean
> > >         When V2 is used there won't be any extra bytes, and so no
> > > overhead in the write or read paths.
> > > When V3 is used and there are no tags present at all, we will have extra
> > > bytes for writing the tag length. We are trying to store the tag length
> > > as a VInt so that this will be 1 byte only. Then, using FileInfo, we can
> > > avoid the overhead.
> > >
> > > Say all the KVs in a file have a tag length of zero (a file trailer
> > > indicates this): during read we can avoid reading and decoding the tag
> > > length, and just skip the one byte of tag length.
> > >
> > > Avoiding the tag length entirely (even the 1 byte) should be possible
> > > during compaction. But I am wondering whether it is really needed.
> > > The user can select V3 when there is a need for tags.
> > >
> > > -Anoop-
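The scheme Anoop outlines above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the HFile V3 format: the cell layout (4-byte key length, 4-byte value length, key, value, tag length, tags), the function names, and the `all_tags_empty` flag standing in for the file-level indicator are all hypothetical. The varint is a generic 7-bits-per-byte encoding, which differs in detail from Hadoop's VInt but also encodes zero in a single byte.

```python
import io


def write_varint(out, n):
    """Write a non-negative int as a 7-bits-per-byte varint."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))         # last byte
            return


def read_varint(buf):
    """Decode a varint written by write_varint."""
    shift = 0
    result = 0
    while True:
        b = buf.read(1)
        if not b:
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not (b[0] & 0x80):
            return result
        shift += 7


def write_cell(out, key, value, tags=b""):
    """Hypothetical layout: key len, value len, key, value, varint tag len, tags."""
    out.write(len(key).to_bytes(4, "big"))
    out.write(len(value).to_bytes(4, "big"))
    out.write(key)
    out.write(value)
    write_varint(out, len(tags))  # exactly one byte when there are no tags
    out.write(tags)


def read_cell(buf, all_tags_empty=False):
    """Read one cell; all_tags_empty models the file-level 'no tags' indicator."""
    key_len = int.from_bytes(buf.read(4), "big")
    value_len = int.from_bytes(buf.read(4), "big")
    key = buf.read(key_len)
    value = buf.read(value_len)
    if all_tags_empty:
        buf.seek(1, io.SEEK_CUR)  # skip the single zero byte without decoding it
        tags = b""
    else:
        tags = buf.read(read_varint(buf))
    return key, value, tags
```

With this layout, a file of N untagged cells carries N extra bytes compared with a layout that has no tag-length field at all, and a reader that knows from the file-level flag that every tag length is zero only has to step over that one byte per cell.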
> > >
> > > On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Thanks Ram.
> > > >
> > > > One last thing, space-wise. If I understand correctly, between V2 and
> > > > V3, when tags are de-activated, there will be only a 1 bit difference,
> > > > so the same storage space is used. If tags are activated but empty, is
> > > > it going to be the same thing? Or are we going to have all the tags
> > > > overhead? For example, can we have a byte to say "no tags in that file"
> > > > in addition to "tags are activated for that file"?
> > > >
> > > > So 2 questions.
> > > >
> > > > 1) What is the overhead on disk space from the tags?
> > > > 2) Should we have a flag (bit) per file to say "no tags", even if tags
> > > > are activated, to limit this overhead and let people activate it for
> > > > future uses?
> > > >
> > > > JMS
> > > > On 2013-07-19 07:11, "ramkrishna vasudevan" <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > >>Based on your details, I think it will be, but very minimal, or
> > > > > almost invisible, correct?
> > > > > Yes of course.
> > > > > Regarding migration, any file written with V2 would still be read
> > > > > with HFileReaderV2, and the new files will be written with V3. So
> > > > > there should not be any problem here. We are anyway testing these
> > > > > things to make sure we don't break anything. Thanks Jean for the
> > > > > interest.
> > > > >
> > > > > @Stack
> > > > > I will write up the changes foreseen in the Codec to support
> > > > > RPC and HFileV3.
> > > > > Discussing with Anoop, we have some benefits when the Tags are