HBase, mail # dev - RPC KeyValue encoding


RE: RPC KeyValue encoding
Ramkrishna.S.Vasudevan 2012-09-04, 09:53
Lars

Is it possible to have hooks in this layer, where we handle version counting, so that we can know which KVs were omitted due to versioning?

Regards
Ram
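
A minimal sketch of the kind of hook asked about here, under stated assumptions: `Kv`, `VersionLimiter`, and the `onOmitted` callback are hypothetical names for illustration and are not HBase classes. It assumes the input is already sorted by row, then column, then newest timestamp first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a KeyValue: row/column identity plus a timestamp.
final class Kv {
    final String row;
    final String column;
    final long timestamp;

    Kv(String row, String column, long timestamp) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
    }

    boolean sameColumn(Kv other) {
        return other != null && row.equals(other.row) && column.equals(other.column);
    }
}

// Keeps at most maxVersions cells per (row, column) and reports every
// dropped cell to the caller-supplied hook.
final class VersionLimiter {
    static List<Kv> limit(List<Kv> sorted, int maxVersions, Consumer<Kv> onOmitted) {
        List<Kv> kept = new ArrayList<>();
        Kv previous = null;
        int seen = 0;
        for (Kv kv : sorted) {
            // Restart the count whenever the (row, column) changes.
            seen = kv.sameColumn(previous) ? seen + 1 : 1;
            if (seen <= maxVersions) {
                kept.add(kv);
            } else {
                onOmitted.accept(kv); // the hook: caller learns what was omitted
            }
            previous = kv;
        }
        return kept;
    }
}
```

A caller could pass `omitted::add` to collect the dropped cells in a list; whether such a hook fits the real scanner code is an open question in this thread.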

> -----Original Message-----
> From: Matt Corgan [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, September 04, 2012 5:25 AM
> To: [EMAIL PROTECTED]; lars hofhansl
> Subject: Re: RPC KeyValue encoding
>
> I created separate reviewboard requests for the hbase-common module and the
> hbase-prefix-tree module.  The first one has the Cell interface,
> CellOutputStream, CellScanner, etc. mentioned above.
>
> hbase-common: https://reviews.apache.org/r/6897/
> hbase-prefix-tree: https://reviews.apache.org/r/6898/
>
> Will leave tests out for now.  They're on github.
>
> On Mon, Sep 3, 2012 at 3:17 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > That reminds me of another thought that occurred to me while looking at
> > ScanQueryMatcher.
> > I was marveling at the relative complexity of it (together with
> > StoreScanner) - admittedly some of this is my fault (see HBASE-4536 and
> > HBASE-4071).
> >
> > It would be so much easier to read if we had proper iterator trees (at
> > least at some places in the code), similar to how relational databases
> > express execution plans (using scanner and iterator interchangeably).
> >
> > Then:
> > - StoreScanner would just read from HFiles and pass KVs up; in fact it
> >   might no longer be needed
> > - Filters can be expressed as an iterator over those KVs.
> > - Handling deleted KVs would be another iterator
> > - So would be the version handling/counting
> > - Scanner would not need to be passed a List<KeyValue> to accumulate KVs
> >   in, but would simply return KVs as they are encountered.
> >
> > RegionScanner and StoreScanner would retain the KeyValueHeap to mergesort
> > their sub scanners.
> > The overall complexity would remain the same, but the parts would be
> > isolated better.
> >
> > Something like this:
> > RegionScanner -> HeapIt ->-> VersionCounterIt -> FilterIt -> TimeRangeIt ->
> > DeletesIt -> StoreScanner -> HeapIt ->-> StoreFileIt -> ...
> > (Should rename some of the things)
> >
> > All iterators would issue (re)seeks when possible.
> >
> > The iterator interface would be something like <init>, KV next(), close(),
> > seek(KV), reseek(KV). Many Iterators would be stateful.
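
The iterator contract described above could be sketched roughly as follows. This is an illustration only: `KvIterator`, `ListSource`, and `FilterIt` are hypothetical names, plain `String` keys stand in for KeyValues, and a sorted in-memory list stands in for an HFile.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical contract following the email: next(), seek(), reseek(), close().
interface KvIterator {
    String next();            // returns null when exhausted
    void seek(String key);    // position at the first key >= the given key
    void reseek(String key);  // forward-only seek from the current position
    void close();
}

// Leaf of the tree: iterates over an already sorted list of keys.
final class ListSource implements KvIterator {
    private final List<String> sortedKeys;
    private int pos = 0;

    ListSource(List<String> sortedKeys) {
        this.sortedKeys = sortedKeys;
    }

    public String next() {
        return pos < sortedKeys.size() ? sortedKeys.get(pos++) : null;
    }

    public void seek(String key) {
        pos = 0;
        reseek(key);
    }

    public void reseek(String key) {
        while (pos < sortedKeys.size() && sortedKeys.get(pos).compareTo(key) < 0) {
            pos++;
        }
    }

    public void close() { }
}

// Inner node: drops keys that fail the predicate and delegates seeks downward.
final class FilterIt implements KvIterator {
    private final KvIterator child;
    private final Predicate<String> keep;

    FilterIt(KvIterator child, Predicate<String> keep) {
        this.child = child;
        this.keep = keep;
    }

    public String next() {
        String key;
        while ((key = child.next()) != null) {
            if (keep.test(key)) {
                return key;
            }
        }
        return null;
    }

    public void seek(String key) { child.seek(key); }
    public void reseek(String key) { child.reseek(key); }
    public void close() { child.close(); }
}
```

Each stage wraps its child, so a chain is built by nesting constructors, for example `new FilterIt(new ListSource(keys), k -> !k.equals("c"))`. A delete-handling or version-counting stage would implement the same interface.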
> >
> > Would probably be a major refactoring, and the devil would be in the
> > details. We would need to be careful to keep the current performance
> > (currently ScanQueryMatcher is efficient, because it does a bunch of steps
> > at the same time).
> >
> > Just "blue skying".
> >
> > -- Lars
> >
> >
> > ________________________________
> > From: Matt Corgan <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > Sent: Monday, September 3, 2012 11:24 AM
> > Subject: Re: RPC KeyValue encoding
> >
> > >
> > > For CellAppender, is compile() equivalent to flushing?
> >
> > Yes.  I'll rename CellAppender to CellOutputStream.  The concept is very
> > similar to a GzipOutputStream where you write bytes to it and periodically
> > call flush() which spits out a compressed byte[] behind the scenes.  The
> > server would write Cells to a CellOutputStream, flush them to a byte[] and
> > send the byte[] to the client.  There could be a default encoding, and the
> > client could send a flag to override the default.
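
The write-then-flush idea above might look something like the sketch below. It is not the CellOutputStream from the reviewboard patch: `SketchCellOutputStream` is a hypothetical name, a cell is reduced to a key/value pair of strings, and the "encoding" is a plain length-prefixed layout chosen only to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: buffer cells, then flush() hands back one byte[] block
// that a server could send to the client.
final class SketchCellOutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Appends one cell as two length-prefixed UTF-8 fields.
    void write(String key, String value) {
        writeField(key);
        writeField(value);
    }

    private void writeField(String field) {
        byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
        byte[] length = ByteBuffer.allocate(4).putInt(bytes.length).array();
        buffer.write(length, 0, length.length);
        buffer.write(bytes, 0, bytes.length);
    }

    // Returns everything written since the last flush and starts a new block.
    byte[] flush() {
        byte[] block = buffer.toByteArray();
        buffer.reset();
        return block;
    }
}
```

A real encoder would compress or delta-encode inside `flush()`; the point here is only the shape of the API, where the caller never sees the intermediate buffer.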
> >
> > Greg, you mention omitting fields that are repeated from one KeyValue to
> > the next.  I think this is basically what the existing DataBlockEncoders
> > are doing for KeyValues stored on disk (see PrefixKeyDeltaEncoder for
> > example).  I'm thinking we can use the same encoders for encoding on the
> > wire.  Different implementations will have different performance
> > characteristics where some may be better for disk and others for RPC, but
> > the overall intent is the same.
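
To illustrate the general idea of omitting what repeats from one key to the next, here is a small prefix-delta sketch. It is not the PrefixKeyDeltaEncoder implementation: `PrefixDelta` and `Entry` are hypothetical names, keys are plain strings, and the input is assumed to be sorted so that neighbours share long prefixes.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixDelta {
    // One encoded key: number of leading chars shared with the previous key,
    // plus the remaining suffix.
    static final class Entry {
        final int shared;
        final String suffix;

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, key.substring(shared)));
            prev = key;
        }
        return out;
    }

    static List<String> decode(List<Entry> entries) {
        List<String> keys = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            // Rebuild the key from the previous key's prefix and the stored suffix.
            String key = prev.substring(0, e.shared) + e.suffix;
            keys.add(key);
            prev = key;
        }
        return keys;
    }
}
```

For keys such as `row1/cf:a` followed by `row1/cf:b`, the second entry stores only the shared length and the one differing character, which is where the savings on disk or on the wire would come from.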
> >
> > Matt
> >
> > On Sun, Sep 2, 2012 at 2:56 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> > > Your "coarse grain" option is what I had in mind in my email. I