HBase >> mail # dev >> Cell Encoders and usage of Cell


ramkrishna vasudevan 2013-04-17, 17:16
ramkrishna vasudevan 2013-04-18, 07:54
Stack 2013-04-19, 01:28
ramkrishna vasudevan 2013-04-19, 03:19
Matt Corgan 2013-04-21, 21:47
Nick Dimiduk 2013-04-22, 00:08
Matt Corgan 2013-04-22, 00:36

Re: Cell Encoders and usage of Cell
Just adding to what Matt said: logically, Cell and KeyValue are the same.

The difference is that a Cell exposes the row, family, qualifier, type and
timestamp as individual components instead of one contiguous byte array.
That lets an implementation cut down the memory those fields occupy
internally.  It also lets us use the same interface for both the RPC and the
HFile.  To make the HFile understand these Cells, we still need to do some
work here.
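
As a rough illustration of the point above (and of Matt's 100-timestamp
example quoted below), here is a minimal sketch.  All names in it
(SimpleCell, FlatCell, SharedTimestampBlock, CellSketch) and the byte layout
are hypothetical and invented for this mail; they are not HBase's actual
Cell or KeyValue classes.  It only assumes that a timestamp is an 8-byte
long.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, trimmed-down stand-in for the Cell idea. NOT HBase's real interface.
interface SimpleCell {
    byte[] row();
    long timestamp();
    byte[] value();
}

// KeyValue-style: all fields, including the 8-byte timestamp, sit in one contiguous byte[].
final class FlatCell implements SimpleCell {
    private final byte[] buf;   // assumed layout: [row][timestamp: 8 bytes][value]
    private final int rowLen;

    FlatCell(byte[] row, long timestamp, byte[] value) {
        this.rowLen = row.length;
        this.buf = ByteBuffer.allocate(row.length + Long.BYTES + value.length)
                .put(row).putLong(timestamp).put(value).array();
    }

    public byte[] row() { return Arrays.copyOfRange(buf, 0, rowLen); }
    public long timestamp() { return ByteBuffer.wrap(buf).getLong(rowLen); }
    public byte[] value() { return Arrays.copyOfRange(buf, rowLen + Long.BYTES, buf.length); }
}

// Block-style: the timestamp is stored once and every cell view reads the shared copy.
final class SharedTimestampBlock {
    private final long sharedTimestamp;
    private final List<byte[]> rows = new ArrayList<>();
    private final List<byte[]> values = new ArrayList<>();

    SharedTimestampBlock(long sharedTimestamp) { this.sharedTimestamp = sharedTimestamp; }

    void add(byte[] row, byte[] value) { rows.add(row); values.add(value); }

    int size() { return rows.size(); }

    // Returns a lightweight view; callers cannot tell how the timestamp is stored.
    SimpleCell get(int i) {
        final byte[] row = rows.get(i);
        final byte[] value = values.get(i);
        return new SimpleCell() {
            public byte[] row() { return row; }
            public long timestamp() { return sharedTimestamp; }
            public byte[] value() { return value; }
        };
    }
}

final class CellSketch {
    // Bytes spent on timestamps when each of n cells carries its own copy.
    static int flatTimestampBytes(int n) { return n * Long.BYTES; }

    // Bytes spent on timestamps when one copy is shared across the block.
    static int sharedTimestampBytes() { return Long.BYTES; }

    public static void main(String[] args) {
        byte[] row = {1}, value = {2};
        SimpleCell flat = new FlatCell(row, 1366588800000L, value);
        SharedTimestampBlock block = new SharedTimestampBlock(1366588800000L);
        for (int i = 0; i < 100; i++) block.add(row, value);

        // Both implementations answer the same question through the same interface.
        System.out.println(flat.timestamp() == block.get(99).timestamp());
        System.out.println(flatTimestampBytes(block.size()) + " vs " + sharedTimestampBytes());
    }
}
```

The caller only ever sees SimpleCell, so the storage behind it can change
(flat array, shared timestamp, or some encoded block) without touching the
code that reads cells.  That is the kind of separation the RPC and HFile
paths would both rely on.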

Regards
Ram
On Mon, Apr 22, 2013 at 6:06 AM, Matt Corgan <[EMAIL PROTECTED]> wrote:

> I'm not 100% clear what you're asking, Nick.  My understanding is that Cell
> and KeyValue are identical with regard to the timestamp.  Timestamp is
> part of the identity of the Cell/KeyValue, and each has 1 and only 1
> timestamp from a logical perspective.
>
> From a physical/memory perspective, KeyValue is one implementation of Cell
> where all fields are fully expanded into a single contiguous byte[].  The
> Cell interface adds the ability for a timestamp to be shared behind the
> scenes to save memory.  In the case where there are 100 KeyValues in an RPC
> result or disk block, the KeyValue implementation will require 800 bytes of
> memory for the timestamps (100 x 8 bytes), but a Cell implementation can
> de-duplicate them and store as little as ~8 bytes for the whole RPC or disk
> block.
>
>
> On Sun, Apr 21, 2013 at 5:08 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:
>
> > A related question. Can you clarify the distinction between a Cell and a
> > KeyValue as pertains to the timestamp? That is, which of these two concepts
> > carries the timestamp as a component of its coordinates? Does a Cell
> > contain multiple KeyValue versions or does a KeyValue contain multiple Cell
> > versions?
> >
> > In HBASE-7233, patch v9, I see KeyValue is replaced by Cell in the Get
> > result, which implies to me that a Cell contains multiple KeyValue
> > versions. I don't see the imported Cell.proto. Presumably that's the same
> > Cell type defined in hbase.proto currently on trunk.
> >
> > Thanks,
> > Nick
> >
> > On Sun, Apr 21, 2013 at 2:47 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> >
> > > fyi Ram - i started adding the Cell interface to the read path of the delta
> > > encoders in HBASE-7323 <https://issues.apache.org/jira/browse/HBASE-7323>.
> > > It's one possible place to start working on it.
> > >
> > >
> > > On Thu, Apr 18, 2013 at 8:19 PM, ramkrishna vasudevan <[EMAIL PROTECTED]> wrote:
> > >
> > > > Thanks for your reply Stack.
> > > > > I think so.  hfile APIs are about KVs.  Should be about Cell I'd think.
> > > > Yes.  This is what I think too.
> > > >
> > > > > If you need the above, you are not doing Cell right I'd argue.  The very
> > > > > idea of Cell is a disconnect between how it is stored and Cell use.
> > > >
> > > > Yes Stack.  I understand this.  I am not introducing getKeyOffset and
> > > > getKeyLength over there.
> > > > My questions were mainly because, if I have the current code and I
> > > > want to introduce tags in it, where would I do it?
> > > > So if I need tags to be introduced, should I start changing the HFile
> > > > formats as well, and only then would I get the tags to work?
> > > > What do you think here?
> > > >
> > > > > I think the Cell Interface needs methods added to allow access to "labels".
> > > > Yes.  You are right.
> > > >
> > > >
> > > >
> > > > On Fri, Apr 19, 2013 at 6:58 AM, Stack <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > > On Wed, Apr 17, 2013 at 10:16 AM, ramkrishna vasudevan <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > With the introduction of the new Cell interface we are providing a
> > > > > > > way in which the RPC usage of Cell and the usage of Cell in HFile are
> > > > > > > unified (abstracted).
> > > > > > >
> > > > > > > The current block encoder, which encodes the KVs into HFile blocks,
> > > > > > > will be enhanced, maybe as BlockEncode2, which will deal with Cell
> > > > > > > encoding, and the same will be written to HFile.

Stack 2013-04-22, 19:28
Andrew Purtell 2013-04-22, 22:16
ramkrishna vasudevan 2013-04-23, 06:40