HBase >> mail # dev >> prefix compression implementation


Re: prefix compression implementation
jacek >> It is a huge chance. It would be great if we could prototype a few
things. In particular, I would like to avoid any optimizations before we know
a good way to measure them.

matt >> agree.  i'm not in a rush to get any of this integrated, just trying
to feel out the right long-term strategy.  do you have unit tests that
you're running on a substantial amount of data to compare different
implementations?
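
One way to start measuring before optimizing is to compare encoded sizes on a sorted key set. The sketch below is only an illustration: the class `PrefixSizeEstimate`, its methods, and the assumed layouts (a 4-byte length per raw key; a 4-byte shared-prefix length plus a 4-byte suffix length per prefix-encoded key) are hypothetical and are not taken from HBase or from any patch in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: estimates key storage with and without prefix sharing. */
final class PrefixSizeEstimate {

    /** Number of leading bytes that a and b have in common. */
    static int commonPrefixLength(byte[] a, byte[] b) {
        int limit = Math.min(a.length, b.length);
        int i = 0;
        while (i < limit && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Assumed raw layout: 4-byte length followed by the full key. */
    static long rawSize(List<byte[]> keys) {
        long total = 0;
        for (byte[] key : keys) {
            total += 4 + key.length;
        }
        return total;
    }

    /**
     * Assumed prefix layout: 4-byte shared-prefix length, 4-byte suffix
     * length, then only the bytes that differ from the previous key.
     * Keys must already be sorted for the sharing to be effective.
     */
    static long prefixEncodedSize(List<byte[]> sortedKeys) {
        long total = 0;
        byte[] previous = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = commonPrefixLength(previous, key);
            total += 8 + (key.length - shared);
            previous = key;
        }
        return total;
    }

    public static void main(String[] args) {
        List<byte[]> keys = Arrays.asList(
            "row0001/cf:a".getBytes(StandardCharsets.UTF_8),
            "row0001/cf:b".getBytes(StandardCharsets.UTF_8),
            "row0002/cf:a".getBytes(StandardCharsets.UTF_8));
        System.out.println("raw bytes:    " + rawSize(keys));
        System.out.println("prefix bytes: " + prefixEncodedSize(keys));
    }
}
```

Size is only one axis. A fuller comparison would also time encoding, decoding, and seeks on a substantial data set.
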
On Tue, Sep 20, 2011 at 4:58 PM, Jacek Migdal <[EMAIL PROTECTED]> wrote:

>
>
> On 9/20/11 10:59 AM, "Matt Corgan" <[EMAIL PROTECTED]> wrote:
>
> >bringing all questions into a single email:
> >
> >stack >> I'd say call it Cell rather than HCell.
> >
> >i did think the H was a very simple way to add uniqueness, like isn't
> >"HFile" a big win over "File"?  there are already two other classes called
> >"Cell" in hbase (guava and REST gateway).  another option could be KV,
> >though i don't like making exceptions to java's no-abbreviations
> >guidelines.
> KeyValueCell?
>
> To be honest, none of the names seems like a very good option. However, it
> would be nice if the name were somewhat related to KeyValue.
>
> On a larger scale, it would be hard to integrate this interface anytime soon.
> I would rather do it later.
>
> >stack >> There is a patch lying around that adds a version to KV by using
> >top two bytes of the type byte.  If you need me to dig it up, just say
> >(then you might not have to have v1 stuff in your Interface).
> >
> >not sure what you mean here.  top two bits?  you mean encoding the
> >timestamp inside the type byte?
> Versioning KeyValue per KeyValue seems to be crazy. Shouldn't it be per
> block or file?
>
>
> >(interface discussion)
> >
> It is a huge chance. It would be great if we could prototype a few things.
> In particular, I would like to avoid any optimizations before we know a good
> way to measure them.
>
> Jacek
>
>
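
For readers following the "top two bits" question above, here is a minimal sketch of what packing a format version into the high bits of a type byte could look like. It is not the patch stack refers to, which does not appear in this thread. The class `TypeByteVersion`, its method names, and the assumption that every type code fits in the low six bits are all hypothetical.

```java
/** Hypothetical helper: stores a 2-bit version in the top bits of a type byte. */
final class TypeByteVersion {
    private static final int VERSION_SHIFT = 6;   // version occupies bits 7..6
    private static final int VERSION_MAX = 0x03;  // two bits: versions 0..3
    private static final int TYPE_MASK = 0x3F;    // assumed: type codes use bits 5..0

    /** Combines a version (0..3) and a type code (0..63) into one byte. */
    static byte pack(int version, int type) {
        if (version < 0 || version > VERSION_MAX) {
            throw new IllegalArgumentException("version out of range: " + version);
        }
        if (type < 0 || type > TYPE_MASK) {
            throw new IllegalArgumentException("type out of range: " + type);
        }
        return (byte) ((version << VERSION_SHIFT) | type);
    }

    /** Extracts the version; the mask avoids sign extension of the byte. */
    static int version(byte packed) {
        return (packed & 0xFF) >>> VERSION_SHIFT;
    }

    /** Extracts the type code from the low six bits. */
    static int type(byte packed) {
        return packed & TYPE_MASK;
    }

    public static void main(String[] args) {
        byte packed = pack(2, 4); // illustrative values only
        System.out.println("version=" + version(packed) + " type=" + type(packed));
    }
}
```

This scheme tags every cell individually, which is the cost Jacek questions. A per-block or per-file version would store the same information once in a header instead.
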