HBase >> mail # dev >> Review request for HBASE-7692: Ordered byte[] serialization


Re: Review request for HBASE-7692: Ordered byte[] serialization
Nick,

While I believe having an order-preserving canonical serialization is a
good idea, from reading the mail and skimming the jira it is not
clear to me why this is inside hbase as part of hbase-common.

Why isn't this part of a library on top of hbase (a dependency for
Pig/Hive) instead of "inside" hbase?
Can't this functionality be done just from the client level?
What's the end goal here? Is the goal to replace the Bytes.toBytes(*)
methods in order to enforce the ordering?
If HBase has two mutually incompatible encodings "built-in", how does a
dev know to use one or the other later on?
If this is essentially a mega import of a library (300k.. yikes), why not
make it a separate module instead of part of common?

Jon.

On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:

> Hi everyone,
>
> I'm of the opinion that HBase should provide a mechanism for serializing
> common Java types such that the serialized format sorts according to
> the natural ordering of the type. I think many application efforts end up
> building a custom, partial implementation of this kind of functionality on
> their own. I think HBase should provide a canonical implementation of such
> a serialization format so that third-parties can reliably build on top of
> HBase. Not just user applications, but other tools like Pig and Hive are
> also enabled. Implementations for
> HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> compatible with similar features in Pig.
>
> After implementing something similar on multiple occasions, I stumbled across
> the Orderly <https://github.com/ndimiduk/orderly> library. It also
> appears to have been adopted by other large projects, including
> Lily <https://github.com/NGDATA/orderly>.
> I've engaged the library's author for some improvements only to find out
> he's now at Google and will no longer be maintaining it. Thus, I propose we
> take it into HBase.
>
> HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> patch that introduces Orderly into hbase-common under the orderly
> namespace. I have an associated branch on
> GitHub <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> wherein I've broken the patch out into multiple commits to ease review.
> Please take a few minutes to give it a look.
>
> Thanks,
> Nick
>
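
For readers unfamiliar with the problem being discussed: HBase compares row
keys as unsigned bytes, left to right, so a serialization "sorts according to
the natural ordering of the type" only if that byte comparison agrees with the
type's own comparison. The sketch below illustrates the idea for a 32-bit int.
It is NOT Orderly's actual wire format, and the class and method names
(OrderedIntSketch, encodePlain, encodeOrdered, compareUnsigned) are
hypothetical names chosen for illustration. It assumes only the JDK.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; not the Orderly format and not an HBase API.
public class OrderedIntSketch {

    // Plain big-endian two's complement. Negative values start with a 1 bit,
    // so under unsigned byte comparison they sort AFTER positive values.
    static byte[] encodePlain(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Order-preserving variant: flip the sign bit before writing big-endian.
    // Negative values then start with a 0 bit and sort before positive ones.
    static byte[] encodeOrdered(int value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }

    // Inverse of encodeOrdered: read big-endian, flip the sign bit back.
    static int decodeOrdered(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, the way byte[] row keys are ordered.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        int lo = -1;
        int hi = 1;
        // Plain encoding: byte order disagrees with numeric order for lo < hi.
        System.out.println("plain preserves order:   "
            + (compareUnsigned(encodePlain(lo), encodePlain(hi)) < 0));
        // Sign-flipped encoding: byte order agrees with numeric order.
        System.out.println("ordered preserves order: "
            + (compareUnsigned(encodeOrdered(lo), encodeOrdered(hi)) < 0));
    }
}
```

The same concern applies to every other type (floating point, strings,
nullable and compound keys), which is where hand-rolled, partial
implementations tend to diverge.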

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]