HBase >> mail # dev >> Review request for HBASE-7692: Ordered byte[] serialization


Nick Dimiduk 2013-02-21, 18:35
Jonathan Hsieh 2013-02-21, 23:04
Re: Review request for HBASE-7692: Ordered byte[] serialization
I think we have to enable building things on top of HBase by having well-defined building blocks as part of HBase.
It seems to me that a canonical, supported byte representation for data types is such a building block.

-- Lars

________________________________
 From: Jonathan Hsieh <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, February 21, 2013 3:04 PM
Subject: Re: Review request for HBASE-7692: Ordered byte[] serialization
 
Nick,

While I believe having an order-preserving canonical serialization is a
good idea, from doing a read of the mail and a skim of the jira it is not
clear to me why this is inside hbase as part of hbase-common.

Why isn't this part of a library on top of hbase (a dependency for
Pig/Hive) instead of "inside" hbase?
Can't this functionality be done just from the client level?
What's the end goal here? Is the goal to replace the Bytes.toBytes(*)
methods to enforce the ordering?
If HBase has two mutually incompatible encodings "built-in", how does a
dev know to use one or the other later on?
If this is essentially a mega import of a library (300k.. yikes), why not
make it a separate module instead of part of common?

Jon.

On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:

> Hi everyone,
>
> I'm of the opinion that HBase should provide a mechanism for serializing
> common Java types such that the serialized format sorts according to
> the natural ordering of the type. I think many application efforts end up
> building a custom, partial implementation of this kind of functionality on
> their own. I think HBase should provide a canonical implementation of such
> a serialization format so that third-parties can reliably build on top of
> HBase. Not just user applications, but other tools like Pig and Hive are
> also enabled. Implementations for
> HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> compatible with similar features in Pig.
>
> After implementing something similar on multiple occasions, I stumbled across
> the Orderly <https://github.com/ndimiduk/orderly> library. It also
> appears to have been adopted by other large projects, including
> Lily <https://github.com/NGDATA/orderly>.
> I've engaged the library's author for some improvements only to find out
> he's now at Google and will no longer be maintaining it. Thus, I propose we
> take it into HBase.
>
> HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> patch that introduces Orderly into hbase-common under the orderly
> namespace. I have an associated branch on
> GitHub <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> wherein I've broken the patch out into multiple commits to ease review.
> Please take a few minutes to give it a look.
>
> Thanks,
> Nick
>

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
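
For readers following the thread, the property under discussion can be illustrated with a small sketch. It is not Orderly's format or API, and it does not use HBase classes; the class and method names below are hypothetical. It assumes that keys are compared as unsigned bytes in lexicographic order, and it shows that a plain big-endian two's complement encoding of an int sorts negative values after positive ones, while flipping the sign bit before writing restores the natural ordering.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of HBase or Orderly.
final class OrderedIntSketch {

    // Plain big-endian two's complement encoding of an int.
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Order-preserving variant: flipping the sign bit maps
    // Integer.MIN_VALUE..Integer.MAX_VALUE onto 0x00000000..0xFFFFFFFF,
    // so unsigned byte order matches numeric order.
    static byte[] ordered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of ordered(): read the int and flip the sign bit back.
    static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Lexicographic comparison treating each byte as unsigned
    // (the assumed comparison rule for keys in this sketch).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Sign of the byte-wise comparison of -1 against 1 under each encoding.
        System.out.println("plain:   " + Integer.signum(compareUnsigned(plain(-1), plain(1))));
        System.out.println("ordered: " + Integer.signum(compareUnsigned(ordered(-1), ordered(1))));
        System.out.println("round trip: " + decodeOrdered(ordered(-42)));
    }
}
```

A full library has to solve the same problem for every supported type (floating point, variable-length strings, nullable and descending fields, and so on), which is where the compatibility and code-size concerns raised above come from.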
Enis Söztutar 2013-02-22, 03:23
Jonathan Hsieh 2013-02-22, 07:56
Nick Dimiduk 2013-02-22, 14:13
Jonathan Hsieh 2013-02-22, 14:31
Elliott Clark 2013-02-22, 17:32
Matt Corgan 2013-02-22, 18:00
Nick Dimiduk 2013-02-22, 18:04
Matt Corgan 2013-02-22, 18:14
Nick Dimiduk 2013-02-22, 18:48
Nick Dimiduk 2013-02-22, 19:37
Ted Yu 2013-02-22, 21:14
Stack 2013-02-26, 23:13
Jesse Yates 2013-02-22, 18:01
Nick Dimiduk 2013-02-22, 23:13
Jonathan Hsieh 2013-02-23, 00:33
Matt Corgan 2013-02-23, 00:48
Nick Dimiduk 2013-02-23, 01:40
Matt Corgan 2013-02-23, 02:04
Stack 2013-02-26, 23:20
Stack 2013-02-26, 23:17
Ted 2013-02-22, 14:21
Stack 2013-02-26, 23:08