HBase >> mail # dev >> [UPDATE] Finishing up 0.96 --> WAS Re: 0.95 and 0.96 remaining issues


Re: [UPDATE] Finishing up 0.96 --> WAS Re: 0.95 and 0.96 remaining issues
IMHO, the value is agreeing on a serialization format across multiple
products. Without a common serialization format, we won't have a good
interop story. Phoenix needs the serialization format to be order
preserving.

Given that nothing depends on the order preserving types (i.e. no risk of
breaking stuff), why not just include them?
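
To illustrate why order preservation matters for a shared format, here is a minimal Java sketch. It is not the OrderedBytes patch or Phoenix's actual encoding; the class and method names are hypothetical, and the sign-bit flip is just one well-known way to make a fixed-width integer sort correctly. HBase compares row keys as unsigned bytes, so a plain two's-complement encoding places negative numbers after positive ones, while the flipped encoding keeps byte order and numeric order in agreement.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of HBase or Phoenix.
public class OrderPreservingIntDemo {

    // Plain big-endian two's-complement encoding of an int.
    public static byte[] encodePlain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Order-preserving variant: flipping the sign bit maps the signed range
    // onto the unsigned range without changing relative order.
    public static byte[] encodeOrdered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of encodeOrdered: read the int and flip the sign bit back.
    public static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, the way a byte-sorted store
    // orders its keys.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int lo = -1;
        int hi = 1;
        // Plain encoding: -1 is FF FF FF FF, which sorts after 00 00 00 01.
        System.out.println("plain keeps order:   "
                + (compareUnsigned(encodePlain(lo), encodePlain(hi)) < 0));
        // Ordered encoding: byte order agrees with numeric order.
        System.out.println("ordered keeps order: "
                + (compareUnsigned(encodeOrdered(lo), encodeOrdered(hi)) < 0));
    }
}
```

A query layer that relies on range scans over encoded keys needs the second behavior, which is why a shared format that is not order preserving would be hard for it to adopt.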
On Wed, Jul 31, 2013 at 1:34 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:

> On Wed, Jul 31, 2013 at 1:19 PM, James Taylor <[EMAIL PROTECTED]
> >wrote:
>
> > But the value in your patch is fixing the serialization format such that
> > it is order preserving. Unfortunately, without this, Phoenix can't adopt
> > it. Its existing type system and query processing are predicated on this.
> >
>
> Two patches, two value propositions. Providing a data type API with some
> pre-made implementations that users can use and that external projects can
> standardize on is valuable in itself. Phoenix can extend this API to provide
> its own encodings, but I agree it provides in HBase something Phoenix has
> already worked out for itself. The biggest win here is that two consumers
> of HBase can agree on precisely what they mean when they say they encode a
> value.
>
> The second piece is the order-preserving encoding scheme. Having HBase ship
> a single scheme that can be used across the board has much wider utility.
> Delivering it through the API described previously is practical. Lacking
> this, Phoenix can still plug its existing encoding code into the data type
> API, as I described in another email.
>
> I want to see them both shipped. Breaking it down like this was a way to
> allow for prudent concessions considering the timelines.
>
> > On Wed, Jul 31, 2013 at 12:04 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:
> >
> > > On Wed, Jul 31, 2013 at 10:31 AM, Stack <[EMAIL PROTECTED]> wrote:
> > >
> > > > So what would the incentive for using the new API be?
> > > >
> > >
> > > Hopefully the new API is nicer than managing byte[]s manually. The only
> > > incentive for users would be keeping up with progress, giving users the
> > > chance to start migrating their applications. For the external tools, I'm
> > > looking forward to using this to make defining Hive tables over HBase
> > > nicer. The current column mapping stuff is clunky and this API gives a
> > > much improved mechanism for declaring column types. I can't do that
> > > without an API shipping with HBase. Maybe Elliott can weigh in on the
> > > Impala side, James on Phoenix, Bueller from Kiji?
> > >
> > > > And then when the implementation changes -- it serializes in sort-order --
> > > > will it confuse?
> > > >
> > >
> > > Let's continue my Hive example. Assuming DataType (9091 + 8694) ships in
> > > 0.96, Hive gets plumbed, and users get to start defining their tables in
> > > terms of LegacyInteger, LegacyBytesFixedWidth, and Struct. When the
> > > OrderedBytes patch (8201) comes in with its type implementations, users
> > > at their leisure can drop the new types in when they're ready to
> > > transition. The Ordered* types don't replace the Legacy* types, they
> > > augment the catalog of types that HBase provides.
> > >
> >
>