HBase >> mail # dev >> Design review: Secondary index support through coprocessors


Re: Design review: Secondary index support through coprocess
Yep. That's my concern too. Would need to configure a generous number of handlers to prevent this from happening.

________________________________
 From: Vladimir Rodionov <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Monday, January 20, 2014 11:57 AM
Subject: RE: Design review: Secondary index support through coprocess
 

>>Yes, the coprocessors potentially cross RS boundaries.

An open path to disaster. Inter-region RPCs in coprocessors may result in periodic cluster-wide deadlocks.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [EMAIL PROTECTED]

________________________________________

From: James Taylor [[EMAIL PROTECTED]]
Sent: Monday, January 20, 2014 11:39 AM
To: [EMAIL PROTECTED]
Subject: Re: Design review: Secondary index support through coprocess

Yes, the coprocessors potentially cross RS boundaries. No, the index is not
co-located with the main table. Take a look at the link I sent as that
should be able to answer a lot of questions.

Thanks,
James
On Mon, Jan 20, 2014 at 11:03 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

> James,
>
> Ok…
>
> It’s been a while since we talked about this…
>
> While the index is in a separate table, is that table being split and
> collocated with the main table?
>
> If you’re using the coprocessor to maintain the index, that would imply
> you’re crossing RS boundaries if your index is truly orthogonal.
>
> Is this what you’re doing?
>
> On Jan 20, 2014, at 11:32 AM, James Taylor <[EMAIL PROTECTED]> wrote:
>
> > Mike,
> > Yes, you're mistaken:
> > - secondary indexes in Phoenix are orthogonal to the base table. They're in
> > a separate table (http://phoenix.incubator.apache.org/secondary_indexing.html).
> > - Phoenix has joins. They're in our master branch with a release scheduled
> > for next month
> > - numeric strings? Not a use case for indexing numeric data? Have you ever
> > seen a number used as an ID?
> > Thanks,
> > James
> >
> >
> > On Mon, Jan 20, 2014 at 8:50 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> >> Indexes tend to be orthogonal to the base table, not to mention if you’re
> >> using an inverted table for an index, your index table would be much
> >> thinner than your base table.
> >>
> >> Having said that, the solution proposed by Yu, Taylor and others only
> >> works if you want to use the index to help on server side filtering and
> >> misses the boat on the larger and broader picture of improving query
> >> optimization and joins.
> >>
> >> HINT: Unless I am mistaken… until you treat the index as orthogonal to the
> >> base table, you will always lag the performance of traditional MPP DWs like
> >> Informix XPS. (Now part of IBM’s IM pillar.)
> >>
> >> In addition, until you fix coprocessors in general, you will have
> >> scalability and performance issues.
> >> (Note that you can write a coprocessor to create a sandbox and separate
> >> the co-process from the RS jvm, however it would be better if it were part
> >> of the underlying coprocessor code.)
> >>
> >> The current implementation makes joins worthless.
> >> (Note that in prior discussions,  Phoenix doesn’t do joins…)
> >> Here’s why:
> >> In order to do a join, if you use the proposed index, you have to first
> >> reduce each index into a single, sort-ordered set.  Then you can take the
> >> intersection of the index result sets.  The final set would be in sort
> >> order and a subset of the total rows. You can then fetch the rows and still
> >> do a server side filter before returning the ultimate result set.
> >>
> >> It’s that first step of reducing each result set into a single sort-
> >> ordered set that takes a lot of effort.
> >>
> >>
> >> On a side note… there’s been some mention of ordering floats. Again, just
> >> a word of caution… there isn’t a really strong use case for indexing
> >> numeric data types. period.  And to be very, very clear, there is a
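
The index-driven join steps described in the quoted message (reduce each index result to one sorted set, intersect, fetch, then filter) can be sketched roughly as below. This is an illustrative Python sketch, not Phoenix or HBase code. It assumes each index scan returns matching base-table row keys per region in arbitrary order, and it models the base table as a dict. All function and variable names are hypothetical.

```python
import heapq


def merge_region_results(per_region_keys):
    """Step 1: reduce per-region row-key lists into one sorted, de-duplicated list."""
    merged = []
    # Sort each region's keys locally, then k-way merge across regions.
    for key in heapq.merge(*(sorted(keys) for keys in per_region_keys)):
        if not merged or merged[-1] != key:
            merged.append(key)
    return merged


def intersect_sorted(a, b):
    """Step 2: intersect two sorted key lists in a single linear pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def index_join(index_a_regions, index_b_regions, base_table, row_filter):
    """Steps 3 and 4: fetch the surviving rows and apply a final filter."""
    keys_a = merge_region_results(index_a_regions)
    keys_b = merge_region_results(index_b_regions)
    candidates = intersect_sorted(keys_a, keys_b)
    return [(k, base_table[k]) for k in candidates if row_filter(base_table[k])]


# Hypothetical data: two indexes, each spread over two regions.
color_index = [["r5", "r1"], ["r3"]]
size_index = [["r3"], ["r7", "r5"]]
table = {
    "r1": {"price": 5},
    "r3": {"price": 10},
    "r5": {"price": 99},
    "r7": {"price": 20},
}

print(index_join(color_index, size_index, table, lambda row: row["price"] < 50))
# [('r3', {'price': 10})]
```

In this sketch the intersection is a cheap linear pass; the sorting and merging in `merge_region_results` is where the work goes, and in a real cluster that step also means pulling every region's results to one place.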
