HBase >> mail # user >> Coprocessor Increments


Re: Coprocessor Increments
On Tue, Oct 15, 2013 at 11:12 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Anil,
> > Agree with you. But, as per my knowledge and experience with coprocessors,
> > they are meant to be used for operations that are local to RS. Otherwise,
> > you are in danger of running into deadlocks, scalability issues.
>
>
> I also did a quick look at…  HBASE-7474…
>
> You start with the assumption that all of your data is within a single
> region.
>
No, I don't. That sorting CP works even if your scan spans multiple RSs; in
that case I do a merge sort on the client side. Please look at the code
closely. :)
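
To make the client-side step concrete, here is a minimal sketch of a k-way merge over partial results that are each already sorted (one list per region). It is not the HBASE-7474 code and uses no HBase API; the class name RegionMergeSketch and the method mergeSorted are hypothetical, and plain in-memory lists stand in for whatever each region returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: not the HBASE-7474 implementation and not an HBase API.
class RegionMergeSketch {

    // Current smallest unread value of one partial result, plus the rest of that result.
    private static final class Head<T> {
        final T value;
        final Iterator<T> rest;

        Head(T value, Iterator<T> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    // Merges lists that are each already sorted by `order` into one sorted list.
    static <T> List<T> mergeSorted(List<List<T>> perRegion, Comparator<? super T> order) {
        Comparator<Head<T>> byHeadValue = (a, b) -> order.compare(a.value, b.value);
        PriorityQueue<Head<T>> heap = new PriorityQueue<>(byHeadValue);

        // Seed the heap with the first element of every non-empty partial result.
        for (List<T> part : perRegion) {
            Iterator<T> it = part.iterator();
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }

        // Repeatedly take the smallest head and refill from the list it came from.
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head<T> smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.rest.hasNext()) {
                heap.add(new Head<>(smallest.rest.next(), smallest.rest));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three made-up "regions", each returning its rows already sorted.
        List<List<Integer>> perRegion = Arrays.asList(
            Arrays.asList(1, 4, 9),
            Arrays.asList(2, 3, 10),
            Arrays.asList(5));
        System.out.println(mergeSorted(perRegion, Comparator.<Integer>naturalOrder()));
        // prints [1, 2, 3, 4, 5, 9, 10]
    }
}
```

The point is only that each RS sorts its own rows locally, and the client does the cheap final merge across them.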

>
> IMHO, this is a very narrow window for use cases.
>
> Most use cases have data that crosses region boundaries.
>
> From a design perspective… limiting the use case to only within region…
> kinda kills the reason for coprocessors to exist. Even looking back at the
> implementation by Google, they don't appear to have this problem… errr
> limitation.
>
> Sorry… IMHO and YMMV.
>
>
> On Oct 14, 2013, at 3:25 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>
> > Inline.
> >
> >
> > On Mon, Oct 14, 2013 at 7:50 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> >> Anil,
> >>
> >> I wasn't suggesting that you can't do what you're doing, but you end up
> >> running into the risks which coprocessors are supposed to remove. The
> >> standard YMMV always applies.
> >>
> > Agree with you. But, as per my knowledge and experience with coprocessors,
> > they are meant to be used for operations that are local to RS. Otherwise,
> > you are in danger of running into deadlocks, scalability issues.
> >
> >>
> >> You have a cluster… another team in your company wants to use the cluster.
> >> So instead of the cluster being a single resource for your app/team, it now
> >> becomes a shared resource. So now you have people accessing HBase for
> >> multiple apps.
> >>
> > Well, it's a separation of responsibility in this case. We don't want teams
> > to step on each other's toes, and at the same time we want them to work well
> > as an ecosystem.
> > Rule: Other teams can use the same cluster. But they cannot write directly
> > into the tables that we own/control. If they want to write into our tables,
> > then they have to use our HBase Client.
> >
> >>
> >> You could then run multiple HBase HMasters with different locations for
> >> files, however… this can get messy.
> >> HOYA seems to suggest this as the future.  If so, then you have to wonder
> >> about data locality.
> >>
> > HOYA is not even in beta at present. So, right now we are not thinking
> > about it.
> >
> >>
> >> Having your app update the primary table and then the secondary index is
> >> always a good fallback, however you need to ensure that you understand the
> >> risks.
> >>
> > Agree, I understand that there is risk. But, you have to bite the bullet
> > when you are doing something that is not supported out of the box.  We also
> > use CP's wherever they are appropriate (like HBASE-7474).
> >
> >>
> >> With respect to secondary indexes… if you decouple the writes… you can get
> >> better throughput. Note that the code becomes a bit more complex because
> >> you're going to have to introduce a couple of different things.  But that's
> >> something for a different discussion…
> >>
> > Whether to use a CP or not depends on the use case. In my opinion, CP's are
> > really powerful and an awesome feature in HBase. But sometimes, if not used
> > properly (like creating a Cyclic Graph as per Tom's example), they might be
> > problematic.
> >
> >
> >>
> >> On Oct 13, 2013, at 10:15 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> >>
> >>> Inline.
> >>>
> >>> On Sun, Oct 13, 2013 at 6:02 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Ok…
> >>>>
> >>>> Sure you can have your app update the secondary index table.
> >>>> The only issue with that is if someone updates the base table outside of
> >>>> your app,
> >>>> they may or may not increment the secondary index.
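
The risk raised in the quoted text (a writer that bypasses the app leaves the secondary index out of date) can be shown with a small in-memory sketch. Everything here is hypothetical: DualWriteSketch, putViaApp, putDirect and lookup are made-up names, and two Java maps stand in for the base table and the index table. No HBase client calls are involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-ins for a base table and a secondary index table.
class DualWriteSketch {
    final Map<String, String> baseTable = new HashMap<>();        // rowKey -> value
    final Map<String, Set<String>> indexTable = new HashMap<>();  // value -> rowKeys

    // The application's write path: update the base row, then maintain the index.
    void putViaApp(String rowKey, String value) {
        String old = baseTable.put(rowKey, value);
        if (old != null && !old.equals(value)) {
            // Drop the index entry that pointed at the previous value, if any.
            Set<String> keys = indexTable.get(old);
            if (keys != null) {
                keys.remove(rowKey);
                if (keys.isEmpty()) {
                    indexTable.remove(old);
                }
            }
        }
        indexTable.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
    }

    // A writer that bypasses the application touches only the base table.
    void putDirect(String rowKey, String value) {
        baseTable.put(rowKey, value);
    }

    // Index lookup: which row keys does the index claim hold this value?
    Set<String> lookup(String value) {
        return indexTable.getOrDefault(value, new TreeSet<>());
    }

    public static void main(String[] args) {
        DualWriteSketch tables = new DualWriteSketch();
        tables.putViaApp("row1", "blue");
        tables.putDirect("row1", "red");           // bypasses the index maintenance
        System.out.println(tables.lookup("red"));  // prints []
        System.out.println(tables.lookup("blue")); // prints [row1], a stale entry
    }
}
```

After the direct write, the index no longer matches the base table: the new value cannot be found through the index, and the old value still points at the row. That is the case the "use our HBase Client" rule is meant to prevent.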
Thanks & Regards,
Anil Gupta