

Re: Coprocessor Increments
On Tue, Oct 15, 2013 at 11:12 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Anil,
> > Agree with you. But, as per my knowledge and experience with
> > coprocessors, they are meant to be used for operations that are local to RS.
> > Otherwise, you are in danger of running into deadlocks, scalability issues.
>
>
> I also did a quick look at…  HBASE-7474…
>
> You start with the assumption that all of your data is within a single
> region.
>
No, I don't. That sorting CP works even if your scan spans multiple RSs. In
that case I do a merge sort on the client side. Please look at the code
closely. :)
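
To illustrate the client-side merge step, here is a minimal sketch. It is not
the HBASE-7474 code. It assumes each RS returns its slice of the scan already
sorted, and the row shape `(row key, sort value)`, the function name, and the
sample data are all made up for the example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical row shape: (row key, value the results are sorted on).
Row = Tuple[str, int]


def merge_sorted_region_results(per_region: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Lazily k-way merge result streams that are each already sorted by value.

    Each inner iterable stands in for the sorted output of one region server.
    """
    return heapq.merge(*per_region, key=lambda row: row[1])


# Made-up sorted partial results from three region servers.
region_a = [("r1", 3), ("r4", 9)]
region_b = [("r2", 1), ("r7", 5), ("r9", 12)]
region_c = [("r5", 4)]

merged = list(merge_sorted_region_results([region_a, region_b, region_c]))
```

Since every input is already sorted, the client only needs to hold one
candidate row per RS at a time instead of re-sorting the whole result set.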

>
> IMHO, this is a very narrow window for use cases.
>
> Most use cases have data that crosses region boundaries.
>
> From a design perspective… limiting the use case to only within region…
> kinda kills the reason for coprocessors to exist. Even looking back at the
> implementation by Google, they don't appear to have this problem… errr
> limitation.
>
> Sorry… IMHO and YMMV.
>
>
> On Oct 14, 2013, at 3:25 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>
> > Inline.
> >
> >
> > On Mon, Oct 14, 2013 at 7:50 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> >> Anil,
> >>
> >> I wasn't suggesting that you can't do what you're doing, but you end up
> >> running into the risks which coprocessors are supposed to remove. The
> >> standard YMMV always applies.
> >>
> > Agree with you. But, as per my knowledge and experience with
> > coprocessors, they are meant to be used for operations that are local to RS.
> > Otherwise, you are in danger of running into deadlocks, scalability issues.
> >
> >>
> >> You have a cluster… another team in your company wants to use the
> >> cluster. So instead of the cluster being a single resource for your
> >> app/team, it now becomes a shared resource. So now you have people
> >> accessing HBase for multiple apps.
> >>
> > Well, it's a separation of responsibility in this case. We don't want
> > teams to step on each other's toes, and at the same time we want them to
> > work well as an ecosystem.
> > Rule: Other teams can use the same cluster. But they cannot write directly
> > into the tables that we own/control. If they want to write into our tables
> > then they have to use our HBase Client.
> >
> >>
> >> You could then run multiple HBase HMasters with different locations for
> >> files, however… this can get messy.
> >> HOYA seems to suggest this as the future.  If so, then you have to
> >> wonder about data locality.
> >>
> > HOYA is not even in beta at present. So, right now we are not thinking
> > about it.
> >
> >>
> >> Having your app update the primary table and then the secondary index is
> >> always a good fallback, however you need to ensure that you understand
> >> the risks.
> >>
> > Agree, I understand that there is risk. But, you have to bite the bullet
> > when you are doing something that is not supported out of the box. We
> > also use CPs wherever they are appropriate (like HBASE-7474).
> >
> >>
> >> With respect to secondary indexes… if you decouple the writes… you can
> >> get better throughput. Note that the code becomes a bit more complex
> >> because you're going to have to introduce a couple of different things.
> >> But that's something for a different discussion…
> >>
> > Whether to use a CP or not depends on the use case. In my opinion, CPs
> > are really powerful and an awesome feature in HBase. But sometimes, if
> > not used properly (like creating a Cyclic Graph as per Tom's example),
> > they might be problematic.
> >
> >
> >>
> >> On Oct 13, 2013, at 10:15 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> >>
> >>> Inline.
> >>>
> >>> On Sun, Oct 13, 2013 at 6:02 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Ok…
> >>>>
> >>>> Sure you can have your app update the secondary index table.
> >>>> The only issue with that is if someone updates the base table outside
> >>>> of your app,
> >>>> they may or may not increment the secondary index.
Thanks & Regards,
Anil Gupta