HBase, mail # user - timeouts with lots of coprocessor puts on single row


Re: timeouts with lots of coprocessor puts on single row
anil gupta 2013-08-27, 07:07
On Mon, Aug 26, 2013 at 10:56 PM, Olle Mårtensson <[EMAIL PROTECTED]> wrote:

> Thank you for the link, Anil; it was a good explanation indeed.
>
> >It's not recommended to do put/deletes across
> >region servers like this.
>
> That was not my intention. I want to keep the region for the aggregates and
> the aggregated values on the same server. I read in the link that you gave
> me that I can achieve this by using a coprocessor on the master, so I will
> try that out.
>
> >Try to move this aggregation to the client side
> >or at least outside the RS.
>
> This is what I am trying to avoid, since doing this would cause big data
> transfers between the client and the region server.
> The whole purpose of using the coprocessor is to push the aggregation work
> to the nodes where the data is local and to minimize data transfer between
> the nodes.
>
> Why do you think it's a bad idea to aggregate values inside the
> regionserver? Is it because it occupies RPC threads or because it's not a
> good use case for coprocessors?
>
I got the impression that your code is doing Inter-RS puts/gets from the
coprocessor.

> Do you think it's a bad idea even if I keep the regions for the two rows
> involved on the same regionserver and bypass RPC as the link suggests?
>
In that case it should be fine, in my opinion. I am not aware of how
heavy/complex your aggregations are. Obviously, the more complex your CP
(coprocessor) is, the more load you are putting on the RS.
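
A minimal sketch of the RPC-thread concern raised in the question above, in
plain Java rather than HBase code. It assumes (the thread does not confirm
this) that the put issued from postPut is served by the same fixed pool of
handler threads that is running postPut. The class name, the method name, and
the handler and put counts are hypothetical; a fixed thread pool stands in for
the regionserver's RPC handlers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HandlerStarvationSketch {

    /**
     * Simulates concurrentPuts client puts against a server with a fixed
     * number of handler threads. Each put runs a hook that sends a nested
     * put through the same handler pool and blocks until it is served.
     * Returns how many client puts completed within waitMs milliseconds.
     */
    public static int completedPuts(int handlers, int concurrentPuts, long waitMs)
            throws InterruptedException {
        ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
        // Make the puts overlap: none issues its nested put until as many
        // puts as can run at once have started.
        CountDownLatch started =
                new CountDownLatch(Math.min(handlers, concurrentPuts));
        CountDownLatch finished = new CountDownLatch(concurrentPuts);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < concurrentPuts; i++) {
            handlerPool.submit(() -> {
                started.countDown();
                started.await();
                // The hook: a nested put that needs a free handler thread.
                handlerPool.submit(() -> { }).get();
                completed.incrementAndGet();
                finished.countDown();
                return null;
            });
        }

        // Wait for all puts, but give up after waitMs if they are stuck.
        finished.await(waitMs, TimeUnit.MILLISECONDS);
        int result = completed.get();
        handlerPool.shutdownNow(); // interrupts any handlers still blocked
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Every handler waits on a nested put that no handler is free to run.
        System.out.println(completedPuts(4, 100, 500)); // prints 0
        // Spare handlers remain, so the nested puts are served.
        System.out.println(completedPuts(4, 2, 500));   // prints 2
    }
}
```

In this model, once the number of concurrent puts reaches the number of
handlers, every handler is blocked waiting for a nested put that nothing is
left to serve. That would be consistent with freezes that show up only under
many parallel puts, and it is why a write that stays on the same regionserver
and bypasses RPC does not need a second handler.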

>
> Thanks // Olle
>
>
> On Mon, Aug 26, 2013 at 5:43 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>
> > On Mon, Aug 26, 2013 at 7:27 AM, Olle Mårtensson <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > I have developed a coprocessor that extends BaseRegionObserver and
> > > implements the postPut method. The postPut method scans the columns of
> > > the row that the put was issued on and calculates an aggregate based on
> > > these values. When this is done, a row in another table is updated with
> > > the aggregated value.
> > >
> > This is an anti-pattern. It's not recommended to do put/deletes across
> > region servers like this. Try to move this aggregation to the client side
> > or at least outside the RS. Here is a link to a much more detailed
> > explanation of why this is not good: http://search-hadoop.com/m/XtAi5Fogw32
> >
> > > This works out fine until I put some stress on one row; then the threads
> > > on the regionserver hosting the table freeze while flushing the put of
> > > the aggregated value.
> > > The client application basically does 100 concurrent puts on one row in a
> > > tight loop (on the table where the coprocessor is activated).
> > > After that the client sleeps for a while and tries to fetch the
> > > aggregated value, and here the client freezes and periodically burps out
> > > exceptions.
> > > It works if I don't run so many puts in parallel.
> > >
> > > The HBase environment is pseudo-distributed 0.94.11 with one
> > > regionserver.
> > >
> > > I have tried using a connection pool in the coprocessor, increasing the
> > > heap size of the regionserver, and increasing the number of RPC threads
> > > for the regionserver, but without luck.
> > >
> > > The pseudocode for postPut would be something like this:
> > >
> > > vals = env.getRegion().get(get).getFamilyMap().values()
> > > agg_val = aggregate(vals)
> > > agg_table = env.getTable("aggregates")
> > > agg_table.setAutoFlush(false)
> > > put = new Put()
> > > put.add(agg_val)
> > > agg_table.put(put)
> > > agg_table.flushCommits()
> > > agg_table.close()
> > >
> > > And the real clojure variant is:
> > >
> > > https://gist.github.com/ollez/d0450930a591912aea5d#file-gistfile1-clj
> > >
> > > The hbase-site.xml:
> > >
> > > https://gist.github.com/ollez/d0450930a591912aea5d#file-hbase-site-xml
> > >
> > > The regionserver stacktrace:
> > >
> > > https://gist.github.com/ollez/d0450930a591912aea5d#file-regionserver-stacktrace
> > >
> > > The client exceptions:
> > >
> > > https://gist.github.com/ollez/d0450930a591912aea5d#file-client-exceptions

Thanks & Regards,
Anil Gupta