HBase, mail # user - Coprocessor / threading model


Re: Coprocessor / threading model
Varun Sharma 2013-01-15, 18:56
You should look at the jstack output - I think HTablePool is the reason for the
large number of threads. Note that HTablePool is a reusable pool of HTable(s),
and each HTable holds an ExecutorService containing 1 thread by
default. Are you closing the HTable you obtain from HTablePool? If you are
not closing it, your thread count will keep increasing.
Also, on 64-bit machines, I think each thread is allocated 256K or 512K of
stack by default.

Varun
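
The leak described above can be illustrated with a minimal sketch that uses only the JDK. It is not HBase code: `Handle`, `Pool`, and `handlesCreated` are hypothetical stand-ins for a pooled table handle that owns a single-thread ExecutorService, assuming that closing a handle returns it to the pool for reuse. It only shows why borrowing without closing creates one new worker thread per request.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolLeakSketch {

    /** Hypothetical stand-in for a pooled table handle owning one worker thread. */
    static final class Handle implements AutoCloseable {
        private final Pool owner;
        private final ExecutorService worker = Executors.newSingleThreadExecutor();

        Handle(Pool owner) {
            this.owner = owner;
        }

        /** Pretends to write a row by running a no-op task on the worker thread. */
        void put(String row) {
            Runnable noop = () -> { };
            try {
                worker.submit(noop).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }

        /** Closing returns the handle (and its thread) to the pool for reuse. */
        @Override
        public void close() {
            owner.giveBack(this);
        }

        void shutdown() {
            worker.shutdown();
        }
    }

    /** Hypothetical pool: reuses idle handles, creates a new one when none is idle. */
    static final class Pool {
        private final Deque<Handle> idle = new ArrayDeque<>();
        private final List<Handle> all = new ArrayList<>();

        synchronized Handle borrow() {
            Handle h = idle.poll();
            if (h == null) {
                h = new Handle(this);
                all.add(h);
            }
            return h;
        }

        synchronized void giveBack(Handle h) {
            idle.push(h);
        }

        synchronized int created() {
            return all.size();
        }

        synchronized void shutdownAll() {
            for (Handle h : all) {
                h.shutdown();
            }
        }
    }

    /** Runs sequential requests and reports how many handles (threads) were created. */
    static int handlesCreated(int requests, boolean closeAfterUse) {
        Pool pool = new Pool();
        try {
            for (int i = 0; i < requests; i++) {
                Handle h = pool.borrow();
                h.put("row-" + i);
                if (closeAfterUse) {
                    h.close();
                }
            }
            return pool.created();
        } finally {
            pool.shutdownAll();
        }
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + handlesCreated(50, true));
        System.out.println("never closed:     " + handlesCreated(50, false));
    }
}
```

In this model, closing after each use keeps a single handle in circulation, while skipping the close creates a fresh handle and worker thread for every request. A real pool's sizing and reuse policy may differ.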

On Tue, Jan 15, 2013 at 10:44 AM, Wei Tan <[EMAIL PROTECTED]> wrote:

> Andrew, could you explain more, why doing cross-table operation is an
> anti-pattern of using CP?
> Durability might be an issue, as far as I understand. Thanks,
>
>
> Best Regards,
> Wei
>
>
>
>
> From:   Andrew Purtell <[EMAIL PROTECTED]>
> To:     "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>,
> Date:   01/12/2013 09:39 PM
> Subject:        Re: Coprocessor / threading model
>
>
>
> > In pre-put, I trigger another Put() in an external table (to build the
> > secondary index).
>
> We should probably call this a Coprocessor anti-pattern.
>
> Coprocessors are meant to operate on the region with which they are
> associated. They are a way to extend HBase functionality while it operates
> in a region, on that region's data. Think of them as loadable kernel
> modules. They are not a general-purpose server-side platform for programming
> as if you were building an HBase client (with HTable, etc.). Just because you
> can do this doesn't mean you should.
>
>
> On Sat, Jan 12, 2013 at 4:06 PM, Adrien Mogenet
> <[EMAIL PROTECTED]> wrote:
>
> > Hi there,
> >
> > I'm experiencing some issues with CP. I'm trying to implement an indexing
> > solution (inspired by Anoop's slides). In pre-put, I trigger another Put()
> > in an external table (to build the secondary index). It works perfectly for
> > one client, but when I insert data from 2 separate clients, I run into
> > issues with the HTable object (the one used in pre-Put()), because it's not
> > thread-safe. I decided to move to HTablePool and that fixed my issue.
> >
> > But if I increase the write load (and concurrency), HBase throws an OOM
> > exception because it can't create new native threads. Looking at the HBase
> > "threads count" metric, I see that roughly 3500 threads are created.
> >
> > I'm looking for documentation about how CPs work with threads:
> > what/when should I protect against concurrency issues? How may I solve my
> > issue?
> >
> > Help is welcome :-)
> >
> > --
> > Adrien Mogenet
> > 06.59.16.64.22
> > http://www.mogenet.me
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>