HBase >> mail # user >> Coprocessor / threading model


Re: Coprocessor / threading model
You should look at the jstack output. I think HTablePool is the reason for the
large number of threads. Note that HTablePool is a reusable pool of HTable
instances, and each HTable owns an ExecutorService containing 1 thread by
default. Are you closing each HTable you obtain from HTablePool? If you are
not, your thread count will keep growing.
Also, on 64-bit JVMs, I think each thread reserves about 1 MB of stack by
default (HotSpot on Linux; tunable with -Xss), so thousands of threads add up
quickly.
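
To illustrate the point about closing pooled tables, here is a minimal JDK-only sketch. It does not use the HBase API: FakeTable and FakePool are hypothetical stand-ins that only model the behaviour described above (a table handle that owns one worker thread, and a pool that reuses handles only when they are handed back). Real HTablePool internals may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TablePoolSketch {

    /** Hypothetical stand-in for a table handle that owns one worker thread. */
    static final class FakeTable implements AutoCloseable {
        private final ExecutorService worker = Executors.newSingleThreadExecutor();
        private final FakePool owner;

        FakeTable(FakePool owner) {
            this.owner = owner;
        }

        /** Pretend write: runs on the handle's own worker thread and waits for it. */
        void put(String row) throws Exception {
            worker.submit(() -> { /* no real I/O in this model */ }).get();
        }

        /** Closing hands the handle back to the pool; it does not stop the worker. */
        @Override
        public void close() {
            owner.giveBack(this);
        }

        void shutdown() {
            worker.shutdown();
        }
    }

    /** Hypothetical pool: reuses idle handles, creates a new one when none is idle. */
    static final class FakePool {
        private final Deque<FakeTable> idle = new ArrayDeque<>();
        private final List<FakeTable> all = new ArrayList<>();

        synchronized FakeTable borrow() {
            FakeTable t = idle.poll();
            if (t == null) {
                t = new FakeTable(this);
                all.add(t);
            }
            return t;
        }

        synchronized void giveBack(FakeTable t) {
            idle.push(t);
        }

        /** Number of handles (and therefore worker threads) ever created. */
        synchronized int created() {
            return all.size();
        }

        synchronized void shutdownAll() {
            for (FakeTable t : all) {
                t.shutdown();
            }
        }
    }

    /** Borrows without closing: every request forces a new handle and thread. */
    static int runLeaky(int requests) throws Exception {
        FakePool pool = new FakePool();
        try {
            for (int i = 0; i < requests; i++) {
                FakeTable t = pool.borrow();
                t.put("row-" + i);
                // Missing t.close(): the handle never returns to the pool.
            }
            return pool.created();
        } finally {
            pool.shutdownAll();
        }
    }

    /** Borrows with try-with-resources: the same handle is reused each time. */
    static int runClosed(int requests) throws Exception {
        FakePool pool = new FakePool();
        try {
            for (int i = 0; i < requests; i++) {
                try (FakeTable t = pool.borrow()) {
                    t.put("row-" + i);
                }
            }
            return pool.created();
        } finally {
            pool.shutdownAll();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without close: " + runLeaky(50) + " handles created");
        System.out.println("with close:    " + runClosed(50) + " handles created");
    }
}
```

In this model the sequential run without close() creates 50 handles (one thread each), while the run with close() creates 1. With concurrent callers the closed variant would create at most as many handles as there are simultaneous borrowers.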

Varun

On Tue, Jan 15, 2013 at 10:44 AM, Wei Tan <[EMAIL PROTECTED]> wrote:

> Andrew, could you explain more about why doing cross-table operations is
> an anti-pattern for CPs?
> Durability might be an issue, as far as I understand. Thanks,
>
>
> Best Regards,
> Wei
>
>
>
>
> From:   Andrew Purtell <[EMAIL PROTECTED]>
> To:     "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>,
> Date:   01/12/2013 09:39 PM
> Subject:        Re: Coprocessor / threading model
>
>
>
> > In pre-put, I trigger another Put() in an external table (to build the
> > secondary index).
>
> We should probably call this a Coprocessor anti-pattern.
>
> Coprocessors are meant to operate on the region with which they are
> associated. They are a way to extend HBase functionality while it operates,
> inside the region, on that region's data. Think of them as loadable kernel
> modules.
> They are not a general-purpose server-side platform for programming as if
> you were building an HBase client (with HTable, etc.). Just because you can
> do this doesn't mean you should.
>
>
> On Sat, Jan 12, 2013 at 4:06 PM, Adrien Mogenet
> <[EMAIL PROTECTED]> wrote:
>
> > Hi there,
> >
> > I'm experiencing some issues with CPs. I'm trying to implement an indexing
> > solution (inspired by Annop's slides). In pre-put, I trigger another Put()
> > in an external table (to build the secondary index). It works perfectly for
> > one client, but when I insert data from 2 separate clients, I run into
> > issues with the HTable object (the one used in pre-Put()), because it's not
> > thread-safe. I decided to move to HTablePool and that fixed my issue.
> >
> > But if I increase the write load (and concurrency), HBase throws an OOM
> > exception because it can't create new native threads. Looking at the HBase
> > "threads count" metric, I see that roughly 3500 threads are created.
> >
> > I'm looking for documentation about how CPs work with threads:
> > what should I protect against concurrency issues, and when? How can I
> > solve my issue?
> >
> > Help is welcome :-)
> >
> > --
> > Adrien Mogenet
> > 06.59.16.64.22
> > http://www.mogenet.me
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>