Re: coprocessor enabled put very slow, help please~~~
A coprocessor is code running inside a server process. The resources
available and the rules of the road are different from client-side programming.
HTablePool (and HTable in general) is problematic for server-side
programming, in my opinion: http://search-hadoop.com/m/XtAi5Fogw32 Since
this comes up now and again, it seems a lightweight alternative for
server-side IPC could be useful.
On Tue, Feb 19, 2013 at 7:15 AM, Wei Tan <[EMAIL PROTECTED]> wrote:

> A side question: if using HTablePool is not encouraged... how do we
> handle thread safety when using HTable? Is any replacement for
> HTablePool planned?
> Thanks,
>
>
> Best Regards,
> Wei
>
>
>
>
> From:   Michel Segel <[EMAIL PROTECTED]>
> To:     "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>,
> Date:   02/18/2013 09:23 AM
> Subject:        Re: coprocessor enabled put very slow, help please~~~
>
>
>
> Why are you using an HTable Pool?
> Why are you closing the table after each iteration through?
>
> Try using 1 HTable object. Turn off WAL.
> Initialize it in start().
> Close it in stop().
> Surround the use in a try / catch.
> If an exception is caught, re-instantiate a new HTable connection.
>
> You may want to flush the connection after puts.
>
>
> Again not sure why you are using check and put on the base table. Your
> count could be off.
>
> As an example look at the poem/rhyme 'Mary Had a Little Lamb'.
> Then check your word count.
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Feb 18, 2013, at 7:21 AM, prakash kadel <[EMAIL PROTECTED]>
> wrote:
>
> > Thank you guys for your replies,
> > Michael,
> >   I think I didn't make it clear. Here is my use case:
> >
> > I have text documents to insert into HBase (with possible duplicates).
> > Suppose i have a document as : " I am working. He is not working"
> >
> > I want to insert this document to a table in hbase, say table "doc"
> >
> > doc table
> > -----
> > rowKey : doc_id
> > cf: doc_content
> > value: "I am working. He is not working"
> >
> > Now, I want to create another table that stores the word count, say "doc_idx"
> >
> > doc_idx table
> > ---
> > rowKey : I, cf: count, value: 1
> > rowKey : am, cf: count, value: 1
> > rowKey : working, cf: count, value: 2
> > rowKey : He, cf: count, value: 1
> > rowKey : is, cf: count, value: 1
> > rowKey : not, cf: count, value: 1
> >
> > My MR job code:
> > =============
> >
> > if(doc.checkAndPut(rowKey, doc_content, "", null, putDoc)) {
> >    for(String word : doc_content.split("\\s+")) {
> >       Increment inc = new Increment(Bytes.toBytes(word));
> >       inc.addColumn("count", "", 1);
> >    }
> > }
> >
> > Now, i wanted to do some experiments with coprocessors. So, i modified
> > the code as follows.
> >
> > My MR job code:
> > ==============
> >
> > doc.checkAndPut(rowKey, doc_content, "", null, putDoc);
> >
> > Coprocessor code:
> > ==============
> >
> >    public void start(CoprocessorEnvironment env)  {
> >        pool = new HTablePool(conf, 100);
> >    }
> >
> >    public boolean postCheckAndPut(c,  row,  family, byte[] qualifier,
> > compareOp,     comparator,  put, result) {
> >
> >                if(!result) return true; // check if the put succeeded
> >
> >        HTableInterface table_idx = pool.getTable("doc_idx");
> >
> >        try {
> >
> >            for(KeyValue contentKV : put.get("doc_content", "")) {
> >                for(String word :
> > Bytes.toString(contentKV.getValue()).split("\\s+")) {
> >                    Increment inc = new
> > Increment(Bytes.toBytes(word));
> >                    inc.addColumn("count", "", 1);
> >                    table_idx.increment(inc);
> >                }
> >            }
> >        } finally {
> >            table_idx.close();
> >        }
> >        return true;
> >    }
> >
> >    public void stop(env) {
> >        pool.close();
> >    }
> >
> > I am a newbie to HBase. I am not sure this is the right way to do it.
> > Given that, why is the coprocessor enabled version much slower than
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
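
For illustration only, here is a minimal sketch of one way to cut down the number of increment calls in the quoted postCheckAndPut: count the words of a document in memory first, so that each distinct word needs a single increment carrying its total. The class name WordCountBatcher and the method countWords are hypothetical and not from the thread. Splitting on non-word characters instead of whitespace is also an assumption, made so that "working." and "working" land on the same row. The sketch is plain Java with no HBase dependency, so the actual table calls are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase or of the original thread.
public class WordCountBatcher {

    // Counts each distinct word in the text. Splitting on "\\W+" (runs of
    // characters outside [a-zA-Z_0-9]) is an assumption: it drops ASCII
    // punctuation but would also split words containing non-ASCII letters.
    static Map<String, Long> countWords(String text) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String word : text.split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            counts.merge(word, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countWords("I am working. He is not working");
        // One entry per distinct word; each entry would map to one increment.
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In the coprocessor, each map entry would then become one Increment whose amount is the entry's value rather than 1, so a word that appears many times in a document costs one call instead of many.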