HBase >> mail # user >> coprocessor enabled put very slow, help please~~~


Re: coprocessor enabled put very slow, help please~~~
Good question.

You create a class MyRO.

How many instances of MyRO exist per RS?

How many queries can access the MyRO instance at the same time?
On Feb 19, 2013, at 9:15 AM, Wei Tan <[EMAIL PROTECTED]> wrote:

> A side question: if using HTablePool is discouraged, how should we
> handle thread safety when using HTable? Is any replacement for
> HTablePool planned?
> Thanks,
>
>
> Best Regards,
> Wei
>
>
>
>
> From:   Michel Segel <[EMAIL PROTECTED]>
> To:     "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>,
> Date:   02/18/2013 09:23 AM
> Subject:        Re: coprocessor enabled put very slow, help please~~~
>
>
>
> Why are you using an HTable Pool?
> Why are you closing the table after each iteration through?
>
> Try using 1 HTable object. Turn off WAL.
> Initialize it in start().
> Close it in stop().
> Surround its use in a try/catch.
> If an exception is caught, re-instantiate a new HTable connection.
>
> You may want to flush the connection after puts.
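
A minimal sketch of the lifecycle suggested above: open one handle in start(), reuse it for every call, reopen it if a call fails, and close it in stop(). To keep the example self-contained it does not use HBase classes. `IndexTable` and `ReusableTable` are hypothetical names standing in for an HTable-like client, and the sketch throws `UncheckedIOException` where a real client would throw a checked `IOException`. HTable is documented as not thread-safe, so the sketch serializes access with `synchronized`; that choice is an assumption, not something the thread prescribes.

```java
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a table client such as HTable; not an HBase type. */
interface IndexTable {
    void increment(String word); // may throw UncheckedIOException
    void close();
}

/** Holds one long-lived table handle and reopens it if a call fails. */
class ReusableTable {
    private final Supplier<IndexTable> factory; // knows how to open a new handle
    private IndexTable table;
    private int opens = 0;

    ReusableTable(Supplier<IndexTable> factory) {
        this.factory = factory;
    }

    /** Call once, for example from the coprocessor's start(). */
    synchronized void start() {
        table = factory.get();
        opens++;
    }

    /** Call once, for example from the coprocessor's stop(). */
    synchronized void stop() {
        if (table != null) {
            table.close();
            table = null;
        }
    }

    /** Uses the shared handle; on failure, replaces it and retries once. */
    synchronized void increment(String word) {
        try {
            table.increment(word);
        } catch (UncheckedIOException e) {
            try {
                table.close();
            } catch (UncheckedIOException ignored) {
                // the old handle is already unusable
            }
            table = factory.get();
            opens++;
            table.increment(word); // a second failure propagates to the caller
        }
    }

    /** Number of handles opened so far (1 unless a failure forced a reopen). */
    synchronized int openCount() {
        return opens;
    }
}
```

Holding a single lock means concurrent callers wait for each other, which is the trade-off for sharing one handle.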
>
>
> Again, I'm not sure why you are using check and put on the base table. Your
> count could be off.
>
> As an example, look at the nursery rhyme 'Mary Had a Little Lamb'.
> Then check your word count.
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Feb 18, 2013, at 7:21 AM, prakash kadel <[EMAIL PROTECTED]>
> wrote:
>
>> Thank you guys for your replies.
>> Michael,
>>  I think I didn't make it clear. Here is my use case:
>>
>> I have text documents to insert into HBase (with possible duplicates).
>> Suppose I have a document such as: "I am working. He is not working"
>>
>> I want to insert this document into a table in HBase, say table "doc".
>>
>> doc table
>> -----
>> rowKey : doc_id
>> cf: doc_content
>> value: "I am working. He is not working"
>>
>> Now, I want to create another table that stores the word counts, say "doc_idx".
>>
>> doc_idx table
>> ---
>> rowKey : I, cf: count, value: 1
>> rowKey : am, cf: count, value: 1
>> rowKey : working, cf: count, value: 2
>> rowKey : He, cf: count, value: 1
>> rowKey : is, cf: count, value: 1
>> rowKey : not, cf: count, value: 1
>>
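>> My MR job code:
>> =============
>>
>> // doc and docIdx are table handles for "doc" and "doc_idx"; content is the document text
>> if (doc.checkAndPut(rowKey, Bytes.toBytes("doc_content"), Bytes.toBytes(""), null, putDoc)) {
>>   for (String word : content.split("\\s+")) {
>>      Increment inc = new Increment(Bytes.toBytes(word));
>>      inc.addColumn(Bytes.toBytes("count"), Bytes.toBytes(""), 1L);
>>      docIdx.increment(inc); // apply the increment to doc_idx
>>   }
>> }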
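
A small, HBase-free sketch of the tokenization step, in case it helps when checking counts. The doc_idx listing above expects "working" to have a count of 2, but splitting only on whitespace leaves "working." and "working" as different tokens. The class and method names here are hypothetical, and stripping leading and trailing punctuation is just one possible normalization, not necessarily what the poster intended.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WordCount {
    /** Counts tokens split on whitespace only, as the snippet in the thread does. */
    static Map<String, Integer> countRaw(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Same as countRaw, but strips leading and trailing punctuation from each token first. */
    static Map<String, Integer> countNormalized(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            String word = token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

With the example document, `countRaw` yields separate entries for "working." and "working", while `countNormalized` yields a single "working" entry with a count of 2, matching the doc_idx listing.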
>>
>> Now, i wanted to do some experiments with coprocessors. So, i modified
>> the code as follows.
>>
>> My MR job code:
>> ==============
>>
>> doc.checkAndPut(rowKey, doc_content, "", null, putDoc);
>>
>> Coprocessor code:
>> ==============
>>
>>   public void start(CoprocessorEnvironment env)  {
>>       // keeps up to 100 pooled references per table
>>       pool = new HTablePool(conf, 100);
>>   }
>>
>>   public boolean postCheckAndPut(c,  row,  family, byte[] qualifier,
>>           compareOp,  comparator,  put, result) {
>>
>>       if (!result) return false; // the put was not applied, so there is nothing to index
>>
>>       HTableInterface table_idx = pool.getTable("doc_idx");
>>
>>       try {
>>           for (KeyValue contentKV : put.get(Bytes.toBytes("doc_content"), Bytes.toBytes(""))) {
>>               String content = Bytes.toString(contentKV.getValue());
>>               for (String word : content.split("\\s+")) {
>>                   Increment inc = new Increment(Bytes.toBytes(word));
>>                   inc.addColumn(Bytes.toBytes("count"), Bytes.toBytes(""), 1L);
>>                   table_idx.increment(inc); // one increment call per word
>>               }
>>           }
>>       } finally {
>>           table_idx.close(); // hands the table back to the pool
>>       }
>>       return true;
>>   }
>>
>>   public void stop(env) {
>>       pool.close();
>>   }
>>
>> I am a newbie to HBase. I am not sure this is the right way to do it.
>> Given that, why is the coprocessor-enabled version much slower than
>> the one without?
>>
>>
>> Sincerely,
>> Prakash Kadel
>>
>>
>> On Mon, Feb 18, 2013 at 9:11 PM, Michael Segel
>> <[EMAIL PROTECTED]> wrote:
>>>
>>> The  issue I was talking about was the use of a check and put.
>>> The OP wrote:
>>>>>>> each map inserts to doc table.(checkAndPut)