HBase >> mail # user >> coprocessor enabled put very slow, help please~~~


Re: coprocessor enabled put very slow, help please~~~
I don't agree with Lars on the second half of his statement.

Yes, there will be a performance hit when you go across regions because you're now going across the network to a second machine.
However, I disagree that it defeats the performance purpose.

In a Hadoop cluster, we tend to launch our jobs from an edge server. However, with HBase, you can connect to the cluster from a remote client and still run queries against the data outside of traditional M/R.

So doing the work inside the cluster is less expensive than making a round trip back to the client.

In addition, there is no concept of a transaction. All put()s are atomic. So you write to your base table: one atomic write. You write to your index table(s): each index update is atomic. (This assumes you may have multiple indexes on your base table.)
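
To make the "no transaction" point concrete, here is a rough sketch in plain Java. It does not use the HBase client API. The class, the method names, and the "author|docId" index key layout are all made up for illustration, and the two tables are just in-memory maps. It only shows that the base write and the index write are independent single-row writes, so a failure between them can leave the index behind the base table.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a base table and an index table (hypothetical, not the HBase API). */
public class IndexWriteSketch {
    // Each "table" maps row key -> value. One put touches exactly one row.
    final Map<String, String> docTable = new TreeMap<>();
    final Map<String, String> indexTable = new TreeMap<>();

    /** A single-row write: the row is either written or it is not. */
    static void put(Map<String, String> table, String rowKey, String value) {
        table.put(rowKey, value);
    }

    /**
     * Writes the base row, then the index row, as two independent puts.
     * failAfterBase simulates a crash between the two writes.
     */
    void writeWithIndex(String docId, String author, boolean failAfterBase) {
        put(docTable, docId, author);
        if (failAfterBase) {
            throw new IllegalStateException("simulated failure before index update");
        }
        // Assumed index key layout: indexed value, separator, doc id.
        put(indexTable, author + "|" + docId, docId);
    }

    /** True when every base row has its matching index row. */
    boolean indexIsConsistent() {
        for (Map.Entry<String, String> e : docTable.entrySet()) {
            if (!indexTable.containsKey(e.getValue() + "|" + e.getKey())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IndexWriteSketch s = new IndexWriteSketch();
        s.writeWithIndex("doc1", "alice", false);
        System.out.println("after clean write: " + s.indexIsConsistent());
        try {
            s.writeWithIndex("doc2", "bob", true);
        } catch (IllegalStateException ex) {
            // Base row for doc2 exists, index row does not.
        }
        System.out.println("after failed index write: " + s.indexIsConsistent());
    }
}
```

Nothing rolls back the base row when the index write fails, so whatever issues the second put (client, M/R job, or coprocessor) has to retry or repair the index on its own.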

It's important to remember that coprocessors are really, really new. As Andrew points out, they're not recommended for the novice.
On Feb 17, 2013, at 8:31 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> The main advantage of coprocessors is that they keep the logic local to the region server. Putting data into other region servers is supported, but defeats the performance purpose.
>
>
>
> ________________________________
> From: Prakash Kadel <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Sunday, February 17, 2013 5:26 PM
> Subject: Re: coprocessor enabled put very slow, help please~~~
>
> thanks again,
>   I did try building the indexes with MR. I don't have exact evaluation data, but inserting the indexes directly with MapReduce does seem to be much, much faster than building them with the coprocessors. I guess I am missing the point of the coprocessors.
> My reason for trying out the coprocessor was to make the insertion code cleaner and the index creation more efficient.
>
> Sincerely,
> Prakash Kadel
>
> On Feb 18, 2013, at 10:17 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Index maintenance will always be slower. An interesting comparison would be to also update your indexes from the M/R and see whether that performs better.
>>
>>
>>
>> ________________________________
>> From: Prakash Kadel <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> Sent: Sunday, February 17, 2013 5:13 PM
>> Subject: Re: coprocessor enabled put very slow, help please~~~
>>
>> thank you lars,
>> That is my guess too. I am confused: isn't that something that cannot be controlled? Is this approach of creating some kind of index wrong?
>>
>> Sincerely,
>> Prakash Kadel
>>
>> On Feb 18, 2013, at 10:07 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>>
>>> Presumably the coprocessor issues Puts to another region server in most cases, that could explain it being (much) slower.
>>>
>>>
>>>
>>> ________________________________
>>> From: Prakash Kadel <[EMAIL PROTECTED]>
>>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>> Sent: Sunday, February 17, 2013 4:52 PM
>>> Subject: Re: coprocessor enabled put very slow, help please~~~
>>>
>>> Forgot to mention. I am using 0.92.
>>>
>>> Sincerely,
>>> Prakash
>>>
>>> On Feb 18, 2013, at 9:48 AM, Prakash Kadel <[EMAIL PROTECTED]> wrote:
>>>
>>>> hi,
>>>>      I am trying to insert a few million documents into HBase with MapReduce. To enable quick search of the docs I want to have some indexes, so I tried using coprocessors, but they are slowing down my inserts. Aren't coprocessors supposed to not increase latency?
>>>> my settings:
>>>>       3 region servers
>>>>      60 maps
>>>> Each map inserts into the doc table (checkAndPut).
>>>> The RegionObserver coprocessor does a postCheckAndPut and inserts some rows into an index table.
>>>>
>>>>
>>>> Sincerely,
>>>> Prakash

Michael Segel  | (m) 312.755.9623

Segel and Associates