HBase >> mail # user >> Client Get vs Coprocessor scan performance


Ted Yu 2013-08-09, 05:44
Re: Client Get vs Coprocessor scan performance
Ted, can you elaborate a little on why this issue boosts performance?
I couldn't figure out from the issue comments whether execCoprocessor scans
the entire .META. table or an entire user table, so I can't tell what the
actual improvement is.

Thanks!
On Fri, Aug 9, 2013 at 8:44 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> I think you need HBASE-6870 which went into 0.94.8
>
> Upgrading should boost coprocessor performance.
>
> Cheers
>
> On Aug 8, 2013, at 10:21 PM, Kiru Pakkirisamy <[EMAIL PROTECTED]>
> wrote:
>
> > Ted,
> > Here is the method signature/protocol
> > public Map<String, Double> getFoo(Map<String, Double> input,
> > int topN) throws IOException;
> >
> > There are 31 regions on 4 nodes X 8 CPU.
> > I am on 0.94.6 (from Hortonworks).
> > It seems to behave the way linwukang describes: it is almost a
> > full table scan in the coprocessor.
> > In fact, when I set more specific ColumnPrefixFilters, performance went
> > down.
> > I want to do things on the server side because I don't want to
> > send 500K columns/values to the client.
> > I cannot believe that a single-threaded client doing some calculations
> > and a group-by beats the coprocessor running in 31 regions.
> >
> > Regards,
> > - kiru
> >
> >
> > Kiru Pakkirisamy | webcloudtech.wordpress.com
> >
> >
> > ________________________________
> > From: Ted Yu <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; Kiru Pakkirisamy <[EMAIL PROTECTED]>
> > Sent: Thursday, August 8, 2013 8:40 PM
> > Subject: Re: Client Get vs Coprocessor scan performance
> >
> >
> > Can you give us a bit more information ?
> >
> > How do you deliver the 55 rowkeys to your endpoint ?
> > How many regions do you have for this table ?
> >
> > What HBase version are you using ?
> >
> > Thanks
> >
> > On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy
> > <[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >> I am seeing odd behavior: Coprocessor performance lags behind a
> >> client-side Get.
> >> I have a table with 500000 rows. Each row has a variable number of columns
> >> in one column family (in this case about 600000 columns in total are
> >> processed).
> >> When I try to get 55 specific rows, the client side completes in half the
> >> time of the coprocessor endpoint.
> >> I am using 55 RowFilters on the Coprocessor scan side. The rows are
> >> processed in exactly the same way in both cases.
> >> Any pointers on how to debug this scenario?
> >>
> >> Regards,
> >> - kiru
> >>
> >>
> >> Kiru Pakkirisamy | webcloudtech.wordpress.com
>
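
The "almost a full table scan" observation in the thread can be illustrated with a small, self-contained sketch. It does not use the HBase API. A `TreeMap` stands in for one sorted region, and the class and method names (`ScanVsGetSketch`, `rowsExaminedByFilteredScan`, `rowsExaminedByPointGets`) are hypothetical. The assumption being modelled is that a scan carrying only row filters, with no start/stop row, must test every row it passes over, while a Get seeks directly to its key. Real HBase behaviour also depends on region pruning, block caching, and the 31-way parallelism mentioned above, none of which is modelled here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class ScanVsGetSketch {

    /** Builds a sorted in-memory stand-in for a table: rowkey -> value. */
    static TreeMap<String, Double> buildTable(int rows) {
        TreeMap<String, Double> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%06d", i), (double) i);
        }
        return table;
    }

    /**
     * Filter-only scan: visits every row and keeps those whose key matches.
     * Returns the number of rows examined, which is the whole table.
     */
    static int rowsExaminedByFilteredScan(TreeMap<String, Double> table,
                                          Set<String> wanted,
                                          Map<String, Double> out) {
        int examined = 0;
        for (Map.Entry<String, Double> e : table.entrySet()) {
            examined++;
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return examined;
    }

    /**
     * Point lookups: one direct lookup per wanted key.
     * Returns the number of lookups performed.
     */
    static int rowsExaminedByPointGets(TreeMap<String, Double> table,
                                       Set<String> wanted,
                                       Map<String, Double> out) {
        int examined = 0;
        for (String key : wanted) {
            examined++;
            Double value = table.get(key);
            if (value != null) {
                out.put(key, value);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        // Sizes taken from the thread: 500000 rows, 55 wanted rowkeys.
        TreeMap<String, Double> table = buildTable(500_000);
        Set<String> wanted = new HashSet<>();
        for (int i = 0; i < 55; i++) {
            wanted.add(String.format("row%06d", i * 9000));
        }

        Map<String, Double> viaScan = new HashMap<>();
        Map<String, Double> viaGets = new HashMap<>();
        int scanned = rowsExaminedByFilteredScan(table, wanted, viaScan);
        int looked = rowsExaminedByPointGets(table, wanted, viaGets);

        System.out.println("filtered scan examined " + scanned + " rows");
        System.out.println("point gets examined    " + looked + " rows");
        System.out.println("same result: " + viaScan.equals(viaGets));
    }
}
```

Both paths return the same 55 rows, but the filter-only scan examines every row to find them. If this model applies, bounding the scan to the requested keys (or issuing Gets from inside the endpoint) would be one thing to try before attributing the gap to coprocessor overhead.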