HBase >> mail # user >> Hbase Performance Issue


Akhtar Muhammad Din 2014-01-04, 20:17
Ted Yu 2014-01-04, 20:24
Akhtar Muhammad Din 2014-01-04, 20:44
Vladimir Rodionov 2014-01-04, 20:55
Ted Yu 2014-01-04, 21:00
Kevin Odell 2014-01-04, 21:19
Akhtar Muhammad Din 2014-01-04, 21:34
Ted Yu 2014-01-04, 22:33
Vladimir Rodionov 2014-01-05, 01:12
Nicolas Liochon 2014-01-06, 10:45
Doug Meil 2014-01-06, 19:14
Mike Axiak 2014-01-06, 19:42
Suraj Varma 2014-01-07, 23:53
Akhtar Muhammad Din 2014-01-09, 20:27
Re: Hbase Performance Issue
Could you give us a region server log to look at during a job?
On Jan 4, 2014 4:35 PM, "Akhtar Muhammad Din" <[EMAIL PROTECTED]> wrote:

> Thanks guys for your precious time.
> Vladimir, as Ted rightly said, I currently want to improve write performance
> (of course I want to read the data as fast as possible later on).
> Kevin, my current understanding of bulk load is that you generate
> StoreFiles and later load them through a command-line program. I don't want
> any manual step. Our system receives data every 15 minutes, so the
> requirement is to automate it completely through the client API.
>
>
>
> On Sun, Jan 5, 2014 at 2:19 AM, Kevin O'dell <[EMAIL PROTECTED]> wrote:
>
> > Have you tried writing out an hfile and then bulk loading the data?
> > On Jan 4, 2014 4:01 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> >
> > > bq. Output is written to either Hbase
> > >
> > > Looks like Akhtar wants to boost write performance to HBase.
> > > MapReduce over snapshot files targets higher read throughput.
> > >
> > > Cheers
> > >
> > >
> > > On Sat, Jan 4, 2014 at 12:55 PM, Vladimir Rodionov
> > > <[EMAIL PROTECTED]> wrote:
> > >
> > > > You can try MapReduce over snapshot files
> > > > https://issues.apache.org/jira/browse/HBASE-8369
> > > >
> > > > but you will need to patch 0.94.
> > > >
> > > > Best regards,
> > > > Vladimir Rodionov
> > > > Principal Platform Engineer
> > > > Carrier IQ, www.carrieriq.com
> > > > e-mail: [EMAIL PROTECTED]
> > > >
> > > > ________________________________________
> > > > From: Akhtar Muhammad Din [[EMAIL PROTECTED]]
> > > > Sent: Saturday, January 04, 2014 12:44 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: Hbase Performance Issue
> > > >
> > > > I'm using CDH 4.5:
> > > > Hadoop:  2.0.0-cdh4.5.0
> > > > HBase:   0.94.6-cdh4.5.0
> > > >
> > > > Regards
> > > >
> > > >
> > > > On Sun, Jan 5, 2014 at 1:24 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > What version of HBase / HDFS are you running?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Jan 4, 2014 at 12:17 PM, Akhtar Muhammad Din
> > > > > <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi,
> > > > > > > I have been running a MapReduce job that joins 2 datasets of 1.3
> > > > > > > and 4 GB in size. The join is done on the reduce side. Output is
> > > > > > > written to either HBase or HDFS depending on configuration. The
> > > > > > > problem I am having is that HBase takes about 60-80 minutes to
> > > > > > > write the processed data, whereas HDFS takes only 3-5 minutes to
> > > > > > > write the same data. I really want to improve the HBase speed and
> > > > > > > bring it down to 1-2 minutes.
> > > > > >
> > > > > > > I am using Amazon EC2 instances. I launched a cluster of size 3 and
> > > > > > > later of size 10, and have tried both c3.4xlarge and c3.8xlarge
> > > > > > > instances.
> > > > > >
> > > > > > > I see a significant increase in HDFS write performance as I use
> > > > > > > clusters with more nodes and higher specifications, but with HBase
> > > > > > > there was no significant change in performance.
> > > > > >
> > > > > > > I have gone through various posts and articles and have read the
> > > > > > > HBase book to solve the HBase performance issue, but have not
> > > > > > > succeeded so far.
> > > > > > > Here are the things I have tried so far:
> > > > > >
> > > > > > *Client Side*
> > > > > > - Turned off writing to WAL
> > > > > > - Experimented with write buffer size
> > > > > > - Turned off auto flush on table
> > > > > > - Used cache, experimented with different sizes
> > > > > >
> > > > > >
> > > > > > *Hbase Server Side*
> > > > > > - Increased region servers heap size to 8 GB
> > > > > > - Experimented with handlers count
> > > > > > - Increased Memstore flush size to 512 MB
> > > > > > - Experimented with hbase.hregion.max.filesize, tried different
kiran 2014-09-06, 18:30