HBase, mail # user - Hbase Performance Issue


Re: Hbase Performance Issue
Suraj Varma 2014-01-07, 23:53
Akhtar:
There is no manual step for bulk load. You essentially have your script
that runs the map reduce job that creates the HFiles. On success of this
script/command, you run the completebulkload command ... the whole bulk
load can be automated, just like your map reduce job.
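
As a rough sketch of that automation (not a definitive implementation), a small
Python wrapper could run the HFile-generating job and call completebulkload
only if the job succeeds. The LoadIncrementalHFiles class is the tool behind
completebulkload; the job command, paths and table name below are hypothetical
placeholders, and the function names are made up for this example.

```python
import subprocess

# HBase tool class behind the `completebulkload` command.
COMPLETE_BULKLOAD_CLASS = "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles"


def build_completebulkload_cmd(hfile_dir, table):
    """Build the command line that moves the generated HFiles into the table."""
    return ["hbase", COMPLETE_BULKLOAD_CLASS, hfile_dir, table]


def run_bulk_load(job_cmd, hfile_dir, table, run=subprocess.call):
    """Run the HFile-generating job, then completebulkload only on success.

    `run` takes a command list and returns its exit status; it defaults to
    subprocess.call and can be swapped out for testing.
    Returns True if both steps exited with status 0.
    """
    if run(job_cmd) != 0:
        return False  # job failed: skip the load and leave the table untouched
    return run(build_completebulkload_cmd(hfile_dir, table)) == 0


# Example call (hypothetical jar, driver class, paths and table name):
#   job = ["hadoop", "jar", "my-ingest.jar", "com.example.HFileJob",
#          "/data/incoming", "/tmp/hfiles"]
#   ok = run_bulk_load(job, "/tmp/hfiles", "mytable")
```

A scheduler such as cron could invoke a script like this every 15 minutes, so
no step has to be run by hand.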

--Suraj
On Mon, Jan 6, 2014 at 11:42 AM, Mike Axiak <[EMAIL PROTECTED]> wrote:

> I suggest you use hannibal [1] to check the distribution of the data
> on your cluster:
>
> 1: https://github.com/sentric/hannibal
>
>
> On Mon, Jan 6, 2014 at 2:14 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
>
> >
> > In addition to what everybody else said, look at *where* the regions are
> > for the target table.  There may be 5 regions (for example), but check
> > whether they are all on the same RS.
> >
> >
> >
> >
> >
> > On 1/6/14 5:45 AM, "Nicolas Liochon" <[EMAIL PROTECTED]> wrote:
> >
> > >It's very strange that you don't see a perf improvement when you increase
> > >the number of nodes.
> > >Did nothing you've tried change the performance in the end?
> > >
> > >You may want to check:
> > > - The number of regions for this table. Are all the region servers busy?
> > >   Does the table have any splits?
> > > - How much data you actually write. Is compression enabled on this table?
> > > - Whether you have compactions. You may want to change the max store file
> > >   settings for an infrequent write load (see
> > >   http://gbif.blogspot.fr/2012/07/optimizing-writes-in-hbase.html).
> > >
> > >It would also be interesting to test the 0.96 release.
> > >
> > >
> > >
> > >On Sun, Jan 5, 2014 at 2:12 AM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
> > >
> > >>
> > >> I think in this case, writing data to HDFS or HFile directly (for
> > >> subsequent bulk loading)
> > >> is the best option. HBase will never compete in write speed with HDFS.
> > >>
> > >> Best regards,
> > >> Vladimir Rodionov
> > >> Principal Platform Engineer
> > >> Carrier IQ, www.carrieriq.com
> > >> e-mail: [EMAIL PROTECTED]
> > >>
> > >> ________________________________________
> > >> From: Ted Yu [[EMAIL PROTECTED]]
> > >> Sent: Saturday, January 04, 2014 2:33 PM
> > >> To: [EMAIL PROTECTED]
> > >> Subject: Re: Hbase Performance Issue
> > >>
> > >> There're 8 items under:
> > >> http://hbase.apache.org/book.html#perf.writing
> > >>
> > >> I guess you have gone through all of them :-)
> > >>
> > >>
> > >> On Sat, Jan 4, 2014 at 1:34 PM, Akhtar Muhammad Din <[EMAIL PROTECTED]> wrote:
> > >>
> > >> > Thanks guys for your precious time.
> > >> > Vladimir, as Ted rightly said, I want to improve write performance
> > >> > currently (of course I want to read data as fast as possible later on).
> > >> > Kevin, my current understanding of bulk load is that you generate
> > >> > StoreFiles and later load them through a command line program. I don't
> > >> > want to do any manual step. Our system receives data every 15 minutes,
> > >> > so the requirement is to automate it completely through the client API.
> > >> >
> > >> >
> > >>
> > >>
> >
> >
>