HBase >> mail # user >> Poor HBase map-reduce scan performance


Re: Poor HBase map-reduce scan performance
This happens when your Java process is running in debug mode and
the suspend=y option is set.

Regards
Ram
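Ram's point can be illustrated with the JDWP flags involved; the port number
and exact option strings below are a sketch of a typical debug configuration,
not taken from the thread:

```shell
# Typical JDWP agent strings (address 8000 is an arbitrary example).
# A JVM launched with suspend=y halts until a debugger attaches, and a
# jps run during that window can report "Could not synchronize with target".
JDWP_SUSPENDED="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
JDWP_RUNNING="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000"

# e.g. export HADOOP_OPTS="$JDWP_RUNNING"  # debug without suspending startup
echo "$JDWP_SUSPENDED"
```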
On Wed, May 1, 2013 at 12:55 PM, Naidu MS
<[EMAIL PROTECTED]> wrote:

> Hi, I have two questions regarding HDFS and the jps utility.
>
> I am new to Hadoop and started learning it in the past week.
>
> 1. Whenever I run start-all.sh and then jps in the console, it shows the
> processes that were started:
>
> naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
> 22283 NameNode
> 23516 TaskTracker
> 26711 Jps
> 22541 DataNode
> 23255 JobTracker
> 22813 SecondaryNameNode
> Could not synchronize with target
>
> But along with the list of processes started, it always shows "Could not
> synchronize with target" in the jps output. What is meant by "Could not
> synchronize with target"? Can someone explain why this is happening?
>
>
> 2. Is it possible to format the namenode multiple times? When I enter the
> namenode -format command, it does not format the namenode and shows the
> following output:
>
> naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
> Warning: $HADOOP_HOME is deprecated.
>
> 13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
> /*************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = naidu/127.0.0.1
> STARTUP_MSG:   args = [-format]
> STARTUP_MSG:   version = 1.0.4
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
> *************************************************************/
> Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
> Format aborted in /home/naidu/dfs/namenode
> 13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
> /*************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
> *************************************************************/
>
> Can someone help me understand this? Why is it not possible to format the
> namenode multiple times?
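One detail the thread never answers: in Hadoop 1.x the re-format prompt is
case-sensitive and accepts only an uppercase Y, so the lowercase "y" typed in
the transcript above is treated as a refusal and the format is aborted. A tiny
shell sketch of that case-sensitive check (confirm_format is a hypothetical
stand-in for the real prompt logic, not Hadoop code):

```shell
# Mimic the namenode prompt's case-sensitive confirmation check.
confirm_format() {
  read -r answer
  if [ "$answer" = "Y" ]; then
    echo "formatting"
  else
    echo "Format aborted"
  fi
}

echo "y" | confirm_format   # prints "Format aborted"
echo "Y" | confirm_format   # prints "formatting"
```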
>
>
> On Wed, May 1, 2013 at 12:22 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>
> > Not that it's a long-term solution, but try major-compacting before
> running
> > the benchmark.  If the LSM tree is CPU bound in merging HFiles/KeyValues
> > through the PriorityQueue, then reducing to a single file per region
> should
> > help.  The merging of HFiles during a scan is not heavily optimized yet.
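For reference, Matt's suggestion can be issued from the HBase shell; 'mytable'
is a placeholder table name, and major compactions run asynchronously, so let
them finish before starting the benchmark:

```shell
# Trigger a major compaction of a table from the HBase shell
# (reduces each region to a single HFile once it completes).
echo "major_compact 'mytable'" | hbase shell
```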
> >
> >
> > On Tue, Apr 30, 2013 at 11:21 PM, lars hofhansl <[EMAIL PROTECTED]>
> wrote:
> >
> > > If you can, try 0.94.4+; it should significantly reduce the amount of
> > > bytes copied around in RAM during scanning, especially if you have wide
> > > rows and/or large key portions. That in turn makes scans scale better
> > > across cores, since RAM is a shared resource between cores (much like
> > > disk).
> > >
> > >
> > > It's not hard to build the latest HBase against Cloudera's version of
> > > Hadoop. I can send along a simple patch to pom.xml to do that.
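The build Lars describes generally amounts to overriding the Hadoop version
when compiling HBase; the property name and CDH version string below are
assumptions for illustration (pinning these down is exactly what his pom.xml
patch would do):

```shell
# Hypothetical sketch: rebuild HBase against a specific Hadoop version.
mvn clean install -DskipTests -Dhadoop.version=2.0.0-cdh4.2.0
```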
> > >
> > > -- Lars
> > >
> > >
> > >
> > > ________________________________
> > >  From: Bryan Keller <[EMAIL PROTECTED]>
> > > To: [EMAIL PROTECTED]
> > > Sent: Tuesday, April 30, 2013 11:02 PM
> > > Subject: Re: Poor HBase map-reduce scan performance
> > >
> > >
> > > The table has hashed keys so rows are evenly distributed amongst the
> > > regionservers, and load on each regionserver is pretty much the same. I
> > > also have per-table balancing turned on. I get mostly data local
> mappers
> > > with only a few rack local (maybe 10 of the 250 mappers).
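The hashed-key layout Bryan mentions is commonly achieved by prefixing each
row key with a short hash of itself, so that lexicographically sorted keys
scatter evenly across regions. A minimal sketch, assuming GNU md5sum is
available; salt_key and the 4-character prefix length are illustrative
choices, not details from the thread:

```shell
# Prefix a row key with the first 4 hex chars of its MD5 hash so that
# sorted keys spread across region boundaries instead of hotspotting.
salt_key() {
  key="$1"
  prefix=$(printf '%s' "$key" | md5sum | cut -c1-4)
  printf '%s-%s\n' "$prefix" "$key"
}

salt_key "user123"
```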
> > >
> > > Currently the table is a wide table schema, with lists of data
> structures
> > > stored as columns with column prefixes grouping the data structures
> (e.g.
> > > 1_name, 1_address, 1_city, 2_name, 2_address, 2_city). I was thinking
> of
> > > moving those data structures to protobuf which would cut down on the
> > number
> > > of columns. The downside is I can't filter on one value with that, but