HBase >> mail # user >> Poor HBase map-reduce scan performance


Re: Poor HBase map-reduce scan performance
This happens when your Java process is running in debug mode with the
suspend='y' option selected.
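For reference, the debug setting being described is the standard JDWP agent option; a minimal sketch (the port and the use of HADOOP_OPTS are illustrative assumptions, not taken from the original poster's setup) looks like this. With suspend=y the JVM blocks at startup until a debugger attaches, which is when tools like jps can fail to synchronize with it:

```shell
# Generic JVM debug options; suspend=y makes the process wait for a
# debugger to attach before running any application code.
export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
echo "$HADOOP_OPTS"
```

Setting suspend=n instead lets the process start normally while still allowing a debugger to attach later.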

Regards
Ram
On Wed, May 1, 2013 at 12:55 PM, Naidu MS <[EMAIL PROTECTED]> wrote:

> Hi, I have two questions regarding HDFS and the jps utility.
>
> I am new to Hadoop and started learning it this past week.
>
> 1. Whenever I run start-all.sh and then jps in the console, it shows the
> processes that started:
>
> naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
> 22283 NameNode
> 23516 TaskTracker
> 26711 Jps
> 22541 DataNode
> 23255 JobTracker
> 22813 SecondaryNameNode
> Could not synchronize with target
>
> But along with the list of processes started, it always shows "Could not
> synchronize with target" in the jps output. What does "Could not
> synchronize with target" mean? Can someone explain why this is happening?
>
>
> 2. Is it possible to format the namenode multiple times? When I enter the
> namenode -format command, it does not format the namenode and shows the
> following output:
>
> naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
> Warning: $HADOOP_HOME is deprecated.
>
> 13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
> /*************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = naidu/127.0.0.1
> STARTUP_MSG:   args = [-format]
> STARTUP_MSG:   version = 1.0.4
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
> *************************************************************/
> Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
> Format aborted in /home/naidu/dfs/namenode
> 13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
> /*************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
>
> *************************************************************/
>
> Can someone help me understand this? Why is it not possible to format the
> namenode multiple times?
>
>
> On Wed, May 1, 2013 at 12:22 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>
> > Not that it's a long-term solution, but try major-compacting before running
> > the benchmark.  If the LSM tree is CPU bound in merging HFiles/KeyValues
> > through the PriorityQueue, then reducing to a single file per region should
> > help.  The merging of HFiles during a scan is not heavily optimized yet.
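The major compaction Matt suggests can be triggered from the HBase shell; a minimal sketch, where 'mytable' is a placeholder table name (the thread does not name the actual table):

```shell
# Trigger a major compaction, which merges all HFiles in each region of
# the table into a single file. 'mytable' is a hypothetical table name.
echo "major_compact 'mytable'" | hbase shell
```

Major compaction runs in the background on the regionservers, so it may be worth waiting for it to finish (e.g. by watching the region file counts in the HBase web UI) before re-running the benchmark.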
> >
> >
> > On Tue, Apr 30, 2013 at 11:21 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> > > If you can, try 0.94.4+; it should significantly reduce the amount of
> > > bytes copied around in RAM during scanning, especially if you have wide
> > > rows and/or large key portions. That in turn makes scans scale better
> > > across cores, since RAM is a shared resource between cores (much like
> > > disk).
> > >
> > >
> > > It's not hard to build the latest HBase against Cloudera's version of
> > > Hadoop. I can send along a simple patch to pom.xml to do that.
> > >
> > > -- Lars
> > >
> > >
> > >
> > > ________________________________
> > >  From: Bryan Keller <[EMAIL PROTECTED]>
> > > To: [EMAIL PROTECTED]
> > > Sent: Tuesday, April 30, 2013 11:02 PM
> > > Subject: Re: Poor HBase map-reduce scan performance
> > >
> > >
> > > The table has hashed keys so rows are evenly distributed amongst the
> > > regionservers, and load on each regionserver is pretty much the same. I
> > > also have per-table balancing turned on. I get mostly data-local mappers
> > > with only a few rack-local (maybe 10 of the 250 mappers).
> > >
> > > Currently the table is a wide table schema, with lists of data structures
> > > stored as columns with column prefixes grouping the data structures (e.g.
> > > 1_name, 1_address, 1_city, 2_name, 2_address, 2_city). I was thinking of
> > > moving those data structures to protobuf, which would cut down on the
> > > number of columns. The downside is I can't filter on one value with that, but