Accumulo >> mail # dev >> memory usage & process distribution


Re: memory usage & process distribution
John,

For configuring MapReduce, do you mean adding the

mapred.local.dir
mapred.system.dir
mapred.temp.dir

properties to mapred-site.xml?

On Mon, Jul 23, 2012 at 11:33 AM, John Vines <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 23, 2012 at 11:21 AM, Miguel Pereira <[EMAIL PROTECTED]> wrote:
>
> > Hey guys,
> >
> > I want to set up a realistic production cluster on Amazon's EC2 and I am
> > trying to decide 2 things.
> >
> >
> >    -  Memory usage
> >
> > If I use one of the example configuration files, say the 512MB one, does
> > that mean that all Accumulo processes will use a total of 512MB? At least
> > this appears to be the case when looking at accumulo-env.sh.
> > This will determine whether I use a small or large instance.
> >
> >
> >
> Yes, it sets things up so that all of the Accumulo processes have a
> footprint no bigger than 512MB. Mind you, only one of the example
> configurations, the 3GB one, is set up to run in a distributed fashion. So
> if you're running multiple nodes, you can raise some of the settings for a
> larger footprint, because you won't be running every process on every node.
>
>
> >    - Process Distribution
> >
> > Is this a standard configuration? I will start off with a small number of
> > worker nodes (3-4) and hope to use my local machine as a "monitor" for the
> > Accumulo and Ganglia web UIs in order to avoid ssh -X latency.
> >
> > [ Name Node ] Name Node, Gmond
> > [ Secondary NN ] Secondary Name Node, Gmond
> > [ Job Tracker ] JobTracker, Gmond
> > [ Zookeeper ] Zookeeper
> > [ Accumulo Master ] Master, Tracer, Garbage Collector, Gmond, Jmxtrans
> > [ Monitor ] Monitor, Gmetad, Gweb
> > [ Worker Node ] DataNode, TaskTracker, TabletServer, Logger, Gmond, Jmxtrans
> >
> That looks good to me. Just make sure you configure your MapReduce so
> that child memory * (reduce slots + map slots) isn't enough to cause
> swapping.
>
> >
> > Thanks,
> >
> > Miguel
> >
>
> John
>
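
John's rule of thumb above, that child memory * (reduce slots + map slots) must not be enough to cause swapping, can be checked with simple arithmetic. The following is a minimal sketch, not anything from the thread: the function names and all the numbers are hypothetical. In Hadoop 1.x the inputs would typically come from the -Xmx value in mapred.child.java.opts and from mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, but the thread does not name those properties, so treat that mapping as an assumption. Note also that a JVM uses somewhat more memory than its maximum heap, so the result is an optimistic estimate.

```python
def mapreduce_worst_case_mb(child_heap_mb, map_slots, reduce_slots):
    """Heap used on one worker node if every task slot is busy at once."""
    return child_heap_mb * (map_slots + reduce_slots)


def fits_without_swapping(node_ram_mb, child_heap_mb, map_slots, reduce_slots,
                          other_processes_mb):
    """True if task JVMs plus everything else on the node fit in physical RAM.

    other_processes_mb is a hypothetical budget for the DataNode,
    TaskTracker, TabletServer, Logger, and the operating system.
    """
    needed = mapreduce_worst_case_mb(child_heap_mb, map_slots, reduce_slots)
    return needed + other_processes_mb <= node_ram_mb


if __name__ == "__main__":
    # Hypothetical worker node: 8 GB RAM, 3 GB reserved for non-task processes.
    for child_heap_mb in (512, 1024):
        ok = fits_without_swapping(
            node_ram_mb=8192,
            child_heap_mb=child_heap_mb,
            map_slots=4,
            reduce_slots=2,
            other_processes_mb=3072,
        )
        print(f"child heap {child_heap_mb} MB: fits in RAM = {ok}")
```

With these made-up numbers, a 512MB child heap leaves headroom (3072MB for tasks plus 3072MB for everything else), while a 1024MB child heap would need 9216MB and could push the node into swap.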