Re: memory usage & process distribution
Miguel Pereira 2012-07-23, 17:19
By configuring MapReduce, do you mean adding the
properties to mapred-site.xml?
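
For reference, here is a rough sketch of the settings I think are meant. It assumes a Hadoop 1.x cluster, where the per-task heap and the slot counts are controlled by mapred.child.java.opts, mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. The helper name mapred_properties and the numbers are placeholders of mine, not values from this thread.

```python
import xml.etree.ElementTree as ET


def mapred_properties(child_heap_mb, map_slots, reduce_slots):
    """Build a <configuration> element holding three Hadoop 1.x MapReduce properties."""
    settings = {
        # Heap given to each map or reduce child JVM.
        "mapred.child.java.opts": f"-Xmx{child_heap_mb}m",
        # Maximum concurrent map tasks per TaskTracker.
        "mapred.tasktracker.map.tasks.maximum": str(map_slots),
        # Maximum concurrent reduce tasks per TaskTracker.
        "mapred.tasktracker.reduce.tasks.maximum": str(reduce_slots),
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


# Placeholder values for illustration only, not a recommendation.
config = mapred_properties(child_heap_mb=200, map_slots=2, reduce_slots=1)
rendered = ET.tostring(config, encoding="unicode")
print(rendered)
```

The printed <property> elements are what would go inside the <configuration> block of mapred-site.xml on each TaskTracker node, if that is indeed the right file.
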
On Mon, Jul 23, 2012 at 11:33 AM, John Vines <[EMAIL PROTECTED]> wrote:
> On Mon, Jul 23, 2012 at 11:21 AM, Miguel Pereira
> <[EMAIL PROTECTED]> wrote:
> > Hey guys,
> > I want to set up a realistic production cluster on Amazon's EC2 and I am
> > trying to decide 2 things.
> > - Memory usage
> > If I use one of the example configuration files, say the 512MB one, does
> > that mean that all Accumulo processes will use up a total of 512MB? At
> > least this appears to be the case when looking at accumulo-env.sh.
> > This will determine whether I use a small or large instance.
> Yes, it sets it up so all of the Accumulo processes have a footprint no
> bigger than 512MB. Mind you, we only have one configuration that is set up
> for things in a distributed fashion, which is 3GB. So if you're running
> multiple nodes, you can up some of the configurations for a larger
> footprint because you won't be running every process on every node.
> > - Process Distribution
> > Is this a standard configuration? I will start off with a small # of
> > nodes ( 3-4 ) & hope to use my local machine as a "monitor" for the
> > Accumulo & Ganglia web UIs in order to avoid ssh -X latency.
> > [ Name Node ] Name Node, Gmond
> > [ Secondary NN ] Secondary Name Node, Gmond
> > [ Job Tracker ] JobTracker, Gmond
> > [ Zookeeper ] Zookeeper
> > [ Accumulo Master ] Master, Tracer, Garbage Collector, Gmond, Jmxtrans
> > [ Monitor ] Monitor, Gmetad, Gweb
> > [ Worker Node ] DataNode, Tasktracker, TabletServer, Logger, Gmond,
> > Jmxtrans
> That looks good to me. Just make sure you configure your map reduce so
> that child memory * (reduce slots + map slots) isn't enough to cause
> > Thanks,
> > Miguel
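
The quoted advice about child memory * (reduce slots + map slots) is cut off in the archive. Assuming the concern is overcommitting a worker node's RAM, the arithmetic can be sketched as below. The function names, the node sizes and the other_mb figures (memory reserved for Accumulo, the DataNode, the TaskTracker and the OS) are hypothetical numbers chosen for illustration, not measurements.

```python
def mapreduce_heap_mb(child_heap_mb, map_slots, reduce_slots):
    """Worst-case MapReduce heap on one node: every slot busy at once."""
    return child_heap_mb * (map_slots + reduce_slots)


def fits_in_ram(node_ram_mb, child_heap_mb, map_slots, reduce_slots, other_mb):
    """Return (fits, needed_mb), where other_mb covers everything that is not a task JVM."""
    needed_mb = mapreduce_heap_mb(child_heap_mb, map_slots, reduce_slots) + other_mb
    return needed_mb <= node_ram_mb, needed_mb


# Hypothetical larger worker: 8 GB RAM, 512 MB children, 4 map + 2 reduce slots,
# 3 GB assumed for Accumulo plus 1 GB assumed for Hadoop daemons and the OS.
large_ok, large_needed = fits_in_ram(8192, 512, 4, 2, other_mb=3072 + 1024)

# Hypothetical smaller worker: 2 GB RAM, 200 MB children, 2 map + 1 reduce slots,
# 512 MB assumed for Accumulo plus 1 GB assumed for Hadoop daemons and the OS.
small_ok, small_needed = fits_in_ram(2048, 200, 2, 1, other_mb=512 + 1024)

print(large_ok, large_needed)
print(small_ok, small_needed)
```

With these made-up numbers the larger node needs 7168 MB and fits, while the smaller one needs 2136 MB and does not, so either the child heap or the slot count would have to come down there.
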