Accumulo >> mail # dev >> memory usage & process distribution

memory usage & process distribution
Hey guys,

I want to set up a realistic production cluster on Amazon's EC2, and I am
trying to decide two things.

- Memory usage

If I use one of the example configuration files, say the 512MB one, does
that mean that all Accumulo processes combined will use a total of 512MB?
That at least appears to be the case from looking at accumulo-env.sh.
This will determine whether I use a small or large instance.
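For what it's worth, the example configs split one overall budget across the
per-process JVM heaps in accumulo-env.sh. The -Xmx/-Xms values below are
illustrative placeholders, not copied from the actual 512MB example; the point
is that the 512MB figure is a combined budget, not per process (and each JVM
also uses some memory beyond its heap, so leave headroom):

```shell
# Sketch in the style of conf/examples/512MB/standalone/accumulo-env.sh.
# Heap sizes here are placeholders -- check the real example file.
export ACCUMULO_TSERVER_OPTS="-Xmx128m -Xms128m"   # tablet server heap
export ACCUMULO_MASTER_OPTS="-Xmx128m -Xms128m"    # master heap
export ACCUMULO_MONITOR_OPTS="-Xmx64m -Xms64m"     # monitor heap
export ACCUMULO_GC_OPTS="-Xmx64m -Xms64m"          # garbage collector heap
export ACCUMULO_LOGGER_OPTS="-Xmx128m -Xms128m"    # logger heap
export ACCUMULO_OTHER_OPTS="-Xmx128m -Xms128m"     # shell, init, etc.
```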
- Process Distribution

Is this a standard configuration? I will start off with a small number of
worker nodes (3-4) and hope to use my local machine as a "monitor" for the
Accumulo and Ganglia web UIs, in order to avoid ssh -X latency.

[ Name Node ] Name Node, Gmond
[ Secondary NN ] Secondary Name Node, Gmond
[ Job Tracker ] JobTracker, Gmond
[ Zookeeper ] Zookeeper
[ Accumulo Master ] Master, Tracer, Garbage Collector, Gmond, Jmxtrans
[ Monitor ] Monitor, Gmetad, Gweb
[ Worker Node ] DataNode, Tasktracker, TabletServer, Logger, Gmond, Jmxtrans
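In case it helps, the Accumulo side of that role-to-host mapping is mostly
expressed through the host files in conf/, which start-all.sh reads to decide
where to launch each process. A sketch matching the layout above (hostnames
are placeholders I made up, not anything from your setup):

```shell
# Populate Accumulo's conf/ host files; start-all.sh launches each role
# on the hosts listed in the corresponding file.
mkdir -p conf
echo "accumulo-master" > conf/masters   # Master
echo "accumulo-master" > conf/gc        # Garbage Collector
echo "accumulo-master" > conf/tracers   # Tracer
echo "monitor-node"    > conf/monitor   # Monitor (web UI)
# TabletServer and Logger run on every host listed in slaves:
cat > conf/slaves <<EOF
worker1
worker2
worker3
EOF
```

Gmond/Gmetad/Jmxtrans placement is configured separately on the Ganglia side, of course.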