Re: Memory setting recommendations for Accumulo / Hadoop
John Vines 2013-03-12, 17:42
2g really should be enough, so this is a bit concerning. How many nodes are you
dealing with? That could be a factor. And which version of Hadoop are you running?

On Tue, Mar 12, 2013 at 1:35 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm setting up accumulo on a small cluster where each node has 96GB of ram
> and 24 cores. Any recommendations on what memory settings to use for the
> accumulo processes, as well as what to use for the hadoop processes (e.g.
> datanode, etc)?
>
> I did a small test just to try some things standalone on a single node,
> setting the accumulo processes to 2GB of ram and the HADOOP_HEAPSIZE=2000.
> While running a map reduce job with 4 workers (each allocated 1GB of RAM),
> the datanode runs out of memory about 25% of the way into the job and dies.
> The job is basically building an index, iterating over data in one table
> and applying mutations to another - nothing too fancy.
>
> Since I'm dealing with a subset of data, I set the table split threshold
> to 128M for testing purposes, there are currently about 170 tablets so we're
> not dealing with a ton of data here. Might this low split threshold be a
> contributing factor?
>
> Should I increase the HADOOP_HEAPSIZE even further? Or will that just
> delay the inevitable OOM error?
>
> The exception we are seeing is below.
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(...):DataXceiveServer: Exiting due
> to:java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method)
>     at java.lang.Thread.start(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:133)
>     at java.lang.Thread.run(Unknown Source)
>
> Thanks for your help!
>
> Mike
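
One more thing that may be worth checking: the quoted trace says "unable to create new native thread", and that particular OutOfMemoryError is raised when the JVM cannot start an OS thread, which is not the same as running out of Java heap. A rough sketch, assuming a Linux node and Python on hand, for looking at the per-user process/thread limit and at how many threads a process currently has. The function names here are made up for illustration and are not part of Hadoop or Accumulo.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) limit on processes/threads for the current user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(limit):
    """Render a resource limit, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status (Linux only)."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' line found")


def thread_count(pid):
    """Read the current thread count of a process from /proc (assumes Linux)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print(f"max user processes: soft={describe(soft)} hard={describe(hard)}")
```

Run it as the same user that runs the datanode, and pass the datanode's pid to `thread_count` while the job is running to see how close it gets to the soft limit.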