Accumulo user mailing list - Memory setting recommendations for Accumulo / Hadoop


Re: Memory setting recommendations for Accumulo / Hadoop
Mike Hugo 2013-03-12, 19:22
Many thanks, Krishmin!

I had set "nofile" but not "nproc" - the references you sent were helpful.
 I had to increase the value in /etc/security/limits.d/90-nproc.conf and
now we're up and running!
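
For reference, a limits.d entry of roughly this shape is what raises the
process cap (the username and values below are illustrative, not the exact
ones from this cluster):

    # /etc/security/limits.d/90-nproc.conf -- illustrative values
    usernamehere    soft    nproc    16384
    usernamehere    hard    nproc    16384

    # after logging back in, confirm the effective limits for that user
    ulimit -u    # max user processes
    ulimit -n    # max open files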

Thanks again,

Mike
On Tue, Mar 12, 2013 at 1:21 PM, Krishmin Rai <[EMAIL PROTECTED]> wrote:

> Have you also increased the maximum number of processes ("nproc" in the
> same file)? I have definitely seen this kind of error as a result of an
> insufficiently large process limit.
>
> Some more details, maybe, on these pages:
>
> http://ww2.cs.fsu.edu/~czhang/errors.html
>
> http://incubator.apache.org/ambari/1.2.0/installing-hadoop-using-ambari/content/ambari-chap5-3-1.html
>
> -Krishmin
>
>
> On Mar 12, 2013, at 1:52 PM, Mike Hugo wrote:
>
> Eventually it will be 4 nodes, this particular test was running on a
> single node
>
> hadoop version is 1.0.4
>
> we already upped the limits in /etc/security/limits.conf to:
>
> usernamehere    hard    nofile           16384
>
> Mike
>
>
> On Tue, Mar 12, 2013 at 12:49 PM, Krishmin Rai <[EMAIL PROTECTED]> wrote:
>
>> Hi Mike,
>>   This could be related to the maximum number of processes or files
>> allowed for your Linux user. You might try bumping these values up (e.g. via
>> /etc/security/limits.conf).
>>
>> -Krishmin
>>
>> On Mar 12, 2013, at 1:35 PM, Mike Hugo wrote:
>>
>> > Hello,
>> >
>> > I'm setting up Accumulo on a small cluster where each node has 96GB of
>> RAM and 24 cores.  Any recommendations on what memory settings to use for
>> the Accumulo processes, as well as what to use for the Hadoop processes
>> (e.g. datanode, etc.)?
>> >
>> > I did a small test just to try some things standalone on a single node,
>> setting the Accumulo processes to 2GB of RAM and HADOOP_HEAPSIZE=2000.
>>  While running a map reduce job with 4 workers (each allocated 1GB of RAM),
>> the datanode runs out of memory about 25% of the way into the job and dies.
>>  The job is basically building an index, iterating over data in one table
>> and applying mutations to another - nothing too fancy.
>> >
>> > Since I'm dealing with a subset of data, I set the table split
>> threshold to 128M for testing purposes; there are currently about 170
>> tablets, so we're not dealing with a ton of data here. Might this low split
>> threshold be a contributing factor?
>> >
>> > Should I increase the HADOOP_HEAPSIZE even further?  Or will that just
>> delay the inevitable OOM error?
>> >
>> > The exception we are seeing is below.
>> >
>> > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(...):DataXceiveServer: Exiting due
>> to:java.lang.OutOfMemoryError: unable to create new native thread
>> >         at java.lang.Thread.start0(Native Method)
>> >         at java.lang.Thread.start(Unknown Source)
>> >         at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:133)
>> >         at java.lang.Thread.run(Unknown Source)
>> >
>> >
>> > Thanks for your help!
>> >
>> > Mike
>>
>>
>
>
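
The "unable to create new native thread" OutOfMemoryError quoted above is
thrown when the JVM cannot start another OS-level thread, which usually points
at a process/thread limit (nproc) or open-file limit rather than heap
exhaustion, so raising HADOOP_HEAPSIZE alone would not help. A minimal shell
sketch for checking the limits a running datanode actually has, assuming a
Linux host with the JDK's jps on the PATH:

    # find the DataNode PID and inspect its effective limits
    DN_PID=$(jps | awk '/DataNode/ {print $1}')
    grep -E 'processes|open files' /proc/$DN_PID/limits

    # and the limits for the current shell user
    ulimit -u    # max user processes
    ulimit -n    # max open files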