Re: mslab enabled jvm crash
Wayne, we get CMS failures too; I am pretty sure they are
fragmentation related:

2011-05-26T09:20:00.304-0700: 206371.599: [GC 206371.599: [ParNew (promotion failed): 76633K->76023K(76672K), 0.0924180 secs]206371.692: [CMS: 11452308K->7142504K(12202816K), 13.5870310 secs] 11525447K->7142504K(12279488K), [CMS Perm : 18254K->18254K(30436K)] icms_dc=0 , 13.6796820 secs] [Times: user=13.17 sys=0.64, real=13.68 secs]
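
To make the numbers in that entry easier to read, here is a minimal Python sketch that pulls the tenured-generation figures out of the `[CMS: ...]` section. The regular expression and the `summarize_cms` helper are illustrative assumptions written for this one log line, not part of any HBase or JVM tooling.

```python
import re

# The promotion-failure entry quoted above, rejoined onto one line.
LOG_LINE = (
    "2011-05-26T09:20:00.304-0700: 206371.599: [GC 206371.599: [ParNew "
    "(promotion failed): 76633K->76023K(76672K), 0.0924180 secs]206371.692: "
    "[CMS: 11452308K->7142504K(12202816K), 13.5870310 secs] "
    "11525447K->7142504K(12279488K), [CMS Perm : 18254K->18254K(30436K)] "
    "icms_dc=0 , 13.6796820 secs] [Times: user=13.17 sys=0.64, real=13.68 secs]"
)

# Matches "[CMS: <before>K-><after>K(<capacity>K), <secs> secs]".
# "[CMS Perm : ...]" is not matched because the colon must follow "CMS" directly.
CMS_RE = re.compile(r"\[CMS: (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_cms(line):
    """Return tenured-generation occupancy and pause time for one log entry."""
    match = CMS_RE.search(line)
    if match is None:
        raise ValueError("no [CMS: ...] section found in log line")
    before_kb, after_kb, capacity_kb = (int(g) for g in match.groups()[:3])
    return {
        "promotion_failed": "promotion failed" in line,
        "tenured_before_pct": 100.0 * before_kb / capacity_kb,
        "tenured_after_pct": 100.0 * after_kb / capacity_kb,
        "freed_mb": (before_kb - after_kb) / 1024.0,
        "cms_pause_secs": float(match.group(4)),
    }


if __name__ == "__main__":
    for key, value in summarize_cms(LOG_LINE).items():
        print(key, value)
```

For this entry the tenured generation was roughly 94% full when the promotion failed, and the 13.6 second collection brought it down to roughly 59%, freeing about 4.1 GB.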

The RS does not go away when this happens. If your disks are not
overloaded, you should consider flushing sooner and deeper, i.e. flush
larger chunks of memory and offload the load to the disks. That way
you will free up more memstore cache, and promotion from YG to
tenured has a better chance of succeeding without a CMS failure.
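
As a rough illustration of the headroom involved, the Python sketch below works through the arithmetic for the settings quoted further down (8g heap, 50% for memstore, CMSInitiatingOccupancyFraction=65). The young generation size is not confirmed anywhere in this thread, so the 256 MB used here is only the cap Stack asks about, and the `cms_headroom_mb` helper is a hypothetical name. It also assumes, as a simplification, that a full memstore sits entirely in the tenured generation.

```python
def cms_headroom_mb(heap_mb, memstore_fraction, cms_initiating_pct, young_gen_mb):
    """Estimate tenured space left for non-memstore data before CMS starts.

    Simplifying assumption: a completely full memstore lives in tenured.
    """
    tenured_mb = heap_mb - young_gen_mb
    # CMSInitiatingOccupancyFraction is a percentage of the tenured generation.
    cms_trigger_mb = tenured_mb * cms_initiating_pct / 100.0
    memstore_max_mb = heap_mb * memstore_fraction
    return {
        "tenured_mb": tenured_mb,
        "cms_trigger_mb": cms_trigger_mb,
        "memstore_max_mb": memstore_max_mb,
        "headroom_mb": cms_trigger_mb - memstore_max_mb,
    }


# young_gen_mb=256 is an assumed value, not a setting confirmed in the thread.
budget = cms_headroom_mb(
    heap_mb=8192, memstore_fraction=0.5, cms_initiating_pct=65, young_gen_mb=256
)

if __name__ == "__main__":
    for key, value in budget.items():
        print(key, value)
```

Under these assumptions a full memstore leaves only about 1 GB below the CMS trigger point for everything else in tenured, so keeping memstore occupancy lower by flushing earlier widens that margin.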

-Jack.

PS. Wayne, are you on IM? I am "jacklevin74" on both AIM and Skype; let's chat.

On Thu, May 26, 2011 at 10:42 AM, Wayne <[EMAIL PROTECTED]> wrote:
> I left parnew alone (did not add any settings). I also did not increase the
> heap. 8g with 50% for memstore. Below are the JVM settings.
>
> The errors I pasted occurred after running for only maybe 12 hours. The
> cluster as a whole has been running for 24 hours without dropping a node, but
> short-lived CMFs are occurring.
>
> Any recommendations?
>
> export HBASE_OPTS="-XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=65 -XX:+CMSParallelRemarkEnabled
> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC"
>
> Thanks.
>
>
> On Thu, May 26, 2011 at 1:30 PM, Stack <[EMAIL PROTECTED]> wrote:
>
>> On Thu, May 26, 2011 at 9:00 AM, Wayne <[EMAIL PROTECTED]> wrote:
>> > Looking more closely I can see that we are still
>> > getting Concurrent Mode Failures on some of the nodes but they are only
>> > lasting for 10s so the nodes don't go away. Is this considered "normal"?
>> > With CMSInitiatingOccupancyFraction=65 I would suspect this is not
>> > normal??
>> >
>>
>> What configs. are you running with now?  It looks like you either left
>> parnew as unbounded or else you set it to 256M max?   So you did not
>> change the parnew size?  Did you up your heap size?  I see you are
>> getting 'promotion failed'/'concurrent mode failure'.  How long has it
>> been running now?
>>
>> St.Ack
>>
>