Chukwa >> mail # user >> WaitingQueue - MemLimitQueue is full


Logan Hardy 2012-11-10, 22:17
Re: WaitingQueue - MemLimitQueue is full
Hi Logan,

It looks like the datanode is saturated when a large MapReduce job is in
progress.  The Chukwa agent will drop data on the floor if there is more data
than the agent can buffer in memory.  Are the collectors running on the
datanodes?  Do you have multiple disks per datanode?  It may be good to
map the number of disks to (task slots - 1) and let the Chukwa collector
write to a disk that is not used concurrently by MapReduce tasks, to provide
good performance for both data ingestion and data processing.
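
To illustrate the buffering behaviour described above, here is a minimal
sketch of an in-memory queue with a byte budget. It is not Chukwa's actual
MemLimitQueue code: the class name BoundedByteQueue and its methods are
hypothetical, and whether the real agent drops or blocks when the queue is
full should be checked against the Chukwa source for your version.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of a byte-bounded queue; not Chukwa's MemLimitQueue. */
public class BoundedByteQueue {
    private final Queue<byte[]> chunks = new ArrayDeque<byte[]>();
    private final long maxBytes;     // total bytes the queue may hold
    private long usedBytes = 0;      // bytes currently buffered
    private long droppedChunks = 0;  // chunks rejected because the queue was full

    public BoundedByteQueue(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Buffers the chunk if it fits in the byte budget; otherwise counts it as dropped. */
    public synchronized boolean offer(byte[] chunk) {
        if (usedBytes + chunk.length > maxBytes) {
            droppedChunks++;
            return false;
        }
        chunks.add(chunk);
        usedBytes += chunk.length;
        return true;
    }

    /** Removes the oldest chunk and frees its bytes; returns null if the queue is empty. */
    public synchronized byte[] poll() {
        byte[] chunk = chunks.poll();
        if (chunk != null) {
            usedBytes -= chunk.length;
        }
        return chunk;
    }

    public synchronized long usedBytes() {
        return usedBytes;
    }

    public synchronized long droppedChunks() {
        return droppedChunks;
    }
}
```

If the consumer (here, the sender to the collector) calls poll() more slowly
than producers call offer(), usedBytes stays near maxBytes and offers start
to fail, which is the kind of back-pressure a saturated datanode would cause.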

regards,
Eric

On Sat, Nov 10, 2012 at 2:17 PM, Logan Hardy <[EMAIL PROTECTED]> wrote:

> We are running CentOS 5.4, Chukwa 0.3.0, java version "1.6.0_17", and are
> feeding a steady stream of data into our CDH3u3 Hadoop cluster. We have 6
> Chukwa agent machines feeding 3 Chukwa collectors. Any time the cluster
> gets busy with a big job or the task of decommissioning a node the Chukwa
> agent and collector start to back up and I start seeing "WaitingQueue -
> MemLimitQueue is full" messages in the agent.log as shown below. As soon as
> hadoop cluster activity dies down the MemLimitQueue messages go away and
> everything goes back to normal.
>
> [root@COLL5 chukwa]# ps auxf | grep chukwa
> root     11258  0.0  0.0  61172   732 pts/0    S+   15:15   0:00
>  \_ grep chukwa
> root     29248  1.2  2.1 415572 86928 ?        Sl   04:03   8:04
> /usr/java/default/bin/java -Xms32M -Xmx64M -DAPP=agent
> -Dlog4j.configuration=chukwa-log4j.properties
> -DCHUKWA_HOME=/usr/local/chukwa/bin/..
> -DCHUKWA_CONF_DIR=/usr/local/chukwa/bin/../conf
> -DCHUKWA_LOG_DIR=/usr/local/chukwa/logs -classpath
> /usr/local/chukwa/bin/../conf::/usr/local/chukwa/bin/../chukwa-agent-0.3.0.jar:/usr/local/chukwa/bin/../chukwa-core-0.3.0.jar:/usr/local/chukwa/bin/../hadoopjars/hadoop-0.20.0-core.jar:/usr/local/chukwa/bin/../lib/NagiosAppender-1.5.0.jar:/usr/local/chukwa/bin/../lib/ant-1.7.1.jar:/usr/local/chukwa/bin/../lib/ant-launcher-1.7.1.jar:/usr/local/chukwa/bin/../lib/asm-3.1.jar:/usr/local/chukwa/bin/../lib/commons-beanutils-1.8.0.jar:/usr/local/chukwa/bin/../lib/commons-cli-2.0-SNAPSHOT.jar:/usr/local/chukwa/bin/../lib/commons-codec-1.3.jar:/usr/local/chukwa/bin/../lib/commons-collections-3.1.jar:/usr/local/chukwa/bin/../lib/commons-fileupload-1.2.jar:/usr/local/chukwa/bin/../lib/commons-httpclient-3.0.1.jar:/usr/local/chukwa/bin/../lib/commons-io-1.4.jar:/usr/local/chukwa/bin/../lib/commons-lang-2.4.jar:/usr/local/chukwa/bin/../lib/commons-logging-1.1.1.jar:/usr/local/chukwa/bin/../lib/commons-logging-api-1.0.4.jar:/usr/local/chukwa/bin/../lib/commons-net-1.4.1.jar:/usr/local/chukwa/bin/../lib/core-3.1.1.jar:/usr/local/chukwa/bin/../lib/ezmorph-1.0.6.jar:/usr/local/chukwa/bin/../lib/jchronic-0.2.3.jar:/usr/local/chukwa/bin/../lib/jersey-bundle-1.1.0-ea.jar:/usr/local/chukwa/bin/../lib/jetty-6.1.11.jar:/usr/local/chukwa/bin/../lib/jetty-util-6.1.11.jar:/usr/local/chukwa/bin/../lib/json-lib-2.2.3-jdk15.jar:/usr/local/chukwa/bin/../lib/json.jar:/usr/local/chukwa/bin/../lib/jsp-2.1-6.1.11.jar:/usr/local/chukwa/bin/../lib/jsp-api-2.1-6.1.11.jar:/usr/local/chukwa/bin/../lib/jsr311-api-1.0.jar:/usr/local/chukwa/bin/../lib/junit-3.8.1.jar:/usr/local/chukwa/bin/../lib/log4j-1.2.13.jar:/usr/local/chukwa/bin/../lib/mysql-connector-java-5.1.6.jar:/usr/local/chukwa/bin/../lib/prefuse.jar:/usr/local/chukwa/bin/../lib/servlet-api-2.5-6.1.11.jar
> org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
>
>
> agent.log
> ........
> 2012-11-10 14:56:14,470 INFO Timer-0 ChukwaAgent - writing checkpoint 7257
> 2012-11-10 14:56:18,655 INFO Timer-1 HttpConnector - # http chunks ACK'ed
> since last report: 547
> 2012-11-10 14:56:20,163 INFO HTTP post thread ChukwaHttpSender - >>>>>>
> HTTP Got success back from http://10.5.200.204:8080/chukwa; response
> length 832
> 2012-11-10 14:56:20,163 INFO HTTP post thread HttpConnector - sent 13
> chunks, got back 13 acks
> 2012-11-10 14:56:20,163 INFO HTTP post thread ChukwaHttpSender - collected
> 13 chunks
> *2012-11-10 14:56:20,163 INFO Thread-6 WaitingQueue - MemLimitQueue is
Logan Hardy 2012-11-11, 17:49