State of Art in Hadoop Log aggregation
Hi Guys,

We have a fairly decent-sized Hadoop cluster of about 200 nodes, and I was
wondering what the state of the art is for aggregating and visualizing
Hadoop ecosystem logs, particularly:

   1. Tasktracker logs
   2. Datanode logs
   3. HBase RegionServer logs

One way is to run something like Flume on each node to aggregate the logs
and then use something like Kibana -
http://www.elasticsearch.org/overview/kibana/ to visualize the logs and
make them searchable.
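
For illustration, here is a minimal sketch of the parsing step such a pipeline would need before the logs become searchable by field. It assumes the daemons write lines in the layout "date time,millis LEVEL logger: message" (the log4j pattern `%d{ISO8601} %p %c: %m%n`); if your log4j.properties uses a different pattern, the regular expression has to change. The function name `parse_events`, the `host` and `daemon` fields, and the sample lines are all made up for this example and are not part of Flume, Kibana, or Hadoop.

```python
import json
import re

# Assumed layout (log4j "%d{ISO8601} %p %c: %m%n"):
#   2013-10-11 13:54:02,123 INFO some.logger.Name: message text
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +(?P<logger>\S+): (?P<message>.*)$"
)


def parse_events(lines, host, daemon):
    """Yield one dict per log event.

    Lines that do not start a new event (for example stack trace lines)
    are appended to the message of the event before them.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            if current is not None:
                yield current
            current = match.groupdict()
            # Hypothetical extra fields so events can be filtered per node.
            current["host"] = host
            current["daemon"] = daemon
        elif current is not None:
            current["message"] += "\n" + line
    if current is not None:
        yield current


if __name__ == "__main__":
    # Made-up sample input, not real cluster output.
    sample = [
        "2013-10-11 13:54:02,123 INFO example.Daemon: started\n",
        "2013-10-11 13:54:03,456 ERROR example.Daemon: request failed\n",
        "java.io.IOException: example\n",
    ]
    for event in parse_events(sample, host="node-017", daemon="datanode"):
        print(json.dumps(event, sort_keys=True))
```

Each resulting dict could be shipped as one JSON document to whatever store backs the search UI. Folding continuation lines into the preceding event keeps a stack trace together with the error that produced it.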

However, I don't want to write another ETL for the Hadoop/HBase logs
themselves. We currently log in to each machine individually and run
'tail -F logs' when there is a Hadoop problem on a particular node.

We want a better, centralized way to look at the Hadoop logs when there is
an issue, without having to log in to 100 different machines, and I was
wondering what the state of the art is in this regard.

Suggestions/Pointers are very welcome!!

Sagar
Alexander Alten-Lorenz 2013-10-11, 13:54
DSuiter RDX 2013-10-11, 14:05
Raymond Tay 2013-10-11, 05:38
Pradeep Gollakota 2013-10-11, 06:19