MapReduce >> mail # user >> State of Art in Hadoop Log aggregation

Re: State of Art in Hadoop Log aggregation
There are plenty of log aggregation tools, both open source and commercial
off-the-shelf. Here are some:

My personal recommendation is LogStash.
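
Whichever shipper you pick, the step that makes the logs searchable is turning each line into structured fields before it is indexed. Here is a minimal Python sketch of that step, not any tool's actual API. It assumes the daemons log with the layout "%d{ISO8601} %p %c: %m" (check the log4j.properties on your cluster); the names LINE_RE and parse_line and the host label "node001" are made up for illustration.

```python
import re
from datetime import datetime

# Assumed line layout (verify against your log4j.properties):
#   2013-10-10 22:38:01,123 INFO org.apache.hadoop.mapred.TaskTracker: message
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +"
    r"(?P<logger>[^\s:]+): "
    r"(?P<message>.*)$"
)


def parse_line(line, host):
    """Return a dict of fields for one log line, or None if it does not match.

    Lines that do not match (for example stack-trace continuation lines)
    return None so the caller can decide how to attach them to the
    previous event.
    """
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    event = match.groupdict()
    # Normalize the log4j timestamp (comma before milliseconds) to ISO 8601.
    parsed = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    event["timestamp"] = parsed.isoformat()
    # Tag the event with the node it came from so it can be filtered centrally.
    event["host"] = host
    return event


if __name__ == "__main__":
    sample = ("2013-10-10 22:38:01,123 INFO "
              "org.apache.hadoop.mapred.TaskTracker: Starting tasktracker")
    print(parse_line(sample, "node001"))
```

With the host, level, and logger as separate fields, a central index can answer "show me ERROR lines from the DataNode on one node" without logging in to that node.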
On Thu, Oct 10, 2013 at 10:38 PM, Raymond Tay <[EMAIL PROTECTED]> wrote:

> You can try Chukwa, which is one of the incubating projects under Apache.
> I tried it before and liked it for aggregating logs.
> On 11 Oct, 2013, at 1:36 PM, Sagar Mehta <[EMAIL PROTECTED]> wrote:
> Hi Guys,
> We have a fairly decent-sized Hadoop cluster of about 200 nodes, and I was
> wondering what the state of the art is if I want to aggregate and visualize
> Hadoop ecosystem logs, particularly
>    1. Tasktracker logs
>    2. Datanode logs
>    3. HBase RegionServer logs
> One way is to use something like Flume on each node to aggregate the
> logs and then use something like Kibana -
> http://www.elasticsearch.org/overview/kibana/ to visualize the logs and
> make them searchable.
> However, I don't want to write another ETL for the Hadoop/HBase logs
> themselves. We currently log in to each machine individually to 'tail -F
> logs' when there is a Hadoop problem on a particular node.
> We want a better way to look at the Hadoop logs themselves in a
> centralized way when there is an issue, without having to log in to 100
> different machines, and I was wondering what the state of the art is in this
> regard.
> Suggestions/Pointers are very welcome!!
> Sagar