Archive task logs (stdout, stderr, syslog) and JobTracker logs of a Hadoop cluster for later analysis
I need to collect log data from our cluster.
For this I think I need to copy the contents of the following directories:
* JobTracker: /var/log/hadoop-0.20-mapreduce/history/
* TaskTracker: /var/log/hadoop-0.20-mapreduce/userlogs/
The copy should also follow symlinks and recurse into subdirectories.
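For a simple one-shot archive (without Flume), the requirements above can be sketched with `cp -rL`, which recurses (`-r`) and dereferences symlinks (`-L`) so the archive contains real file contents rather than links. The helper name and the fixture paths below are my own illustration, not anything from your cluster:

```shell
#!/bin/sh
set -e
# Sketch: copy a log directory tree recursively, following symlinks.
# archive_logs is a hypothetical helper; the real source would be e.g.
# /var/log/hadoop-0.20-mapreduce/userlogs/
archive_logs() {
    src="$1"; dest="$2"
    mkdir -p "$dest"
    # -r: recurse into subdirectories
    # -L: dereference symlinks, so the copy holds the actual files
    cp -rL "$src/." "$dest/"
}

# Demo on a small fixture standing in for the real log directories
fixture=$(mktemp -d)
mkdir -p "$fixture/userlogs/attempt_001"
echo "stdout content" > "$fixture/userlogs/attempt_001/stdout"
ln -s "$fixture/userlogs/attempt_001/stdout" "$fixture/userlogs/latest"
archive_logs "$fixture/userlogs" "$fixture/archive"
```

Run periodically from cron, this already covers "recursive + follow symlinks"; Flume only becomes interesting if you want continuous shipping into HDFS.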
Is Flume the right tool for this,
e.g. with the Spooling Directory Source?
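If Flume turns out to fit, a minimal agent using the Spooling Directory Source feeding an HDFS sink might look like the sketch below. The agent/component names and the HDFS path are assumptions; one caveat to verify for your case is that the Spooling Directory Source expects files to be immutable once they appear in the spool directory, so actively written task logs would need to be rotated or copied in first:

```properties
# Hedged sketch of a Flume agent; names and paths are placeholders.
agent.sources = spool
agent.channels = mem
agent.sinks = hdfs

agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /var/log/hadoop-0.20-mapreduce/history
agent.sources.spool.channels = mem

agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

agent.sinks.hdfs.type = hdfs
agent.sinks.hdfs.channel = mem
agent.sinks.hdfs.hdfs.path = hdfs://namenode/archive/jobtracker/%Y-%m-%d
agent.sinks.hdfs.hdfs.fileType = DataStream
# spooldir events carry no timestamp header by default; use local time
# so the %Y-%m-%d escapes in hdfs.path resolve
agent.sinks.hdfs.hdfs.useLocalTimeStamp = true
```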