I'm not sure whether this applies in your case, but I'm sharing it because
I found it interesting.
I saw a cluster where the Hadoop installation directory is on NFS and is
mounted at the same path on all the slaves. This has two advantages:
1) Logs are all written to the same NFS share, so there is no need for
aggregation.
2) Hadoop upgrades become easy, as we just need to update the tar in one
location; basically, maintenance becomes easy.
But make sure you configure the data directories to be on the local disks
of the slaves, otherwise they end up writing everything to NFS!
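
To make the last point a bit more concrete, here is a rough Python sketch of
a check one could run on each slave to see whether a directory sits on an NFS
mount. It only parses /proc/mounts-style text, so it assumes Linux. The
function names are my own and the directory paths in the usage note are
hypothetical; in Hadoop 1.x the real ones would normally be whatever is set
in dfs.data.dir and mapred.local.dir.

```python
import os

# Filesystem types treated as NFS (an assumption; extend if your site differs).
NFS_TYPES = ("nfs", "nfs4")


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in mounts:
        mp = mount_point.rstrip("/") or "/"
        inside = mp == "/" or path == mp or path.startswith(mp + "/")
        # ">=" lets a later mount on the same mount point override an earlier one.
        if inside and len(mp) >= len(best_mount):
            best_mount, best_type = mp, fs_type
    return best_type


def dirs_on_nfs(dirs, mounts_text):
    """Return the subset of dirs whose containing mount is an NFS mount."""
    mounts = parse_mounts(mounts_text)
    return [d for d in dirs if fs_type_for(d, mounts) in NFS_TYPES]


def dirs_on_nfs_from_file(dirs, mounts_path="/proc/mounts"):
    """Same check, reading the mount table from mounts_path."""
    with open(mounts_path) as f:
        return dirs_on_nfs(dirs, f.read())
```

For example, dirs_on_nfs_from_file(["/data/1/dfs/dn", "/data/1/mapred/local"])
should return an empty list on a correctly configured slave; anything it
returns is a directory that would end up writing to NFS.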
On Thu, Nov 22, 2012 at 1:01 AM, Dino Kečo <[EMAIL PROTECTED]> wrote:
> We had a similar requirement, and we built a small Java application that gets
> information about the task nodes from the JobTracker and downloads the logs
> into one file using the URL of each task tracker.
> For huge logs this becomes slow and time-consuming.
> Hope this helps.
> Dino Kečo
> msn: [EMAIL PROTECTED]
> mail: [EMAIL PROTECTED]
> skype: dino.keco
> phone: +387 61 507 851
> On Wed, Nov 21, 2012 at 7:55 PM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>> When we run a MapReduce job, the logs are stored on all the tasktrackers.
>> Is there an easy way to aggregate all those logs together and see them
>> in a single place instead of going to the tasks one by one and opening
>> the files?
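
For reference, the download-and-concatenate approach Dino describes could be
sketched roughly as follows. This is not his Java application, just a minimal
Python illustration of the same idea. It assumes you already have the list of
per-task log URLs (looking them up from the JobTracker is not shown), and the
function name and the opener parameter are my own.

```python
import shutil
from urllib.request import urlopen


def aggregate_logs(log_urls, out_path, opener=urlopen):
    """Append the body of every URL in log_urls to out_path.

    Returns the list of URLs that could not be fetched.
    """
    failed = []
    with open(out_path, "wb") as out:
        for url in log_urls:
            # A header line so each log can be traced back to its source.
            out.write(("===== %s =====\n" % url).encode("utf-8"))
            try:
                with opener(url) as resp:
                    # Stream in chunks so a huge log is never held in memory.
                    shutil.copyfileobj(resp, out)
            except OSError as exc:  # urllib's URLError is a subclass of OSError
                failed.append(url)
                out.write(("(fetch failed: %s)\n" % exc).encode("utf-8"))
            out.write(b"\n")
    return failed
```

The fetches are sequential, so this has the same drawback Dino mentions: with
huge logs or many task trackers it will be slow.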