Re: Is there a way to turn off MAPREDUCE-2415?
Hey Harsh,
Thanks for responding!
Would the per-task logging limit set via mapred.userlog.limit.kb be
strictly enforced while the job is running? That would solve my issue of
runaway logging from a job filling up the datanode disks. I would set the
limit high, since in general I do want to retain logs, just not when a
single rogue job starts producing many gigabytes of logs.

On Sun, Aug 26, 2012 at 1:44 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi Koert,
> To answer on point, there is no turning off this feature.
> Since you don't seem to care much about logs from tasks persisting,
> perhaps consider setting mapred.userlog.retain.hours to a value lower
> than 24 hours (such as 1h)? Or you may even limit the logging
> from each task to a certain number of KB via mapred.userlog.limit.kb,
> which is unlimited by default.
> Would either of these work for you?
> On Sun, Aug 26, 2012 at 11:02 PM, Koert Kuipers <[EMAIL PROTECTED]> wrote:
> > We have smaller nodes (4 to 6 disks), and we used to write logs to the
> > same disk as the OS. So if that disk goes, I don't really care about
> > tasktrackers failing. Also, the fact that logs were written to a single
> > partition meant that I could make sure they would not grow too large in
> > case someone had overly verbose logging on a large job. With
> > MAPREDUCE-2415, a job that does a massive amount of logging can fill up
> > all the mapred.local.dir directories, which in our case are on the same
> > partition as the HDFS data dirs, so now faulty logging can fill up HDFS
> > storage, which I really don't like. Any ideas?
> >
> >
> --
> Harsh J
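
For reference, the two properties discussed in this thread could be written out as property entries along the following lines. This is only a sketch: the property names come from the messages above, but the values are illustrative (1 hour is Harsh's example; the 1 GB cap is a made-up "high" limit), the helper names are hypothetical, and it assumes the entries would go into mapred-site.xml on the tasktrackers. It does not answer whether the limit is enforced while a task is still running.

```python
import xml.etree.ElementTree as ET

# Property names are taken from the thread; the values are assumptions.
USERLOG_SETTINGS = {
    # Harsh's example: keep task logs for 1 hour instead of 24.
    "mapred.userlog.retain.hours": "1",
    # Hypothetical "high" cap of 1 GB per task, expressed in KB.
    "mapred.userlog.limit.kb": "1048576",
}


def build_configuration(settings):
    """Return a <configuration> element with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def render(settings):
    """Serialize the settings as an XML string."""
    return ET.tostring(build_configuration(settings), encoding="unicode")


if __name__ == "__main__":
    print(render(USERLOG_SETTINGS))
```

Running the script prints a single-line configuration element whose property entries can be merged into an existing site file by hand.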