Hadoop >> mail # user >> Re: JobCache directory cleanup


Ivan Tretyakov 2013-01-11, 09:58
Re: JobCache directory cleanup
Hmm. Unfortunately, there is another config variable that may be affecting
this: keep.task.files.pattern

This is set to .* in the job.xml file you sent. That pattern matches every
task, so I suspect it is what keeps the files from being deleted. Can you
please remove it, assuming you did not set it intentionally?

Thanks
Hemanth
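
The check suggested above (and the earlier keep.failed.task.files check quoted below) could be scripted roughly as follows. This is only a sketch, assuming the usual Hadoop job.xml layout of <property><name>/<value> elements. The function names, the sample XML, and the sample task attempt ID are hypothetical and exist only for illustration; they are not taken from the job.xml attached to the thread.

```python
import re
import xml.etree.ElementTree as ET

# Properties discussed in this thread that can prevent jobcache cleanup.
KEEP_PROPERTIES = ("keep.failed.task.files", "keep.task.files.pattern")


def find_keep_settings(job_xml_text):
    """Return {name: value} for the keep.* properties present in a job.xml."""
    root = ET.fromstring(job_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in KEEP_PROPERTIES:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def blocks_cleanup(settings):
    """Heuristic (an assumption of this sketch): list settings that keep task files."""
    reasons = []
    if settings.get("keep.failed.task.files", "false").lower() == "true":
        reasons.append("keep.failed.task.files=true")
    pattern = settings.get("keep.task.files.pattern")
    if pattern:
        reasons.append("keep.task.files.pattern=" + pattern)
    return reasons


# Hypothetical job.xml fragment mirroring the settings described in the thread.
SAMPLE_JOB_XML = """<configuration>
  <property><name>keep.failed.task.files</name><value>false</value></property>
  <property><name>keep.task.files.pattern</name><value>.*</value></property>
</configuration>"""

# Hypothetical task attempt ID, used only to show that ".*" matches any name.
SAMPLE_TASK_ID = "attempt_201301110958_0001_m_000000_0"

if __name__ == "__main__":
    settings = find_keep_settings(SAMPLE_JOB_XML)
    print(blocks_cleanup(settings))
    print(bool(re.fullmatch(settings["keep.task.files.pattern"], SAMPLE_TASK_ID)))
```

Running find_keep_settings over the job.xml of each of the ~100 jobs would show quickly whether the pattern is set cluster-wide or only by some jobs.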

On Fri, Jan 11, 2013 at 3:28 PM, Ivan Tretyakov <[EMAIL PROTECTED]> wrote:

> Thanks for the replies!
>
> keep.failed.task.files is set to false.
> The config of one of the jobs is attached.
>
>
> On Fri, Jan 11, 2013 at 5:44 AM, Hemanth Yamijala <
> [EMAIL PROTECTED]> wrote:
>
>> Good point. Forgot that one :-)
>>
>>
>> On Thu, Jan 10, 2013 at 10:53 PM, Vinod Kumar Vavilapalli <
>> [EMAIL PROTECTED]> wrote:
>>
>>>
>>>
>>> Can you check the job configuration for these ~100 jobs? Do they have
>>> keep.failed.task.files set to true? If so, these files won't be deleted. If
>>> they don't, it could be a bug.
>>>
>>> Sharing your configs for these jobs will definitely help.
>>>
>>> Thanks,
>>> +Vinod
>>>
>>>
>>> On Wed, Jan 9, 2013 at 6:41 AM, Ivan Tretyakov <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hello!
>>>>
>>>> I've found that the jobcache directory has become very large on our
>>>> cluster, e.g.:
>>>>
>>>> # du -sh /data?/mapred/local/taskTracker/user/jobcache
>>>> 465G    /data1/mapred/local/taskTracker/user/jobcache
>>>> 464G    /data2/mapred/local/taskTracker/user/jobcache
>>>> 454G    /data3/mapred/local/taskTracker/user/jobcache
>>>>
>>>> And it stores information for about 100 jobs:
>>>>
>>>> # ls -1 /data?/mapred/local/taskTracker/persona/jobcache/ | sort | uniq | wc -l
>>>>
>>>
>>
>
>
> --
> Best Regards
> Ivan Tretyakov
>
> Deployment Engineer
> Grid Dynamics
> +7 812 640 38 76
> Skype: ivan.tretyakov
> www.griddynamics.com
> [EMAIL PROTECTED]
>
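
To see which jobs account for the space reported by du in the original message, the per-job totals across all data disks could be collected with something like the sketch below. The helper names are hypothetical, and the numbers are apparent file sizes from os.path.getsize, so they will not match du exactly (du reports allocated blocks).

```python
import os
import tempfile


def dir_size_bytes(path):
    """Sum the apparent sizes of regular files under path (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file may vanish while a TaskTracker is cleaning up.
                pass
    return total


def jobcache_usage(jobcache_dirs):
    """Return {job_id: bytes}, summed over every jobcache directory given."""
    usage = {}
    for jobcache in jobcache_dirs:
        if not os.path.isdir(jobcache):
            continue
        for job_id in os.listdir(jobcache):
            job_path = os.path.join(jobcache, job_id)
            if os.path.isdir(job_path):
                usage[job_id] = usage.get(job_id, 0) + dir_size_bytes(job_path)
    return usage


if __name__ == "__main__":
    # Small self-contained demo on a temporary directory with a made-up job ID.
    with tempfile.TemporaryDirectory() as root:
        job_dir = os.path.join(root, "job_demo_0001")
        os.makedirs(job_dir)
        with open(os.path.join(job_dir, "part"), "wb") as handle:
            handle.write(b"x" * 10)
        print(jobcache_usage([root]))
```

On a real node the input list could come from glob.glob("/data?/mapred/local/taskTracker/user/jobcache"), and sorting the result by size would show the largest leftover job directories first.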