Accumulo >> mail # user >> Cleanup of distcache after running map reduce jobs

Re: Cleanup of distcache after running map reduce jobs
Thanks John!

I ended up playing with some settings in mapred-site.xml,
namely mapreduce.tasktracker.local.cache.numberdirectories
and local.cache.size and that seems to have resolved our issue for the
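A minimal sketch of how these two properties could be set in mapred-site.xml. The property names are the ones mentioned above; the values are illustrative assumptions only, not the settings actually used, and would need tuning per cluster.

```xml
<!-- mapred-site.xml fragment; values below are placeholders (assumptions) -->
<configuration>
  <property>
    <!-- limit on the number of subdirectories kept in the local cache -->
    <name>mapreduce.tasktracker.local.cache.numberdirectories</name>
    <value>5000</value>
  </property>
  <property>
    <!-- size limit of the local distributed cache, in bytes (5 GB here) -->
    <name>local.cache.size</name>
    <value>5368709120</value>
  </property>
</configuration>
```

The TaskTrackers would most likely need a restart to pick up changes like these.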
On Thu, Mar 7, 2013 at 12:26 PM, John Vines <[EMAIL PROTECTED]> wrote:

> The cache will clear itself after 24 hours, if I remember correctly. I have
> hit this issue before and, provided you're hitting the same issue I've seen
> before, your options are to either:
> 1. up the number of inodes for your system
> 2. add accumulo to the child opts classpath via mapred-site.xml and then
> use the normal hadoop command to kick off your job instead of the
> accumulo/tool.sh script
> On Thu, Mar 7, 2013 at 12:37 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>> We noticed that after running several thousand map reduce jobs, our
>> file system was filling up.  The culprit is the libjars that are getting
>> uploaded to the distributed cache for each job - doesn't look like they're
>> ever being deleted.
>> Is there a mechanism to clear the distributed cache (or should this
>> happen automatically)?
>> This is probably a straight up hadoop question, but I'm asking here first
>> in case you've seen this sort of thing with accumulo before.
>> Thanks!
>> Mike