Mike Hugo 2013-03-07, 17:37
John Vines 2013-03-07, 18:26
Re: Cleanup of distcache after running map reduce jobs
Mike Hugo 2013-03-07, 20:18
Thanks John!
I ended up playing with some settings in mapred-site.xml, namely mapreduce.tasktracker.local.cache.numberdirectories and local.cache.size, and that seems to have resolved our issue for the moment.

Mike

On Thu, Mar 7, 2013 at 12:26 PM, John Vines <[EMAIL PROTECTED]> wrote:

> The cache will clear itself after 24 hours if I remember correctly. I have
> hit this issue before and, provided you're hitting the same issue I've seen
> before, your options are to either:
> 1. up the number of inodes for your system
> 2. add accumulo to the child opts classpath via mapred-site.xml and then
>    use the normal hadoop command to kick off your job instead of the
>    accumulo/tool.sh script
>
> On Thu, Mar 7, 2013 at 12:37 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>
>> We noticed that after running several thousand map reduce jobs that our
>> file system was filling up. The culprit is the libjars that are getting
>> uploaded to the distributed cache for each job - doesn't look like they're
>> ever being deleted.
>>
>> Is there a mechanism to clear the distributed cache (or should this
>> happen automatically)?
>>
>> This is probably a straight up hadoop question, but I'm asking here first
>> in case you've seen this sort of thing with accumulo before.
>>
>> Thanks!
>>
>> Mike
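
For anyone landing on this thread later, here is a rough sketch of what the mapred-site.xml change mentioned above might look like. The two property names are the ones named in the message; the values are purely illustrative assumptions, not the ones used on Mike's cluster, and the right numbers depend on your disk size and inode budget. The comments reflect my understanding of these Hadoop 1.x settings, so check them against the mapred-default.xml shipped with your version.

```xml
<!-- mapred-site.xml (fragment). Values are illustrative assumptions only. -->
<configuration>
  <property>
    <!-- Upper bound, in bytes, on the local distributed cache before
         unreferenced entries become eligible for cleanup (example: 2 GB). -->
    <name>local.cache.size</name>
    <value>2147483648</value>
  </property>
  <property>
    <!-- Upper bound on the number of cache subdirectories kept locally;
         a lower value should keep inode usage down (example value). -->
    <name>mapreduce.tasktracker.local.cache.numberdirectories</name>
    <value>5000</value>
  </property>
</configuration>
```

These are TaskTracker-side settings, so the TaskTrackers most likely need a restart before the new limits take effect.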