HDFS, mail # user - Re: Auto clean DistCache?


Re: Auto clean DistCache?
Jean-Marc Spaggiari 2013-03-27, 01:00
For the situation I faced, it was really a disk space issue, not related
to the number of files. The cache was being written to a small partition.

I will try with local.cache.size or
mapreduce.tasktracker.cache.local.size to see if I can keep the final
total size under 5GB... Otherwise, I will write a custom script to
delete all directories (and their contents) older than 2 or 3 days...
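
A minimal sketch of what such a cleanup script could look like, assuming
that a directory's age can be judged from its modification time and that no
running job still needs the directories being removed. The function name
purge_old_dirs and its parameters are made up for illustration; nothing
here is part of Hadoop.

```python
import shutil
import time
from pathlib import Path


def purge_old_dirs(root, max_age_days, dry_run=True, now=None):
    """Remove immediate subdirectories of `root` older than `max_age_days`.

    Age is taken from the directory's mtime (an assumption, see above).
    Returns the list of directories that were, or in dry-run mode would
    be, removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for entry in sorted(Path(root).iterdir()):
        # Only touch real directories; leave files and symlinks alone.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if entry.stat().st_mtime < cutoff:
            removed.append(entry)
            if not dry_run:
                shutil.rmtree(entry)  # deletes the directory and its content
    return removed
```

Run from cron, a call such as
purge_old_dirs("mapred/local/taskTracker/hadoop/distcache", 3, dry_run=False)
would keep roughly the last 3 days; the path has to be adjusted to the local
configuration. The default dry_run=True only lists the candidates, which is
worth checking before deleting anything for real.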

Thanks,

JM

2013/3/26 Abdelrahman Shettia <[EMAIL PROTECTED]>:
> Let me clarify: if there are lots of files or directories in those
> distributed cache dirs, up to 32K (depending on the OS configuration for
> the user's number of files), the OS will not be able to create any more
> files/dirs, and thus M-R jobs won't get initiated on those tasktracker
> machines. Hope this helps.
>
>
> Thanks
>
>
> On Tue, Mar 26, 2013 at 1:44 PM, Vinod Kumar Vavilapalli
> <[EMAIL PROTECTED]> wrote:
>>
>>
>> The files are never all open at the same time, so you shouldn't see
>> any "# of open files exceeds error".
>>
>> Thanks,
>> +Vinod Kumar Vavilapalli
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>> On Mar 26, 2013, at 12:53 PM, Abdelrhman Shettia wrote:
>>
>> Hi JM,
>>
>> Actually, these dirs need to be purged by a script that keeps the last 2
>> days' worth of files. Otherwise, you may run into a "# of open files
>> exceeds" error.
>>
>> Thanks
>>
>>
>> On Mar 25, 2013, at 5:16 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]>
>> wrote:
>>
>> Hi,
>>
>> Each time my MR job is run, a directory is created on the TaskTracker
>> under mapred/local/taskTracker/hadoop/distcache (based on my
>> configuration).
>>
>> I looked at the directory today, and it's hosting thousands of
>> directories and more than 8GB of data there.
>>
>> Is there a way to automatically delete this directory when the job is
>> done?
>>
>> Thanks,
>>
>> JM
>>
>