The filecrush tool has a small utility called Clean that accepts an
age argument and deletes all the files in a directory older than the given age.
We use Clean to clean up the tmp HDFS directories that applications leave behind.
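A cleanup along the same lines can be sketched with the plain `hadoop fs` CLI (this is not the filecrush Clean invocation itself; the directory, the age threshold, and the reliance on the `-ls` output format are assumptions):

```shell
#!/bin/sh
# Hypothetical sketch: remove HDFS paths older than AGE_DAYS days.
# SCRATCH_DIR and AGE_DAYS are placeholders, not filecrush options.
SCRATCH_DIR=${SCRATCH_DIR:-/tmp/hive-hduser}
AGE_DAYS=${AGE_DAYS:-2}

# Cutoff date in the same YYYY-MM-DD form that `hadoop fs -ls` prints
# in field 6 (requires GNU date for the -d flag).
cutoff=$(date -d "-${AGE_DAYS} days" +%Y-%m-%d)

# List entries, keep those whose modification date sorts before the
# cutoff, and remove each one recursively.
hadoop fs -ls "$SCRATCH_DIR" |
  awk -v cutoff="$cutoff" 'NF >= 8 && $6 < cutoff { print $8 }' |
  while read -r path; do
    hadoop fs -rm -r "$path"
  done
```

The string comparison works because ISO dates sort lexicographically; on systems without GNU `date -d`, the cutoff would have to be computed another way.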
On 6/1/12, Vinod Singh <[EMAIL PROTECTED]> wrote:
> Yes, that is how I do it. Though 1 month is too long; I keep it at just 2 days.
> On Fri, Jun 1, 2012 at 2:15 PM, Ruben de Vries
> <[EMAIL PROTECTED]>wrote:
>> So I should write a job which cleans up 1-month-old results, or something
>> like that?
>> From: Vinod Singh [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, June 01, 2012 10:35 AM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Hive scratch dir not cleaning up
>> Hive deletes job contents from the scratch directory on completion of the
>> job. Failed or killed jobs, though, leave data there, which needs to be
>> removed manually.
>> On Fri, Jun 1, 2012 at 1:58 PM, Ruben de Vries <[EMAIL PROTECTED]> wrote:
>> Hey Hivers,
>> I’m almost ready to replace our old Hadoop implementation with an
>> implementation using Hive.
>> Now I’ve run into (hopefully) my last problem: my /tmp/hive-hduser dir is
>> getting kinda big!
>> It doesn’t seem to clean up these tmp files; googling for it, I ran into
>> tickets about a cleanup setting. Should I enable this with the below?
>> Why doesn’t it do that by default? Am I the only one somehow racking up a
>> lot of space with tmp files?