Hive >> mail # user >> dfs storage full on all slave machines of 6 machine hive cluster


Re: dfs storage full on all slave machines of 6 machine hive cluster
Look into your hdfs-site.xml & mapred-site.xml conf files.

The *dfs.data.dir* property contains your actual HDFS data path; it is better
to avoid deleting anything from these directories.

*mapred.local.dir* contains temporary map-reduce job data; you can clean
this one.
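
To see which local directories the two properties above actually point to on a node, something like the following could be used. It is only a minimal sketch: the helper name read_hadoop_property is made up for illustration, and the location of your conf files (for example a conf/ directory under your Hadoop install) is an assumption you need to adjust.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(conf_path, name):
    """Return the directories listed for `name` in a Hadoop *-site.xml file.

    Hypothetical helper for illustration only. Returns a list of paths,
    or None when the property is not set in that file (in which case
    Hadoop falls back to its built-in default).
    """
    root = ET.parse(conf_path).getroot()  # the <configuration> element
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value", default="")
            # Both dfs.data.dir and mapred.local.dir may hold a
            # comma-separated list of directories.
            return [p.strip() for p in value.split(",") if p.strip()]
    return None


# Example call (the path is an assumption, adjust it to your install):
# read_hadoop_property("/path/to/conf/hdfs-site.xml", "dfs.data.dir")
# read_hadoop_property("/path/to/conf/mapred-site.xml", "mapred.local.dir")
```

The script only reads the conf files; it does not touch any data directory.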

"/mnt/hadoop-fs/dfs/data/current/" looks like your HDFS data path, which
means your Hive tables have grown to ~95% of your disk size. Try deleting
Hive tables or adding more disk. (Note that dropping an EXTERNAL Hive table
doesn't clear the data from HDFS.)
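
If you want to confirm on each slave how much of the disk a given directory takes up before deciding what to drop, a read-only check along these lines may help. This is a rough sketch, not an official Hadoop tool: the function names are invented for illustration, and it sums apparent file sizes, so the figure can differ slightly from what "du" or "df" report.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under `path` (read-only)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                total += os.lstat(file_path).st_size
            except OSError:
                # A file may disappear while a running cluster is being scanned.
                pass
    return total


def share_of_disk(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` used by `path`."""
    usage = shutil.disk_usage(path)  # total/used/free of the whole filesystem
    return dir_size_bytes(path) / usage.total


# Example call, using the paths mentioned in this thread:
# share_of_disk("/mnt/hadoop-fs/dfs/data/current")
# share_of_disk("/mnt/hadoop-fs/mapred/local")
```

This only measures; anything under the HDFS data path should still be removed through HDFS or Hive, not from the OS file system.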

Thanks,

On Mon, Mar 18, 2013 at 9:28 PM, Chunky Gupta <[EMAIL PROTECTED]> wrote:

> Hi Zhiwen,
>
> /mnt/hadoop-fs/mapred/local/taskTracker/
>
> Inside this folder there are folders with different user names. Can I
> delete these?
>
> I do not understand what the {*nouserdir*} you were talking about is. Can
> you please explain?
>
> Thanks,
> Chunky.
>
>
>
> On Mon, Mar 18, 2013 at 8:40 PM, Zhiwen Sun <[EMAIL PROTECTED]> wrote:
>
>> The folder "/mnt/hadoop-fs/dfs/data/current/" is the main folder of the
>> datanode in Hadoop.
>>
>> You can use *hadoop dfs -rmr {nouserdir}* to get more free space in HDFS.
>>
>> *Don't delete files directly in the OS file system.*
>>
>> Zhiwen Sun
>>
>>
>>
>> On Mon, Mar 18, 2013 at 6:48 PM, Manish Bhoge <[EMAIL PROTECTED]> wrote:
>>
>>> I think these directories belong to task tracker temporary storage. I am
>>> not confident enough to tell you to go ahead with your clean-up, so wait
>>> for a similar response or an expert's response.
>>>
>>>
>>>  ------------------------------
>>> * From: * Chunky Gupta <[EMAIL PROTECTED]>;
>>> * To: * <[EMAIL PROTECTED]>;
>>> * Subject: * dfs storage full on all slave machines of 6 machine hive
>>> cluster
>>> * Sent: * Mon, Mar 18, 2013 10:37:39 AM
>>>
>>>   Hi,
>>>
>>> We have a 6 machine hive cluster. Queries are failing with errors while
>>> they run. I found that storage is nearly full on all 5 slaves (96%, 98%,
>>> 100%, 97%, 98% storage used).
>>>
>>> On my slave machines, the folder "/mnt/hadoop-fs/dfs/data/current/"
>>> accounts for 95% of the storage used. It contains folders with names like
>>> "subdir0", "subdir1", etc., and under them there are many files with names
>>> like "blk_-4071357924681234567" and "blk_-4071357924681234567_246813.meta".
>>>
>>> I want to delete these subdir folders, but I am not sure whether that will
>>> affect the tables which I have created.
>>>
>>> Can anyone help me and tell me what these folders are used for?
>>>
>>> Thanks,
>>> Chunky.
>>>
>>
>>
>
--
Alok Kumar