Hive >> mail # user >> dfs storage full on all slave machines of 6 machine hive cluster


Re: dfs storage full on all slave machines of 6 machine hive cluster
Hi,

1. To find the size of a table in Hive:
hive> describe extended <*tableName*>;
** look for the *location* field in the output **
Then run *bin/hadoop dfs -du <hive-table-location>* from the $HADOOP_HOME
directory to get the table size.
(This won't hold true for EXTERNAL tables; I am also not sure how to get the
size of a whole database.)

2. HDFS stores data in a distributed manner, so it is difficult to tie a
table to an actual block location.
(A single table's data could be spread across all 5 nodes.)
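
As a rough sketch of both points, the two shell helpers below post-process
the output of standard Hadoop commands. They are assumptions layered on top
of this thread, not part of Hadoop or Hive: summarize_du and
blocks_per_datanode are hypothetical names, the first assumes each data line
of *hadoop dfs -du* starts with a byte count and ends with the path, and the
second assumes that *hadoop fsck <path> -files -blocks -locations* lists
datanodes as IP:port pairs. Output formats differ between Hadoop versions, so
check them on your cluster first.

```shell
#!/bin/sh

# Helper 1 (hypothetical name): summarize `hadoop dfs -du` output from stdin.
# Assumption: each data line starts with a size in bytes and ends with the
# path; header lines such as "Found 3 items" are skipped. Paths that contain
# spaces are not handled.
summarize_du() {
  awk '
    $1 ~ /^[0-9]+$/ {
      total += $1
      printf "%.2f GB\t%s\n", $1 / 1073741824, $NF
    }
    END { printf "%.2f GB\tTOTAL\n", total / 1073741824 }
  '
}

# Helper 2 (hypothetical name): count block replicas per datanode from
# `hadoop fsck <path> -files -blocks -locations` output read from stdin.
# Assumption: datanodes show up as IPv4:port pairs on the block lines.
# Uses `grep -o`, which GNU and BSD grep support.
blocks_per_datanode() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:[0-9]+' | sort | uniq -c | sort -rn
}
```

For example, *hadoop dfs -du /user/hive/warehouse | summarize_du* should print
one line per entry under the warehouse directory (assuming the default
hive.metastore.warehouse.dir, where each non-default database usually lives in
its own <name>.db directory), and *hadoop fsck <hive-table-location> -files
-blocks -locations | blocks_per_datanode* should show how many block replicas
of that table each datanode holds.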

Thanks,

On Tue, Mar 19, 2013 at 4:06 PM, Chunky Gupta <[EMAIL PROTECTED]> wrote:

> Thanks Alok, I deleted the mapred.local.dir folders.
> I have 2 more questions:
>
> 1. I have around 30 databases and each one contains many tables. Is
> there any way to find out the size of each database, or how much
> storage a particular table in a database occupies?
>
> 2. We have 5 slave nodes. How can I find which table's data is stored on
> which slave node?
>
> Thanks,
> Chunky.
>
>
>
> On Mon, Mar 18, 2013 at 10:16 PM, Alok Kumar <[EMAIL PROTECTED]> wrote:
>
>> Look into your hdfs-site.xml & mapred-site.xml conf files.
>>
>> The *dfs.data.dir* property contains your actual HDFS data path; better to
>> avoid deleting anything from these directories.
>>
>> *mapred.local.dir* contains temporary map-reduce job data; you can clean
>> this one.
>>
>> "/mnt/hadoop-fs/dfs/data/current/" looks like your HDFS data path, which
>> means your hive tables have grown to ~95% of your disk size. Try deleting
>> hive tables or adding more disk (dropping an EXTERNAL hive table doesn't
>> clear the data from HDFS).
>>
>> Thanks,
>>
>>
>> On Mon, Mar 18, 2013 at 9:28 PM, Chunky Gupta <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Zhiwen,
>>>
>>> /mnt/hadoop-fs/mapred/local/taskTracker/
>>>
>>> Inside this folder there are folders with different user names. Can I
>>> delete these?
>>>
>>> I do not understand what this {*nouserdir*} you were talking about is. Can
>>> you please explain?
>>>
>>> Thanks,
>>> Chunky.
>>>
>>>
>>>
>>> On Mon, Mar 18, 2013 at 8:40 PM, Zhiwen Sun <[EMAIL PROTECTED]> wrote:
>>>
>>>> The folder "/mnt/hadoop-fs/dfs/data/current/" is the main folder of the
>>>> datanode in hadoop.
>>>>
>>>> You can use *hadoop dfs -rmr {nouserdir}* to get more free space in
>>>> HDFS.
>>>>
>>>> *Don't delete files directly in the OS file system.*
>>>>
>>>> Zhiwen Sun
>>>>
>>>>
>>>>
>>>> On Mon, Mar 18, 2013 at 6:48 PM, Manish Bhoge <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> I think these directories belong to task tracker temporary storage. I
>>>>> am not confident enough to tell you to go ahead with your cleanup, so
>>>>> wait for a similar response or an expert's response.
>>>>>
>>>>> Sent from HTC via Rocket! excuse typo.
>>>>>
>>>>>  ------------------------------
>>>>> * From: * Chunky Gupta <[EMAIL PROTECTED]>;
>>>>> * To: * <[EMAIL PROTECTED]>;
>>>>> * Subject: * dfs storage full on all slave machines of 6 machine hive
>>>>> cluster
>>>>> * Sent: * Mon, Mar 18, 2013 10:37:39 AM
>>>>>
>>>>>   Hi,
>>>>>
>>>>> We have a 6 machine hive cluster. We are getting errors while a query
>>>>> is running, and it fails. I found that on all 5 slaves storage is nearly
>>>>> full (96%, 98%, 100%, 97%, 98% storage used).
>>>>>
>>>>> On my slave machines, the folder "/mnt/hadoop-fs/dfs/data/current/"
>>>>> accounts for 95% of the storage used. It contains folders with names like
>>>>> "subdir0", "subdir1", etc., and under them there are many files with names
>>>>> like "blk_-4071357924681234567" and "blk_-4071357924681234567_246813.meta".
>>>>>
>>>>> I want to delete these subdir folders, but I am not sure whether that will
>>>>> affect the tables which I have created.
>>>>>
>>>>> Can anyone help me and tell me what these folders are used for?
>>>>>
>>>>> Thanks,
>>>>> Chunky.
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Alok Kumar
>>
>
>
--
Alok Kumar