-
dfs storage full on all slave machines of 6 machine hive cluster
Chunky Gupta 2013-03-18, 10:37
Hi,
We have a 6-machine Hive cluster. Queries are failing with errors while they run. I found that storage on all 5 slaves is nearly full (96%, 98%, 100%, 97%, 98% used).
On my slave machines, the folder "/mnt/hadoop-fs/dfs/data/current/" accounts for 95% of the storage used. It contains folders named "subdir0", "subdir1", etc., and under them there are many files with names like "blk_-4071357924681234567" and "blk_-4071357924681234567_246813.meta".
I want to delete these subdir folders, but I am not sure whether that would affect the tables I have created.
Can anyone help me and tell me what these folders are used for?
Thanks, Chunky.
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Manish Bhoge 2013-03-18, 10:48
I think these directories belong to the task tracker's temporary storage, but I am not confident enough to tell you to go ahead with the clean-up. Wait for a similar answer or an expert's response.
Sent from HTC via Rocket! excuse typo.
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Zhiwen Sun 2013-03-18, 15:10
The folder "/mnt/hadoop-fs/dfs/data/current/" is the main data folder of the datanode in Hadoop.
You can use *hadoop dfs -rmr {nouserdir}* to free up space in HDFS.
*Don't delete files directly in the OS file system.*
Zhiwen Sun
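To see what is taking the space without touching anything, the block files can be tallied read-only. The following is a minimal Python sketch, assuming the datanode layout described in this thread ("blk_<id>" data files with matching ".meta" files under "subdirN" folders). The function name summarize_block_dir is hypothetical, and the script only reads file sizes; it never deletes or modifies anything.

```python
import os


def summarize_block_dir(root):
    """Walk a datanode data directory and tally block and .meta files.

    Read-only: nothing is modified or deleted.
    """
    stats = {"block_files": 0, "meta_files": 0, "other_files": 0, "total_bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable (the datanode may be running).
                continue
            stats["total_bytes"] += size
            if name.startswith("blk_") and name.endswith(".meta"):
                stats["meta_files"] += 1
            elif name.startswith("blk_"):
                stats["block_files"] += 1
            else:
                stats["other_files"] += 1
    return stats


if __name__ == "__main__":
    # Path taken from the thread; adjust it to your own datanode data directory.
    print(summarize_block_dir("/mnt/hadoop-fs/dfs/data/current/"))
```

A large block count here simply confirms that the space is held by HDFS data, which has to be freed through HDFS commands rather than by removing files on the local disk.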
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Chunky Gupta 2013-03-18, 15:58
Hi Zhiwen,
/mnt/hadoop-fs/mapred/local/taskTracker/
Inside this folder there are folders named after different users. Can I delete these?
I do not understand what the {*nouserdir*} you mentioned is. Can you please explain?
Thanks, Chunky.
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Alok Kumar 2013-03-18, 16:46
Look into your hdfs-site.xml and mapred-site.xml conf files.
The *dfs.data.dir* property contains your actual HDFS data path; avoid deleting anything from these directories.
*mapred.local.dir* contains temporary map-reduce job data; you can clean this one.
"/mnt/hadoop-fs/dfs/data/current/" looks like your HDFS data path, which means your Hive tables have grown to ~95% of your disk size. Try deleting Hive tables or adding more disk. (Dropping an EXTERNAL Hive table doesn't clear its data from HDFS.)
Thanks,
Alok Kumar
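The two properties mentioned above can be read out of the config files programmatically. This is a minimal Python sketch, assuming the standard Hadoop `<configuration>/<property>/<name>/<value>` XML layout; the helper name read_hadoop_properties and the sample contents are illustrative only. In a real installation dfs.data.dir normally lives in hdfs-site.xml and mapred.local.dir in mapred-site.xml, and they are combined into one snippet here for brevity.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text, wanted):
    """Return {property_name: [directories]} for the requested properties."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in wanted:
            value = (prop.findtext("value") or "").strip()
            # Both properties accept a comma-separated list of directories.
            found[name] = [p.strip() for p in value.split(",") if p.strip()]
    return found


# Illustrative contents only; the paths mirror those mentioned in the thread.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/mnt/hadoop-fs/dfs/data</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/mnt/hadoop-fs/mapred/local</value>
  </property>
</configuration>
"""

print(read_hadoop_properties(SAMPLE, {"dfs.data.dir", "mapred.local.dir"}))
```

To use it on a real cluster, pass the text of each conf file instead of SAMPLE. Knowing which directory is which makes it clear what is HDFS block storage and what is temporary job data.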
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Chunky Gupta 2013-03-19, 10:36
Thanks Alok, I deleted the mapred.local.dir folders. I have 2 more questions:
1. I have around 30 databases and each one contains many tables. Is there any way to find out the size of each database, or how much storage a particular table in a database is occupying?
2. We have 5 slave nodes. How can I find out which table's data is stored on which slave node?
Thanks, Chunky.
-
Re: dfs storage full on all slave machines of 6 machine hive cluster
Alok Kumar 2013-03-19, 18:32
Hi,
1. For a table's size in Hive, run:
   hive> describe extended <*tableName*>;
   Look for the *location* tag in the output, then run *bin/hadoop dfs -du <hive-table-location>* from the $HADOOP_HOME directory to get the table size. (This won't hold true for EXTERNAL tables; I am also not sure about the size of a whole database.)
2. HDFS stores data in a distributed manner, so it is difficult to get the actual block locations. (A single table's data could be spread across all 5 nodes.)
Thanks,
Alok Kumar
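The *hadoop dfs -du* approach above can be extended to rank many tables or database directories at once by parsing the command's output. This is a minimal Python sketch, not a definitive tool. It assumes each data line starts with a size in bytes and ends with the path, and that paths contain no spaces. The helper names parse_du and human are hypothetical, and the sample output, including the host name and database directories, is invented for illustration.

```python
def parse_du(output):
    """Parse `hadoop dfs -du` text into a list of (size_in_bytes, path)."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip headers such as "Found 2 items" and any line not starting with a size.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        rows.append((int(fields[0]), fields[-1]))
    return rows


def human(n):
    """Format a byte count using binary (1024-based) units."""
    size = float(n)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024


# Invented sample output; real paths depend on your warehouse location.
SAMPLE = """Found 2 items
1073741824  hdfs://namenode/user/hive/warehouse/logs.db
5368709120  hdfs://namenode/user/hive/warehouse/sales.db
"""

# Largest entries first.
for size, path in sorted(parse_du(SAMPLE), reverse=True):
    print(human(size), path)
```

On a cluster, the text would come from running the command against the Hive warehouse directory or a single table location and capturing its output. How much raw disk a table consumes also depends on the HDFS replication factor, so treat these numbers as logical sizes.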