Thread:
Rahul Bhattacharjee 2013-06-12, 07:28
Mayank 2013-06-13, 06:17
Mayank 2013-06-14, 11:09
Sandeep L 2013-06-14, 11:15
Sandeep L 2013-06-14, 12:42
Re: Application errors with one disk on datanode getting filled up to 100%
Thanks Sandeep.
Yes, that's correct. I was more interested to know about the uneven
distribution within the DN.

Thanks,
Rahul
On Fri, Jun 14, 2013 at 6:12 PM, Sandeep L <[EMAIL PROTECTED]> wrote:

> Rahul,
>
> In general, Hadoop tries to compute data locally. That is, if you run a
> MapReduce task on a particular input, Hadoop will try to process the data
> locally and write the output locally (this is what happens the majority of
> the time), then replicate it to other nodes.
>
> In your scenario the majority of your input data may be on a single
> datanode, so Hadoop is trying to write the output data to that same
> datanode.
>
> Thanks,
> Sandeep.
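
A quick way to check where the replicas of a given output file actually
landed is fsck. A minimal sketch, where /path/to/output is only a
placeholder for one of the files the application writes:

  # list every block of the file and the datanodes holding its replicas
  hadoop fsck /path/to/output -files -blocks -locations

If the client runs on a datanode, the first replica of each block should
show up on that same node, which is the local-write behaviour described
above.
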
> ------------------------------
> From: [EMAIL PROTECTED]
> Date: Fri, 14 Jun 2013 17:50:46 +0530
>
> Subject: Re: Application errors with one disk on datanode getting filled
> up to 100%
> To: [EMAIL PROTECTED]
>
>
> Thanks Sandeep,
>
> I was thinking that the overall HDFS cluster might get unbalanced over
> time and the balancer might be useful in that case.
> I was more interested to know why only one disk out of the 4 configured
> disks of the DN is getting all the writes. As per what I have read, writes
> should be done in a round-robin fashion, which should ideally lead to all
> the configured disks in the DN being similarly loaded.
>
> Not sure how the balancer is fixing this issue.
>
> Rgds,
> Rahul
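
On the round-robin question: the datanode rotates new block writes across
the directories listed in dfs.data.dir (dfs.datanode.data.dir on Hadoop
2.x), so if the disks started out unevenly filled, round robin alone will
not even them out. Later 2.x releases (HDFS-1804) also ship an
available-space-based volume choosing policy. A minimal hdfs-site.xml
sketch, assuming Hadoop 2.1+ and illustrative mount points; check the exact
property and class names against the release you run:

  <property>
    <name>dfs.datanode.data.dir</name>
    <!-- illustrative paths; one directory per physical disk -->
    <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>
  </property>
  <property>
    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
    <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
  </property>
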
>
>
>
> On Fri, Jun 14, 2013 at 4:45 PM, Sandeep L <[EMAIL PROTECTED]> wrote:
>
> Rahul,
>
> In general this issue happens sometimes in Hadoop. There is no exact
> reason for it.
> To mitigate it you need to run the balancer at regular intervals.
>
> Thanks,
> Sandeep.
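
A minimal sketch of the balancer run Sandeep suggests, assuming a Hadoop 2.x
install (on 1.x the equivalent command is "hadoop balancer"):

  # move blocks until every datanode is within 10% of the cluster-wide
  # average utilisation
  hdfs balancer -threshold 10

Note that the balancer evens out usage between datanodes; it does not move
blocks between the disks inside a single datanode, which is why it may not
address the single-full-disk symptom directly.
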
>
> ------------------------------
> Date: Fri, 14 Jun 2013 16:39:02 +0530
> Subject: Re: Application errors with one disk on datanode getting filled
> up to 100%
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
>
> No, as of this moment we have no idea about the reason for that behavior.
>
>
> On Fri, Jun 14, 2013 at 4:04 PM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
> Thanks Mayank. Any clue on why only one disk was getting all the writes?
>
> Rahul
>
>
> On Thu, Jun 13, 2013 at 11:47 AM, Mayank <[EMAIL PROTECTED]> wrote:
>
> So we did a manual rebalance (following the instructions at:
> http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F),
> reserved 30 GB of space for non-DFS usage via dfs.datanode.du.reserved,
> and restarted our apps.
>
> Things have been going fine till now.
>
> Keeping fingers crossed :)
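
For reference, the FAQ entry Mayank links describes moving block files
between a datanode's data directories by hand while the datanode is
stopped. A rough sketch, assuming /data/1 is the full disk, /data/2 an
emptier one, and the block id and subdirectory names are purely
illustrative (the exact on-disk layout depends on the Hadoop version):

  # 1. stop the datanode on that host before touching its block files,
  #    e.g. with hadoop-daemon.sh stop datanode

  # 2. move block/meta file pairs to the same relative subdirectory on the
  #    destination volume (create it first if needed)
  mkdir -p /data/2/dfs/dn/current/subdir10
  mv /data/1/dfs/dn/current/subdir10/blk_1073741901* \
     /data/2/dfs/dn/current/subdir10/

  # 3. start the datanode again: hadoop-daemon.sh start datanode

The dfs.datanode.du.reserved value Mayank mentions is specified in bytes
per volume, so 30 GB is roughly 32212254720.
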
>
>
> On Wed, Jun 12, 2013 at 12:58 PM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
> I have a few points to make; they may not be very helpful for the said
> problem.
>
> + The "All datanodes are bad" exception does not really point to a problem
> of the disk space being full.
> + hadoop.tmp.dir acts as the base location for other Hadoop-related
> properties; I am not sure if any particular directory is created
> specifically.
> + Only one disk getting filled looks strange. The other disks are also
> part of the configuration when formatting the NN.
>
> Would be interesting to know the reason for this.
> Please keep us posted.
>
> Thanks,
> Rahul
>
>
> On Mon, Jun 10, 2013 at 3:39 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
> From the snapshot, you have around 3TB available for writing data.
>
> Can you check the individual datanodes' storage health?
> As you said, you have 80 servers writing in parallel to HDFS; I am not
> sure whether that could be an issue.
> As suggested in past threads, you can do a rebalance of the blocks, but
> that will take some time to finish and will not solve your issue right
> away.
>
> You can wait for others to reply. I am sure there will be far better
> solutions from experts for this.
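
To check per-datanode and per-disk usage as suggested above, a small
sketch, assuming shell access to the cluster and illustrative mount points
(use "hadoop dfsadmin" instead of "hdfs dfsadmin" on 1.x releases):

  # cluster-wide view: configured capacity, DFS used and DFS remaining
  # reported per datanode
  hdfs dfsadmin -report

  # on the affected datanode itself, usage of each configured data disk
  df -h /data/1 /data/2 /data/3 /data/4
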
>
>
> On Mon, Jun 10, 2013 at 3:18 PM, Mayank <[EMAIL PROTECTED]> wrote:
>
> No, it's not a map-reduce job. We have a Java app running on around 80
> machines which writes to HDFS. The error that I'd mentioned is being
> thrown by the application, and yes, we have the replication factor set to 3 and following
Rahul Bhattacharjee 2013-06-14, 12:36