Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS >> mail # user >> Re: Application errors with one disk on datanode getting filled up to 100%


Copy link to this message
-
Re: Application errors with one disk on datanode getting filled up to 100%
Thanks Sandeep.
Yes , thats correct , I was more interested to know about the uneven
distribution within the DN.

Thanks,
Rahul
On Fri, Jun 14, 2013 at 6:12 PM, Sandeep L <[EMAIL PROTECTED]>wrote:

> Rahul,
>
> In general most of the times Hadoop tries to compute data locally that is,
> if run a MapReduce task on particular input,
> Hadoop will try compute data locally and write data locally(Majority of
> times this will happen), replicate in other nodes.
>
> In your scenario majority of your input data may be from a single
> datanode, so Hadoop is trying to write output data to same datanode.
>
> Thanks,
> Sandeep.
> ------------------------------
> From: [EMAIL PROTECTED]
> Date: Fri, 14 Jun 2013 17:50:46 +0530
>
> Subject: Re: Application errors with one disk on datanode getting filled
> up to 100%
> To: [EMAIL PROTECTED]
>
>
> Thanks Sandeep,
>
> I was thinking that the overall hdfs cluster might get unbalanced over the
> time and balancer might be useful in that case.
> I was more interested to know why only one disk out of configured 4 disks
> of the DN is getting all the writes.As per whatever I have read , writes
> should be in round robin fashion , which should ideally lead to all the
> configured disks in the DN to be similarly loaded.
>
> Not sure how the balancer is fixing this issue.
>
> Rgds,
> Rahul
>
>
>
> On Fri, Jun 14, 2013 at 4:45 PM, Sandeep L <[EMAIL PROTECTED]>wrote:
>
> Rahul,
>
> In general this issue happens some times in Hadoop. There is no exact
> reason for this.
> To mitigate this you need to run balancer in regular intervals.
>
> Thanks,
> Sandeep.
>
> ------------------------------
> Date: Fri, 14 Jun 2013 16:39:02 +0530
> Subject: Re: Application errors with one disk on datanode getting filled
> up to 100%
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
>
> No, as of this moment we've no ideas about the reasons for that behavior.
>
>
> On Fri, Jun 14, 2013 at 4:04 PM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
> Thanks Mayank, Any clue on why was only one disk was getting all writes.
>
> Rahul
>
>
> On Thu, Jun 13, 2013 at 11:47 AM, Mayank <[EMAIL PROTECTED]> wrote:
>
> So we did a manual rebalance (followed instructions at:
> http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F)
> and also reserved 30 GB of space for non dfs usage via
> dfs.datanode.du.reserved and restarted our apps.
>
> Things have been going fine till now.
>
> Keeping fingers crossed :)
>
>
> On Wed, Jun 12, 2013 at 12:58 PM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
> I have a few points to make , these may not be very helpful for the said
> problem.
>
> +All data nodes are bad exception is kind of not pointing to the problem
> related to disk space full.
> +hadoop.tmp.dir acts as base location of other hadoop related properties ,
> not sure if any particular directory is created specifically.
> +Only one disk getting filled looks strange.The other disk are part while
> formatting the NN.
>
> Would be interesting to know the reason for this.
> Please keep posted.
>
> Thanks,
> Rahul
>
>
> On Mon, Jun 10, 2013 at 3:39 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>
> From the snapshot, you got around 3TB for writing data.
>
> Can you check individual datanode's storage health.
> As you said you got 80 servers writing parallely to hdfs, I am not sure
> can that be an issue.
> As suggested in past threads, you can do a rebalance of the blocks but
> that will take some time to finish and will not solve your issue right
> away.
>
> You can wait for others to reply. I am sure there will be far better
> solutions from experts for this.
>
>
> On Mon, Jun 10, 2013 at 3:18 PM, Mayank <[EMAIL PROTECTED]> wrote:
>
> No it's not a map-reduce job. We've a java app running on around 80
> machines which writes to hdfs. The error that I'd mentioned is being thrown
> by the application and yes we've replication factor set to 3 and following
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB