
Re: Bulk loading job failed when one region server went down in the cluster
Not sure why you're having trouble getting an answer.
Even if you're not a YARN expert, Google is your friend.


This is from a page of Tom White's "Hadoop: The Definitive Guide", 3rd Edition.

The bottom line...
The considerations for how much memory to dedicate to a node manager for running containers are similar to those discussed in “Memory” on page 307. Each Hadoop daemon uses 1,000 MB, so for a datanode and a node manager, the total is 2,000 MB. Set aside enough for other processes that are running on the machine, and the remainder can be dedicated to the node manager’s containers by setting the configuration property yarn.nodemanager.resource.memory-mb to the total allocation in MB. (The default is 8,192 MB.)

Taken per fair use. Page 323

As you can see, you need to drop this down to something like 1GB, if you even have enough memory for that.
Again, set yarn.nodemanager.resource.memory-mb to a more realistic value.
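
A rough sketch of that arithmetic in Python, to show why 8,192 MB cannot work here. The 1,000 MB per daemon comes from the excerpt above and the 3,200 MB node size approximates the 3.2GB VMs in this thread. The 200 MB OS reserve, the 1,000 MB region server heap, and the function name are my own assumptions, so plug in your real numbers.

```python
def container_memory_mb(node_total_mb, daemon_heaps_mb, reserved_mb):
    """Return the MB left over for YARN containers, never below zero."""
    remaining = node_total_mb - sum(daemon_heaps_mb) - reserved_mb
    return max(0, remaining)


NODE_TOTAL_MB = 3200        # approximate size of the 3.2GB VM in this thread
DAEMON_MB = 1000            # per-daemon figure quoted from the book
REGION_SERVER_MB = 1000     # assumed HBase region server heap
OS_RESERVE_MB = 200         # assumed headroom for the OS and other processes

# Datanode + node manager only.
without_hbase = container_memory_mb(
    NODE_TOTAL_MB, [DAEMON_MB, DAEMON_MB], OS_RESERVE_MB
)

# Datanode + node manager + HBase region server.
with_hbase = container_memory_mb(
    NODE_TOTAL_MB, [DAEMON_MB, DAEMON_MB, REGION_SERVER_MB], OS_RESERVE_MB
)

print("candidate yarn.nodemanager.resource.memory-mb without HBase:", without_hbase)
print("candidate yarn.nodemanager.resource.memory-mb with HBase:", with_hbase)
```

Under these assumed numbers about 1,000 MB is left for containers without a region server, and nothing is left once HBase runs on the same node.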

8GB on a 3GB node? Yeah, that would really hose you, especially if you're trying to run HBase too.

Even here... You really don't have enough memory to do it all. (Maybe enough to do a small test)

Good luck.

On Aug 13, 2012, at 3:24 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi Mike,
> Here is the link to my email on the Hadoop list regarding the YARN problem:
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201208.mbox/%3CCAF1+Vs8oF4VsHbg14B7SGzBB_8Ty7GC9Lw3nm1bM0v+[EMAIL PROTECTED]%3E
> Somehow the link to the Cloudera mailing list thread in my last email does not seem to work.
> Here is the new link:
> https://groups.google.com/a/cloudera.org/forum/?fromgroups#!searchin/cdh-user/yarn$20anil/cdh-user/J564g9A8tPE/ZpslzOkIGZYJ%5B1-25%5D
> Thanks for your help,
> Anil Gupta
> On Mon, Aug 13, 2012 at 1:14 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>> Hi Mike,
>> I tried doing that by setting properties in mapred-site.xml, but YARN
>> doesn't seem to work with the "mapreduce.tasktracker.map.tasks.maximum"
>> property. Here is a reference to a discussion of the same problem:
>> https://groups.google.com/a/cloudera.org/forum/?fromgroups#!searchin/cdh-user/yarn$20anil/cdh-user/J564g9A8tPE/ZpslzOkIGZYJ[1-25]
>> I have also posted about the same problem on the Hadoop mailing list.
>> I already admitted in my previous email that YARN has major issues
>> when we want to control it in a low-memory environment. I was just trying to
>> get the views of HBase experts on bulk load failures, since we will be relying
>> heavily on fault tolerance.
>> If the HBase bulk loader is tolerant of an RS failure in a viable
>> environment, then I don't have any issue. I hope this clears up my purpose
>> in posting on this topic.
>> Thanks,
>> Anil
>> On Mon, Aug 13, 2012 at 12:39 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>> Anil,
>>> Do you know what happens when you have an airplane that has too heavy a
>>> cargo when it tries to take off?
>>> You run out of runway and you crash and burn.
>>> Looking at your post, why are you starting 8 map processes on each slave?
>>> That's tunable and you clearly do not have enough memory in each VM to
>>> support 8 slots on a node.
>>> Here you swap, and when you swap you cause HBase to crash and burn.
>>> 3.2GB of memory means no more than 1 slot per slave, and even then...
>>> you're going to be very tight. Not to mention that you will need to loosen
>>> up on your timings, since it's all virtual and you have way too much I/O per
>>> drive going on.
>>> My suggestion is that you go back and tune your system before thinking
>>> about running anything.
>>> HTH
>>> -Mike
>>> On Aug 13, 2012, at 2:11 PM, anil gupta <[EMAIL PROTECTED]> wrote: