HBase, mail # user - Bulk loading job failed when one region server went down in the cluster


anil gupta 2012-03-30, 23:05
Kevin Odell 2012-03-31, 01:05
anil gupta 2012-03-31, 01:24
Kevin Odell 2012-04-03, 14:34
anil gupta 2012-04-03, 16:12
anil gupta 2012-08-07, 17:59
Kevin Odell 2012-08-13, 13:51
Michael Segel 2012-08-13, 13:58
anil gupta 2012-08-13, 19:11
Michael Segel 2012-08-13, 19:39
anil gupta 2012-08-13, 20:14
anil gupta 2012-08-13, 20:24
Michael Segel 2012-08-14, 00:17
Re: Bulk loading job failed when one region server went down in the cluster
anil gupta 2012-08-14, 01:05
Hi Mike,

You hit the nail on the head: I need to lower the memory by setting
yarn.nodemanager.resource.memory-mb, and that is exactly where another major
YARN bug comes in. I already tried setting that property to 1500 MB in
yarn-site.xml and yarn.app.mapreduce.am.resource.mb to 1000 MB in
mapred-site.xml. With that change the YARN job does not run at all, even
though the configuration is correct. It's a bug and I have to file a JIRA for
it. So I was left with no option but to run with the incorrect YARN
configuration, since my objective is to load data into HBase rather than to
fight with YARN. MapReduce is only used for bulk loading in my cluster.
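
For reference, the change I tried looks roughly like this. It is just a
sketch showing the two properties mentioned above with the values from my
test; everything else in the files stays as it is, and the numbers are not a
recommendation:

In yarn-site.xml:

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1500</value>
  </property>

In mapred-site.xml:

  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1000</value>
  </property>

By the rule of thumb Mike quotes below (about 1,000 MB per Hadoop daemon), a
3 GB node running a datanode and a node manager only has roughly 1 GB left
for containers, so even 1500 MB is on the high side once the region server is
counted as well.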

Here is a link to the mailing list email regarding running YARN with lesser
memory:
http://permalink.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33164

It would be great if you could answer this simple question of mine: is HBase
bulk loading fault tolerant to region server failures in a decent, properly
sized environment?
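
For context, by "bulk loading" I mean the usual two-step HFile flow. A rough
sketch follows; the hbase.jar path, table name, column mapping and paths are
just placeholders, not my actual job:

  # Step 1: a MapReduce job writes HFiles instead of issuing Puts
  hadoop jar /path/to/hbase.jar importtsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
    -Dimporttsv.bulk.output=/tmp/bulk_hfiles mytable /user/anil/input

  # Step 2: hand the finished HFiles over to the region servers
  hadoop jar /path/to/hbase.jar completebulkload /tmp/bulk_hfiles mytable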

Thanks,
Anil Gupta

On Mon, Aug 13, 2012 at 5:17 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Not sure why you're having an issue in getting an answer.
> Even if you're not a YARN expert,  google is your friend.
>
> See:
>
> http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA323&lpg=PA323&dq=Hadoop+YARN+setting+number+of+slots&source=bl&ots=i7xQYwQf-u&sig=ceuDmiOkbqTqok_HfIr3udvm6C0&hl=en&sa=X&ei=8JYpUNeZJMnxygGzqIGwCw&ved=0CEQQ6AEwAQ#v=onepage&q=Hadoop%20YARN%20setting%20number%20of%20slots&f=false
>
> This is a web page from Tom White's 3rd Edition.
>
> The bottom line...
> -=-
> The considerations for how much memory to dedicate to a node manager for
> running containers are similar to those discussed in “Memory” on page 307.
> Each Hadoop daemon uses 1,000 MB, so for a datanode and a node manager, the
> total is 2,000 MB. Set aside enough for other processes that are running on
> the machine, and the remainder can be dedicated to the node manager’s
> containers by setting the configuration property
> yarn.nodemanager.resource.memory-mb to the total allocation in MB.
> (The default is 8,192 MB.)
> -=-
>
> Taken per fair use. Page 323
>
> As you can see you need to drop this down to something like 1GB if you
> even have enough memory for that.
> Again set yarn.nodemanager.resource.memory-mb to a more realistic value.
>
> 8GB on a 3 GB node? Yeah that would really hose you, especially if you're
> trying to run HBase too.
>
> Even here... You really don't have enough memory to do it all. (Maybe
> enough to do a small test)
>
>
>
> Good luck.
>
> On Aug 13, 2012, at 3:24 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>
>
> > Hi Mike,
> >
> > Here is the link to my email on Hadoop list regarding YARN problem:
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201208.mbox/%3CCAF1+Vs8oF4VsHbg14B7SGzBB_8Ty7GC9Lw3nm1bM0v+[EMAIL PROTECTED]%3E
> >
> > Somehow the link to the Cloudera mail in my last email does not seem to work.
> > Here is the new link:
> >
> https://groups.google.com/a/cloudera.org/forum/?fromgroups#!searchin/cdh-user/yarn$20anil/cdh-user/J564g9A8tPE/ZpslzOkIGZYJ%5B1-25%5D
> >
> > Thanks for your help,
> > Anil Gupta
> >
> > On Mon, Aug 13, 2012 at 1:14 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Mike,
> >>
> >> I tried doing that by setting properties in mapred-site.xml, but YARN
> >> does not seem to honor the "mapreduce.tasktracker.map.tasks.maximum"
> >> property. Here is a reference to a discussion of the same problem:
> >>
> >>
> https://groups.google.com/a/cloudera.org/forum/?fromgroups#!searchin/cdh-user/yarn$20anil/cdh-user/J564g9A8tPE/ZpslzOkIGZYJ[1-25]
> >> I have also posted about the same problem in Hadoop mailing list.
> >>
> >> I already admitted in my previous email that YARN has major issues when
> >> we try to control it in a low-memory environment. I was just trying to
> >> get the views of HBase experts on bulk load failures, since we will be
> >> relying heavily on fault tolerance.
Thanks & Regards,
Anil Gupta
Michael Segel 2012-08-14, 01:59
Stack 2012-08-15, 21:52
anil gupta 2012-08-15, 22:13