|
Jay Wilson
2012-07-01, 21:05
Mohammad Tariq
2012-07-01, 23:14
yuzhihong@...
2012-07-01, 23:21
Andrew Purtell
2012-07-02, 00:41
Michael Segel
2012-07-02, 05:11
Lars George
2012-07-02, 07:41
Stack
2012-07-02, 07:51
Michael Segel
2012-07-02, 11:40
Mohammad Tariq
2012-07-02, 11:53
Michael Segel
2012-07-02, 14:46
Andrew Purtell
2012-07-02, 18:13
Amandeep Khurana
2012-07-02, 18:25
Jay Wilson
2012-07-02, 20:12
Suraj Varma
2012-07-02, 21:43
Jay Wilson
2012-07-02, 22:55
Suraj Varma
2012-07-02, 23:43
Jay Wilson
2012-07-03, 00:51
Suraj Varma
2012-07-03, 01:13
|
-
HBASE -- Regionserver and QuorumPeer ?Jay Wilson 2012-07-01, 21:05
Can a regionserver and quorumpeer reside on the same node?
-
Re: HBASE -- Regionserver and QuorumPeer ?Mohammad Tariq 2012-07-01, 23:14
Not necessarily. Both are totally different processes. In a Hadoop
cluster, the HBase Master and a ZooKeeper quorum peer typically run on one machine, and the regionservers are spread across the cluster. But this totally depends on you.

Regards,
Mohammad Tariq

On Mon, Jul 2, 2012 at 2:35 AM, Jay Wilson <[EMAIL PROTECTED]> wrote:
> Can a regionserver and quorumpeer reside on the same node?
-
Re: HBASE -- Regionserver and QuorumPeer ?yuzhihong@... 2012-07-01, 23:21
Yes.

On Jul 1, 2012, at 2:05 PM, Jay Wilson <[EMAIL PROTECTED]> wrote:
> Can a regionserver and quorumpeer reside on the same node?
-
Re: HBASE -- Regionserver and QuorumPeer ?Andrew Purtell 2012-07-02, 00:41
On Sun, Jul 1, 2012 at 2:05 PM, Jay Wilson <[EMAIL PROTECTED]> wrote:
> Can a regionserver and quorumpeer reside on the same node?

It can, but you want to consider how disk is allocated in the cluster.

A typical and recommended configuration is HBase RegionServer and HDFS DataNode colocated on the nodes. The DataNode will use locally attached disk to store and serve blocks.

A ZooKeeper quorum peer must record transactions to its local log before acking writes as part of the agreement protocol. Therefore you will want to dedicate a storage device to this log, independent of other use, to minimize latency. If you are also putting a quorum peer alongside a RegionServer and a DataNode, then you lose one block device to it. Otherwise, during periods of heavy filesystem I/O the latency of ZooKeeper writes may become quite large. Often heavy filesystem I/O and ZooKeeper write demand coincide with HBase region transitions or node failure recovery, so you are impacted most by this when you would least want to be.

IMO, it is better to run a separate ZooKeeper ensemble and point HBase to it. It is then also available as an independent coordination service for your applications, because HBase use of it will mostly be light.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
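
Andy's advice about a dedicated device can be checked mechanically. The following is a minimal sketch, not an official tool: it assumes the ZooKeeper transaction log directory is whatever `dataLogDir` points to in the ZooKeeper configuration, and the function names are hypothetical. It compares filesystem device IDs, so two partitions on one physical disk would still look "separate"; treat a clean result as necessary, not sufficient.

```python
import os
from typing import Iterable, List


def same_device(path_a: str, path_b: str) -> bool:
    """Return True when both paths sit on the same filesystem device (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def shared_with_zk_log(zk_log_dir: str, other_dirs: Iterable[str]) -> List[str]:
    """List the directories that share a device with the ZooKeeper log directory.

    zk_log_dir and other_dirs are supplied by the caller; nothing here reads
    the ZooKeeper or HDFS configuration files (that part is left out on purpose).
    """
    return [d for d in other_dirs if same_device(zk_log_dir, d)]
```

A call such as `shared_with_zk_log("/zk/txlog", ["/data/1/dfs", "/data/2/dfs"])` (paths made up for illustration) would return the DataNode directories that compete with the ZooKeeper log for the same device; an empty list is the outcome Andy recommends.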
-
Re: HBASE -- Regionserver and QuorumPeer ?Michael Segel 2012-07-02, 05:11
I'm sorry, I'm losing it.

Running RS on a machine where DN isn't running? Then the RS can't store its regions locally. I'm not sure that would ever be a good idea or recommended.

I thought the initial question was about running ZK on the same node as an RS, which isn't a good idea and is a recipe for failure....

Following KISS is a much better way of life than taking Crystal Meth. It's one way to avoid those nasty 'dead hooker problems'. *

*<rant>
<explanation>
Just to explain KISS and what I mean by a 'dead hooker' problem...

KISS = Keep It Simple Stupid
This is an engineering principle used to teach engineering students that the best solutions are the ones that are straightforward, and that if you attempt to get too clever, you always get some sort of blowback in your face. It usually hurts and it's always self-inflicted.

'dead hooker problems' are the theoretical problems of how to get rid of the dead hooker from your hotel room after your party of Hookers, Booze and either Crystal Meth or Cocaine goes terribly wrong and you wake up the next morning with a nasty hangover and a dead body that you have to get out of your hotel room before the cleaning ladies come knocking on your hotel room door. While I've never experienced this... I can't recall how many movies have this as a plot or subplot.

Not that I'm attempting to advocate drugs or killing hookers, unless it's with a typewriter or text editor when you want to write your next failed movie script.
</explanation>

So here's my rant...

I'm not picking on the OP, but in general there's a class of posts where the OP starts a thread by ignoring the common wisdom captured in books, blogs and Apache wikis when setting up a cluster.

When things don't work, they ultimately post here and wonder why they don't work.

The key to happiness is to not ignore the conventional wisdom and, when starting out with Hadoop, to follow the suggested setups. Remember that the key is to first grok Hadoop before you attempt more advanced things in terms of cluster configurations. That is what is meant by KISS. Accept that Hadoop is just a tool used by many to solve problems requiring a parallel framework.

Dead Hooker problems may be a great plot device, but in real life, when under a time crunch, they are something one should avoid. ;-)

</rant>

For those of you who don't appreciate my sense of humor, try another example... (Also note... I don't know how this will translate to a language other than English, so the meaning of this could be lost in translation...)

Your wife has invited a bunch of her co-workers, including her boss, over for a dinner. You, being the good spouse, are responsible for some of the meal prep. Rather than go with a tried and true recipe, you decide to try something new. And not only try a new recipe, you also decide to improvise and try new ingredients and do your own thing. Not really a good idea, and unless you are incredibly lucky, or a really good cook with a talent for creating new recipes, you are more than likely going to end up in the dog house.

Take it from a guy who usually lives in the dog house for one reason or another... following the recipes and not trying something new when the pressure for success is on... much less stress in your life. :-)

Again, with respect to Hadoop, there are a lot of moving parts where things can go wrong. I've got this drinking buddy named Murphy... you know the guy, he wrote this law... ;-)

HTH

-Mikey

On Jul 1, 2012, at 7:41 PM, Andrew Purtell wrote:
> A typical and recommended configuration is HBase RegionServer and HDFS
> DataNode colocated on the nodes. The DataNode will use locally
> attached disk to store and serve blocks.
-
Re: HBASE -- Regionserver and QuorumPeer ?Lars George 2012-07-02, 07:41
Hi Mike,

> Running RS on a machine where DN isn't running?

I am not following here. Andy said that both are on the same node. Where in this thread did someone imply something else? Just curious.

Cheers,
Lars

On Jul 2, 2012, at 7:11 AM, Michael Segel wrote:
> I'm sorry I'm losing it.
>
> Running RS on a machine where DN isn't running?
> So then the RS can't store its regions locally. Not sure if that would ever be a good idea or recommended.
-
Re: HBASE -- Regionserver and QuorumPeer ?Stack 2012-07-02, 07:51
On Mon, Jul 2, 2012 at 7:11 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> I'm sorry I'm losing it.

It's plain. Do us a favor and try keeping your psychotic breakdown to yourself going forward.

St.Ack
-
Re: HBASE -- Regionserver and QuorumPeer ?Michael Segel 2012-07-02, 11:40
Sorry St. Ack,

Which is why I said that I was losing it...

The entire quote was...
"On Sun, Jul 1, 2012 at 2:05 PM, Jay Wilson <[EMAIL PROTECTED]> wrote:
> Can a regionserver and quorumpeer reside on the same node?

It can, but you want to consider how disk is allocated in the cluster.

A typical and recommended configuration is HBase RegionServer and HDFS DataNode colocated on the nodes. The DataNode will use locally attached disk to store and serve blocks.
"

Looking at and parsing this you have two things...

1) Reading 'A typical and recommended configuration...' can imply that it's possible, while not recommended, to try to run an HBase RS without running a DN service on the same node.

2) "It can, but you want to consider how disk is allocated in the cluster."
Running on a single machine as a pseudo cluster is one thing; running a fully distributed cluster is another.

I am not finding fault with what Andy was saying. The problem is that we tend not to use stronger language when discussing these topics. And my point wasn't just about this topic but about other posts where we say 'not a good idea' yet someone still pursues the idea until there's a chorus of people saying not to do it. I'm not faulting the poster because he wasn't and isn't the only one who does this... We see it all the time where someone goes down the wrong path and is looking for a quick solution, rather than following the recommendation.

Now I'm not sure whether the problem was my KISS statement, my 'dead hooker' analogy, or my jokes about drugs.

KISS, I guess, goes back to when I first learned that term. It was a 200 level Engineering graphics course where the instructor mentioned KISS and then stalled on the second S (KIS == Keep it Simple) and used the term 'Stupid' to refer back to the engineer who didn't keep it simple. Of course he was the same Professor who couldn't figure out an algorithm without using a GOTO statement and got huffy when I made the mistake of correcting him in class. (But that's another story.) Not sure if it should be KIS or if the second S in KISS was for something else.

The 'dead hooker' analogy goes back to watching movie plots and subplots where the hero wakes up next to the body of a dead woman in bed. While in James Bond films it's the evil-turned-good hottie that gets it, I was thinking back to the Cameron Diaz flick 'Very Bad Things', a 1998 movie where the plot line is based on a prostitute getting killed at a bachelor party. Also for some reason the movie Barton Fink comes to mind, or the Great Gatsby.

And while I don't advocate drugs, that too is a reference to movies. It's the whole 'Airplane' spoofs where Lloyd Bridges talks about how today was a bad day for giving up <insert your favorite drug> ...

Sorry to sidetrack, but I thought I'd give a more detailed explanation ...

On Jul 2, 2012, at 2:51 AM, Stack wrote:
> On Mon, Jul 2, 2012 at 7:11 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>> I'm sorry I'm losing it.
>>
>
> Its plain. Do us a favor and try keeping your psychotic breakdown to
> yourself going forward.
>
> St.Ack
-
Re: HBASE -- Regionserver and QuorumPeer ?Mohammad Tariq 2012-07-02, 11:53
What kind of explanation is this???????????

Regards,
Mohammad Tariq

On Mon, Jul 2, 2012 at 5:10 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> Sorry St. Ack,
>
> Which is why I said that I was losing it...
-
Re: HBASE -- Regionserver and QuorumPeer ?Michael Segel 2012-07-02, 14:46
Well...

I wasn't sure if St.Ack was displeased by my comments on Andrew's response, or my references to KISS where the second S is stupid, the reference to 'dead hookers', or the reference to drugs. I was just covering my bases. :-)

With respect to Andrew's response, I wasn't sure if I was reading too much into it. Hence my opening remark that I may be losing it: I was probably reading something into his response that he may not have intended.

I guess it's a problem many of us have, myself included, where we are sometimes intentionally vague in our response. There are times when someone asks a question and the response is that they shouldn't do X: that while it's not a good idea to do something, it's still theoretically possible to do. In this case, running an RS and ZK on the same node. Yes, it could be done with the proper configuration where you isolate your disk I/O as much as possible between ZK and the RS. However, the better solution is to run the ZK along with the JT, NN, HM and even SN on the same node. (For a small dev cluster.)

Another case in point is that we see things taken out of context. As an example, there was a presentation by Facebook, I think, where they run their HBase on nodes where they don't run TT. In context, this could make a lot of sense when they are using HBase to deliver real-time responses to an app outside of the cluster, and are not using it as part of a M/R job. The problem is that someone sees this and takes it out of context, saying that FB does it and the best way to run HBase is to not run it on the same nodes you have TT running. (Data locality? Forget about it...) Note I don't believe that this is what the FB presentation was suggesting, except in their specific solution.

In another thread, someone was asking for help because they were having problems with their cluster. One node was in India, two were in the US. The response was along the lines that it's not a good thing to do this. While I agree with the response, I have to wonder if it shouldn't have been worded more strongly. We aren't saying it can't be done, we're saying that it's not something we'd recommend. I don't know if that's a strong enough response to really discourage an OP from actually doing it.

Is that a better explanation?

On Jul 2, 2012, at 6:53 AM, Mohammad Tariq wrote:
> What kind of explanation is this???????????
>
> Regards,
> Mohammad Tariq
-
Re: HBASE -- Regionserver and QuorumPeer ?Andrew Purtell 2012-07-02, 18:13
On Mon, Jul 2, 2012 at 4:40 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> I am not finding fault with what Andy was saying. The problem is that we tend not to use stronger language when discussing these topics. And my point wasn't just on this topic but others posts where we say 'not a good idea' yet someone still pursues the idea until there's a chorus of saying not to do something.

Hey Michael, your point about weasel words is not without merit.

However, I try to limit my use of strong language because I know I have a tendency toward strong opinions. (On some days I am more successful than others.) I find it is generally not appropriate to take a strong tone with users; the impression it leaves is that you are an asshole, and your community by extension. In this thread I think you are suffering that very effect.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: HBASE -- Regionserver and QuorumPeer ?Amandeep Khurana 2012-07-02, 18:25
As someone who has been developing/running/using the software for a longer period of time than the person who is asking the question, you can best serve the poster by making them aware of the trade-offs and why it's a good/bad idea to do things a certain way. At the end of the day, it's their choice to make based on their requirements and constraints.

Having said that, it'll be really nice to stop this thread from becoming more about how to answer questions rather than answering the question itself.

Bringing the thread back on track:

Jay, you can certainly run ZooKeeper peers alongside the DataNode and Region Server processes. The issue there (as highlighted by Andy earlier) is that you will likely load up the machine (primarily due to I/O), which will cause ZK some grief. It is generally recommended to collocate in the following groups:

- Datanode + Region Servers on the same physical nodes
- Zookeeper and HBase Master on the same physical nodes (make sure to give ZK a dedicated spindle)
- Namenode on an independent node
- Secondary Namenode on an independent node

These are the general recommendations, and different environments might warrant different decisions. For instance, if it's just a PoC or dev cluster where you don't really want to fret about SLAs and want to keep costs low, it might even be okay to collocate the Namenode, Zookeeper and HBase Master on the same physical host.

Hope that helps

-Amandeep

On Monday, July 2, 2012 at 4:40 AM, Michael Segel wrote:
> I am not finding fault with what Andy was saying. The problem is that we tend not to use stronger language when discussing these topics.
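
The grouping Amandeep lists can be turned into a small sanity check over a planned cluster layout. This is only an illustrative sketch: the host names, the role labels, and the `zk_colocation_warnings` helper are all made up for the example, and the only rule it encodes is the one stated in the thread (a ZooKeeper peer next to a DataNode or Region Server competes for disk I/O).

```python
# Roles that the thread identifies as heavy on local disk I/O.
IO_HEAVY = {"DataNode", "RegionServer"}


def zk_colocation_warnings(layout):
    """Return (host, clashing_roles) for hosts running ZooKeeper next to an I/O-heavy role."""
    warnings = []
    for host, roles in sorted(layout.items()):
        clash = IO_HEAVY & set(roles)
        if "ZooKeeper" in roles and clash:
            warnings.append((host, sorted(clash)))
    return warnings


# Hypothetical layout; none of these host names come from the thread.
layout = {
    "master-1": ["HMaster", "ZooKeeper"],
    "worker-1": ["DataNode", "RegionServer"],
    "worker-2": ["DataNode", "RegionServer", "ZooKeeper"],
    "nn-1": ["NameNode"],
}
print(zk_colocation_warnings(layout))
# prints [('worker-2', ['DataNode', 'RegionServer'])]
```

A warning here is not an error. As the post says, a PoC or dev cluster may accept the trade-off, ideally with a dedicated spindle for ZooKeeper.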
-
Re: HBASE -- Regionserver and QuorumPeer ?Jay Wilson 2012-07-02, 20:12
First, yep, I am a newbie to Hadoop/HBase. I have read both of the
O'Reilly books (Hadoop and HBase), so my knowledge level at this point is pure book learning, and understanding the log messages is very vexing.

Second, based on the recommendations of this mail-list I decided to move my HRegionservers to nodes other than where my HQuorumpeers are. I updated my regionservers file on every node in the cluster. I ran stop-hbase.sh, stop-all.sh, and cleaned up my zookeeper files. Then I ran start-all.sh, waited, and then ran start-hbase.sh. Now my HMaster and HRegionservers terminate within seconds. Before, I had them running for at least 30 minutes. The message is:

2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.18-194.el5
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=hadoop
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop/jscripts
2012-07-02 12:39:02,194 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=devrackA-03:2181,devrackA-05:2181,devrackA-04:2181 sessionTimeout=180000 watcher=master:60000
2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server devrackA-05/172.18.0.6:2181
2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

I tried the same sequence again (stop-hbase.sh, stop-all.sh, and cleaned up zookeeper), but I get the same result (Connection refused). Is there something else I need to do when I move a regionserver?

My zookeeper working directory is /home/hbase/zookeeper. Would there be other places that I need to clean up?

Thank You
--
Jay

On 7/2/2012 11:25 AM, Amandeep Khurana wrote:
> Jay, you can certainly run zookeepers with the Datanodes and Region Server processes. The issue there (as highlighted by Andy earlier) is that you will likely load up the machine (primarily due to I/O) which will cause ZK some grief. It is generally recommended to collocate in the following groups:
>
> Datanode + Region Servers on the same physical nodes
> Zookeeper and HBase Master on the same physical nodes (make sure to give ZK a dedicated spindle)
> Namenode on an independent node
> Secondary Namenode on an independent node
-
Re: HBASE -- Regionserver and QuorumPeer ?Suraj Varma 2012-07-02, 21:43
The error you are getting is:
> 2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server devrackA-05/172.18.0.6:2181
> 2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn: Session
> 0x0 for server null, unexpected error, closing socket connection and
> attempting reconnect
> java.net.ConnectException: Connection refused

This means this server is not able to reach the zookeeper. Did you
change your hbase-site.xml as well with the new zookeeper quorum?

Do basic connectivity testing to ensure that your hosts / DNS are all
in place after your relocations - check out
http://hbase.apache.org/book.html#d1952e311 and see if the DNS checker
tool might help.

--S
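The connectString in the HMaster log lists the quorum peers the client will try in turn. Below is a minimal sketch, assuming Python 3 is available on the node, of splitting such a string and probing each peer's client port. The helper names `parse_connect_string` and `probe` are made up for illustration and are not part of HBase or ZooKeeper; the sketch also assumes the string carries no chroot suffix.

```python
import socket


def parse_connect_string(connect_string, default_port=2181):
    """Split a ZooKeeper connect string into (host, port) pairs."""
    peers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.partition(":")
        peers.append((host, int(port) if sep else default_port))
    return peers


def probe(host, port, timeout=3.0):
    """Attempt a plain TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.gaierror:
        # The hostname did not resolve.
        return "dns-failure"
    except OSError:
        # Timeouts, missing routes and similar failures.
        return "unreachable"
```

Run from the node whose HMaster or HRegionServer is failing, something like `for host, port in parse_connect_string("devrackA-03:2181,devrackA-05:2181,devrackA-04:2181"): print(host, port, probe(host, port))` separates a refused port from a name that does not resolve.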
-
Re: HBASE -- Regionserver and QuorumPeer ?Jay Wilson 2012-07-02, 22:55
First, thank you.
I moved my HRegionservers, not my HQuorumPeers.

I have checked the network and everyone can talk to everyone. I can
even talk to my HQuorumPeers via "nc" from the nodes that should be
running my HMaster and my HRegionservers.

[hadoop@devrackA-00 ~]$ zookeeper-check
devrackA-03
imok
This ZooKeeper instance is not currently serving requests
This ZooKeeper instance is not currently serving requests

devrackA-04
imok
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.1:41582[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 5
Sent: 4
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
 /172.18.0.1:41583[0](queued=0,recved=1,sent=0)

devrackA-05
imok
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.1:35517[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 5
Sent: 4
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
 /172.18.0.1:35518[0](queued=0,recved=1,sent=0)

~~~~~~~~~~~~~~~~~~~~

[hadoop@devrackA-06 ~]$ jps
21276 Jps
20641 DataNode
[hadoop@devrackA-06 ~]$ echo ruok | nc devrackA-04 2181
imok[hadoop@devrackA-06 ~]$ echo stat | nc devrackA-04 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.7:37950[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 8
Sent: 7
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4

~~~~~~~~~~~~~~~~~~~

[hadoop@devrackB-07 ~]$ echo ruok | nc devrackA-04 2181
imok[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-03 2181
This ZooKeeper instance is not currently serving requests
[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-05 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.72:40784[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 7
Sent: 6
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-04 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.72:60795[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 10
Sent: 9
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
[hadoop@devrackB-07 ~]$

~~~~~~~~~~~

I know it says connection refused in the error, but are there files
associated with an HRegionServer that I need to clean up? I did NOT move
the HMaster or HQuorumPeers. I only moved the HRegionServers.

Thank you for the help.

---
Jay Wilson
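The `echo ruok | nc` and `echo stat | nc` checks above can also be scripted. Here is a rough sketch, assuming Python 3 and a ZooKeeper release that still answers four-letter commands on the client port (later releases may restrict them). `four_letter_word` and `parse_stat` are illustrative names, not ZooKeeper APIs.

```python
import socket

NOT_SERVING = "This ZooKeeper instance is not currently serving requests"


def four_letter_word(host, cmd, port=2181, timeout=3.0):
    """Send a four-letter command such as 'ruok' or 'stat' and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                # The server closes the connection once it has replied.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(reply):
    """Turn a 'stat' reply into a dict of its 'Key: value' lines."""
    if NOT_SERVING in reply:
        return {"serving": False}
    fields = {"serving": True}
    for line in reply.splitlines():
        key, sep, value = line.partition(": ")
        # Indented lines are client connections, not summary fields.
        if sep and not line.startswith(" "):
            fields[key] = value.strip()
    return fields
```

`parse_stat(four_letter_word("devrackA-04", "stat"))` would then give a dict with entries such as `Mode` and `Zxid`, while a peer that answers `ruok` but is not serving comes back as `{"serving": False}`.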
-
Re: HBASE -- Regionserver and QuorumPeer ?Suraj Varma 2012-07-02, 23:43
Ok - thanks for checking connectivity.
I presume you have already double-checked that the hbase-site.xml on your
region server points to the zookeeper and that hdfs-site.xml points
to the namenode.

I once got a similar error when HBase was picking up a stray
core-site.xml / hdfs-site.xml from the hdfs install or hbase-site.xml
from another hbase install (perhaps a stray local install).

If connectivity is all right, and you are getting connection refused,
I think your region server is picking up the wrong configuration file.
So - do a "locate" on the region server configuration files to see if
there are others on the box.

Just trying to eliminate basic setup issues ...
--Suraj
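Once "locate" has turned up every copy of a configuration file, the copies can be compared by the quorum they declare. This is a small sketch, assuming Python 3 and the usual Hadoop-style `<configuration><property>` layout. `parse_properties`, `read_properties` and `quorum_by_file` are hypothetical helpers, and `hbase.zookeeper.quorum` is the HBase property being compared.

```python
import xml.etree.ElementTree as ET


def parse_properties(xml_data):
    """Return name -> value pairs from Hadoop-style *-site.xml content."""
    props = {}
    for prop in ET.fromstring(xml_data).iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def read_properties(path):
    """Read a *-site.xml file from disk and parse its properties."""
    # Bytes are passed through so any XML encoding declaration is honoured.
    with open(path, "rb") as handle:
        return parse_properties(handle.read())


def quorum_by_file(paths, key="hbase.zookeeper.quorum"):
    """Map each config file to the quorum it declares (None if unset)."""
    return {path: read_properties(path).get(key) for path in paths}
```

Feeding `quorum_by_file` the paths reported by `locate hbase-site.xml` shows at a glance whether a stray copy names a different quorum, or none at all.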
-
Re: HBASE -- Regionserver and QuorumPeer ?Jay Wilson 2012-07-03, 00:51
When I do "locate hbase-site.xml", "locate hdfs-site.xml", and "locate
core-site.xml" there are 2 locations for each on the HRegionServers.
All files are either in $HADOOP_HOME/conf or $HBASE_HOME/conf, and there
are files of the same name in "example" directories.

I moved my HRegionServers back to my original nodes that also have the
HQuorumPeers on them. My HMaster and HRegionServers are now running
again. I suspect they will terminate after 30 minutes like they have
been doing, but at least they are running.

---
Jay Wilson
-
Re: HBASE -- Regionserver and QuorumPeer ?Suraj Varma 2012-07-03, 01:13
I think your devrackA-03 zookeeper is not quite "ok" - it doesn't seem
to be part of the quorum.
http://zookeeper-user.578899.n2.nabble.com/ZooKeeper-JMX-Monitoring-suggestion-td6681354.html

>>> [hadoop@devrackA-00 ~]$ zookeeper-check
>>> devrackA-03
>>> imok
>>> This ZooKeeper instance is not currently serving requests
>>> This ZooKeeper instance is not currently serving requests

Check its logs to see why it is not able to respond to the stat command.

>>> imok[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-03 2181
>>> This ZooKeeper instance is not currently serving requests

--Suraj
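To see at a glance which peers are actually serving, the per-peer results can be rolled up into one summary. The sketch below assumes the `Mode` of each peer has already been collected by hand or by script, with `None` standing for "not currently serving requests". `summarize_ensemble` is an illustrative name, and the majority rule is the usual ZooKeeper requirement that more than half of the configured peers be up.

```python
def summarize_ensemble(modes):
    """Summarize an ensemble from a host -> mode mapping.

    A mode is 'leader', 'follower' or 'standalone', or None when the peer
    answers but is not serving requests.
    """
    total = len(modes)
    needed = total // 2 + 1  # strict majority of the configured peers
    serving = sorted(host for host, mode in modes.items() if mode is not None)
    return {
        "serving": serving,
        "not_serving": sorted(h for h, m in modes.items() if m is None),
        "majority_needed": needed,
        "has_majority": len(serving) >= needed,
        "leaders": sorted(h for h, m in modes.items() if m == "leader"),
    }
```

With the modes posted in this thread (`devrackA-03` not serving, `devrackA-04` and `devrackA-05` both `follower`), the summary lists two serving peers out of the two needed, `devrackA-03` under `not_serving`, and an empty `leaders` list.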