-
HIVE Server multiple instances
V.Senthil Kumar 2011-05-02, 22:41
Hello,

I have one instance of the HIVE JDBC server running on port 10000. Can I run another instance on a different port? Would it cause a concurrency issue on the underlying data warehouse files? Please clarify.

Thanks,
V.Senthil Kumar
-
Re: HIVE Server multiple instances
Matthew Rathbone 2011-05-03, 14:46
Why would you want to run two? I think it is multithreaded, so you can query it from two different connections.

--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[EMAIL PROTECTED] | @rathboma | 4sq
-
Re: HIVE Server multiple instances
V.Senthil Kumar 2011-05-03, 17:40
Thanks Matthew. The wiki page http://wiki.apache.org/hadoop/Hive/HiveServer says it's single threaded.

I have a queue of queries that is added to dynamically all the time. By the time I have run one query over one JDBC connection, more queries have been added to the queue and a backlog builds up. That's why I was wondering whether I can run two or more instances to avoid a big backlog in the queue.
-
Re: HIVE Server multiple instances
Matthew Rathbone 2011-05-03, 17:59
Even if it is single threaded, it certainly seems to support multiple connections.

We run 5 workers, all connected at the same time, each executing a different query (with a different connection per worker).

Hope that helps

Matthew
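
A minimal sketch of the setup Matthew describes, in Python for illustration: several workers drain one shared queue, and each worker opens its own connection rather than sharing one. `FakeConnection`, `run_workers`, and the `connect` parameter are hypothetical stand-ins, not part of Hive or any Hive client; a real worker would open its own client connection to the HiveServer at that point. Unlike the queue in the original question, this one is filled once up front.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a real Hive client connection."""

    def __init__(self, worker_id):
        self.worker_id = worker_id

    def execute(self, query):
        # A real connection would send the query to the server here.
        return f"worker {self.worker_id} ran: {query}"


def run_workers(queries, num_workers=5, connect=FakeConnection):
    """Run every query once, spread over num_workers threads."""
    pending = queue.Queue()
    for q in queries:
        pending.put(q)

    results = []
    results_lock = threading.Lock()

    def worker(worker_id):
        # One connection per worker; connections are never shared.
        conn = connect(worker_id)
        while True:
            try:
                query = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            outcome = conn.execute(query)
            with results_lock:
                results.append(outcome)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_workers(["SELECT 1", "SELECT 2"], num_workers=2)` returns one result string per query; the order depends on thread scheduling.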
-
Re: HIVE Server multiple instances
V.Senthil Kumar 2011-05-03, 18:01
Thanks. That really helps and answers my question.
-
Re: HIVE Server multiple instances
Paul Ingles 2011-05-03, 18:15
HiveServer does seem to support multiple connections, but I think it still has thread-safety problems (https://issues.apache.org/jira/browse/HIVE-80).

We (www.forward.co.uk) have certainly had instability problems with the Thrift server in the past and now run 5 or so instances behind the HAProxy load balancer (http://haproxy.1wt.eu/). Since we did that it's been significantly better.

I think the JDBC server still uses Thrift to connect to the HiveServer, so I would expect it to have similar problems (but I may have got that wrong :)
-
Re: HIVE Server multiple instances
Matthew Rathbone 2011-05-03, 18:18
Hey Paul,
I'd be very interested in reading about your Hadoop/Hive setup. Do you have a blog post or anything describing it, or some of the issues you've had with Hive?
-
Re: HIVE Server multiple instances
V.Senthil Kumar 2011-05-03, 18:22
Thanks Paul. That is really useful information.
-
Re: HIVE Server multiple instances
Paul Ingles 2011-05-03, 19:01
Nothing specifically about our Hive setup, although some of us at Forward have blogged bits and pieces about Hive + Hadoop, and we have a few Hadoop/Hive related libs on our GitHub account: https://github.com/forward.

I've blogged a few bits (http://www.oobaloo.co.uk/), as has one of my colleagues (http://blog.fingertap.org/post/1255463384/hive-thrift-client).

Another colleague also presented a little about our setup during a Hadoop meetup last summer (http://skillsmatter.com/podcast/home/hadoop-in-context-1591). The numbers Andy mentioned will be a little out of date, but the talk does include screenshots of a few of the surrounding apps we built that connect to Hive and Hadoop (including a web-based Hive query tool + work queue).

I had a quick search through the mailing lists when we had connection problems, but I think most of it was discussed/resolved during a chat I had with Shevek from Karmasphere at a London pub following a Hadoop meetup :)

If you're interested, I've posted a gist (https://gist.github.com/953926) that contains our HAProxy config; clients connect to port 10000 and are balanced across ports 10001 to 10005 on 2 servers (so actually 10 backend servers).

I'd be happy to talk more about our experience; feel free to ping me an email off list if you'd like.
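
The port arithmetic above can be sketched as follows. The host names `hive-a` and `hive-b` are made up, and round-robin is only an assumption for illustration: the thread does not say which balancing algorithm the HAProxy config in the gist uses.

```python
from itertools import cycle, product


def build_backends(hosts, first_port=10001, last_port=10005):
    """List every (host, port) pair: one HiveServer instance per port per host."""
    ports = range(first_port, last_port + 1)
    return list(product(hosts, ports))


def assign_round_robin(backends, num_connections):
    """Hand out backends in rotation, one per incoming client connection."""
    rotation = cycle(backends)
    return [next(rotation) for _ in range(num_connections)]


# Hypothetical host names; 2 servers x 5 ports gives 10 backends.
backends = build_backends(["hive-a", "hive-b"])
assignments = assign_round_robin(backends, 12)
```

With 10 backends, the eleventh client connection wraps around to the first backend again.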
-
Re: HIVE Server multiple instances
Paul Ingles 2011-05-04, 11:48
For future reference I've posted a little more about our setup here:
http://oobaloo.co.uk/multiple-connections-with-hive
-
Re: HIVE Server multiple instances
Marcos Ortiz 2011-05-04, 13:12
Wow, good piece of information. Thanks for sharing it.

Marcos Luís Ortíz Valmaseda
Software Engineer (Large-Scaled Distributed Systems)
University of Information Sciences, La Habana, Cuba
Linux User # 418229
http://about.me/marcosortiz
-
Re: HIVE Server multiple instances
V.Senthil Kumar 2011-05-04, 18:05
This is great info. Thanks a lot for sharing :)