Manish Malhotra 2012-11-08, 07:28
Ashutosh Chauhan 2012-11-08, 17:39
Manish Malhotra 2012-11-11, 03:39
Re: Hive Meta Server (Thrift Server) Failover / Redundancy / Load Balancing
Ashutosh Chauhan 2012-11-13, 18:26
You can use an LB. The trouble you might have when deploying just an LB without
failover is that when a metastore server actually goes down, all of its
active connections will be dropped as well. But since most of the RPC
calls to the metastore are expected to complete fairly quickly, depending on
your workload you might be OK with this.
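To make the dropped-connection point concrete, here is a minimal Python sketch of a client-side wrapper that reopens the connection (through the LB) and retries when a call fails on a dropped connection. This is not the Hive MetastoreClient and not the HIVE-3400 patch; `ReconnectingClient`, `connect`, and the method name passed to `call` are hypothetical, and `connect` stands in for whatever opens a Thrift client against the LB address.

```python
import time


class ReconnectingClient:
    """Hypothetical wrapper: reopen the connection and retry when a call fails."""

    def __init__(self, connect, retries=3, delay_seconds=0.0):
        # `connect` is an assumed zero-argument factory returning a fresh client,
        # e.g. one that opens a new connection to the load balancer address.
        self._connect = connect
        self._retries = retries
        self._delay_seconds = delay_seconds
        self._client = None

    def call(self, method_name, *args):
        last_error = None
        # One initial attempt plus up to `retries` further attempts.
        for _ in range(self._retries + 1):
            try:
                if self._client is None:
                    self._client = self._connect()
                return getattr(self._client, method_name)(*args)
            except OSError as error:
                # Covers ConnectionError and socket errors: discard the dead
                # client so the next attempt opens a new connection.
                last_error = error
                self._client = None
                time.sleep(self._delay_seconds)
        raise last_error
```

A caller would use something like `client.call("get_all_databases")`, where the method name is whatever the wrapped client exposes. Blind retries like this are only safe for calls that can be repeated without side effects; a call that was dropped after the server applied it may need different handling.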
For the secure metastore, the patch is ready to use. You may want to try it out.
Hope it helps,
On Sat, Nov 10, 2012 at 7:39 PM, Manish Malhotra <
[EMAIL PROTECTED]> wrote:
> Thanks, Ashutosh, for the quick reply.
> 1. For the non-secure MetaServer: I'm wondering whether, if I add a load
> balancer like HAProxy in between, we no longer need to handle failover on
> the Thrift client side.
> So, if I put an LB between the Thrift client and the MetaServer, should it be
> good to use?
> Maybe I'm missing something, but I'll check out the code and see what the
> status of the patch is and what additional work is required.
> 2. For the secure MetaServer: I need to dig further into the code, and then
> will ask more questions if required.
> I believe the patch for storing tokens in the DB is available for review but
> is not yet ready to use. Or can I try out that patch when using the
> secured one?
> Again, thanks for your help!
> On Thu, Nov 8, 2012 at 9:39 AM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
>> Hi Manish,
>> Your understanding is mostly correct, though there is one additional bit.
>> MetastoreClient in its current incarnation doesn't automatically reconnect
>> if the connection gets dropped for a connected session. As a result, it won't
>> fail over active connections. New connections would be fine. Fortunately,
>> there is work in progress for this at:
>> https://issues.apache.org/jira/browse/HIVE-3400 If you want to help
>> out, you should help there.
>> For the secure case, as you pointed out, you additionally need ZooKeeper to
>> store security tokens. So, you need to bring up a ZK cluster. But if you
>> think dedicating 3 nodes to ZK for the metastore is an overhead, then you would
>> need https://issues.apache.org/jira/browse/HIVE-3255 With that patch,
>> tokens are stored in the same backend DB, so there would be no need to bring
>> up a ZK cluster.
>> Hopefully, both of these patches get in for the 0.10 release.
>> On Wed, Nov 7, 2012 at 11:28 PM, Manish Malhotra <
>> [EMAIL PROTECTED]> wrote:
>>> I need to build a failover/LB solution for Hive services.
>>> The MySQL DB is fine; that can be worked out.
>>> But for the Hive Metastore service, can I simply put a load balancer like
>>> HAProxy between the client and the server to achieve this?
>>> Thrift servers are stateless by default, but I'm not sure about the Hive one.
>>> I have read very few comments on this problem.
>>> Similar approach blogged at :
>>> Very important, from an HCatalog mailing thread:
>>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201109.mbox/%[EMAIL PROTECTED]%3E
>>> As per this mailing thread, if security is used in the Hive Thrift meta
>>> server, then more modification is needed in the server, as it maintains the
>>> user's token for that session (user connection).
>>> Please help me move forward on this problem, and please verify whether my
>>> understanding of the above 2 blogs / mails is correct.
>>> Is there any initial work done under HCatalog or Hive that I can look
>>> into and extend / patch?