Hive >> mail # user >> HiveServer can not handle concurrent requests from more than one client?


Re: HiveServer can not handle concurrent requests from more than one client?
Thanks a lot.

> It's possible to live with this limitation if you're ok with sometimes
> fetching other people's result sets instead of your own.
I hadn't thought about that, only about the state of variables. That
consequence isn't nice. It won't really be a security issue in my context,
but it could be very inconvenient.
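
The failure mode described in Carl's explanation (result-set state kept in a
thread-local variable, with no guarantee that a client's Execute() and
Fetch() run on the same worker thread) can be illustrated with a small
sketch. This is not HiveServer code: the `execute` and `fetch` functions and
the two single-thread "workers" are hypothetical stand-ins, and the routing
of calls to workers is fixed by hand so the outcome is reproducible.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for per-thread session state on the server side.
_state = threading.local()


def execute(query):
    """Pretend to run a query and store its result set in thread-local state."""
    _state.result = f"rows for: {query}"


def fetch():
    """Return whatever result set the current worker thread last stored."""
    return getattr(_state, "result", None)


def demo_thread_local():
    # Each single-thread executor models one server worker thread.
    worker1 = ThreadPoolExecutor(max_workers=1)
    worker2 = ThreadPoolExecutor(max_workers=1)
    try:
        # Client A's Execute() is handled by worker 1.
        worker1.submit(execute, "client A query").result()
        # Client B's Execute() also lands on worker 1 and overwrites the slot.
        worker1.submit(execute, "client B query").result()
        # Client A's Fetch() on worker 1 now sees client B's result set.
        fetched_on_same_worker = worker1.submit(fetch).result()
        # Client A's Fetch() on worker 2 sees no result set at all.
        fetched_on_other_worker = worker2.submit(fetch).result()
        return fetched_on_same_worker, fetched_on_other_worker
    finally:
        worker1.shutdown()
        worker2.shutdown()
```

In this sketch client A either receives client B's rows or nothing, depending
only on which worker thread happens to serve the Fetch() call.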

> Yes, this is true regardless of the configuration. Ranjiht's statement is
> incorrect.
OK, so the only real solution, as proposed in the JIRA, is to 'serialize'
the calls through some kind of proxy, such as a queue. But that would go
against the multi-user goal and the relatively low latency that Hive could
provide.
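
A minimal sketch of what serializing the calls through a proxy could look
like, under stated assumptions: `FakeHiveServer` and `SerializingProxy` are
hypothetical names, the fake backend only mimics the one-result-slot-per-thread
behaviour, and a single-thread executor plays the role of the queue. Each
request runs Execute() and Fetch() back to back on the same thread, so no
other client can interleave.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeHiveServer:
    """Hypothetical backend with one result-set slot per worker thread."""

    def __init__(self):
        self._state = threading.local()

    def execute(self, query):
        self._state.result = f"rows for: {query}"

    def fetch(self):
        return getattr(self._state, "result", None)


class SerializingProxy:
    """Funnels every request through a FIFO queue drained by one thread."""

    def __init__(self, server):
        self._server = server
        self._queue = ThreadPoolExecutor(max_workers=1)

    def _execute_and_fetch(self, query):
        # Both calls run on the single queue thread, with nothing in between.
        self._server.execute(query)
        return self._server.fetch()

    def query(self, query):
        # Blocks the caller until its turn in the queue has completed.
        return self._queue.submit(self._execute_and_fetch, query).result()

    def close(self):
        self._queue.shutdown()


def demo_serialized(num_clients=8):
    proxy = SerializingProxy(FakeHiveServer())
    results = {}

    def client(name):
        results[name] = proxy.query(f"{name} query")

    threads = [
        threading.Thread(target=client, args=(f"client{i}",))
        for i in range(num_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    proxy.close()
    return results
```

Every client gets its own rows back, but only one query is in flight at a
time, which is exactly the latency and multi-user cost mentioned above.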

> These two configuration properties are actually completely orthogonal to
> the HiveServer multi-client issue
I thought so but wasn't sure. Thank you for the full explanation and for
making the difference clear.

> I recommend taking a look at the Beeswax web interface for Hive. More
> details (including screenshots) are available here:
> https://ccp.cloudera.com/display/CDHDOC/Beeswax

I know about that, but I am afraid it would mean changing the distribution
currently in use, which is no small thing. Still, I will consider that
solution more seriously. I take it from your answer that the backend is
different? I could not find much information about it and wasn't sure
whether the same issues apply to Beeswax.

Thanks a lot, again.

Bertrand

On Tue, Aug 28, 2012 at 12:04 AM, Carl Steinbach <[EMAIL PROTECTED]> wrote:

> Hi Bertrand,
>
>> According to the proposal for HiveServer2, the current Hive server
>> provides no guarantee about "session state in between calls".
>> If that was all, it is something that can be lived with. It only means
>> that for a JDBC client, all requests should be conceived as isolated.
>>
>
> In the HiveServer Thrift API Execute() and Fetch() are two separate calls
> and require two separate RPCs. In between these calls HiveServer has to
> maintain session state so that when the Fetch() call is made it knows which
> result set to look at. The current HiveServer Thrift API assumes that
> Thrift will consistently map the same physical connection to the same
> Thrift worker thread, and consequently it stores the session state in a
> thread local variable. Unfortunately, this assumption is false. It's
> possible to live with this limitation if you're ok with sometimes fetching
> other people's result sets instead of your own.
>
>
>> The page of the Hive Server (1) says "HiveServer can not handle
>> concurrent requests from more than one client."
>> According to the JIRA, one may run into issues when multiple users are
>> running it. Is that true regardless of the configuration?
>> Should it not be interpreted as "queries will be executed one after the
>> other", as Ranjiht said?
>>
>
> Yes, this is true regardless of the configuration. Ranjiht's statement is
> incorrect.
>
>
>> E.g., what would be the impact of hive.exec.parallel or
>> hive.support.concurrency?
>>
>
> These two configuration properties are actually completely orthogonal to
> the HiveServer multi-client issue, though it's hard to know that since the
> configuration property names were very poorly chosen. hive.exec.parallel
> controls whether or not the MR jobs in the query plan DAG are executed
> in parallel on the cluster (https://issues.apache.org/jira/browse/HIVE-549).
> hive.support.concurrency controls whether or not Hive supports
> coarse-grained locks on tables and partitions (see
> https://cwiki.apache.org/confluence/display/Hive/Locking).
>
>
>> What would be the recommended way of providing Hive access to multiple
>> users in a production environment which is tightly firewalled? SSH is
>> not a viable solution in my context and the Hive web interface does not
>> seem mature enough.
>>
>
> I recommend taking a look at the Beeswax web interface for Hive. More
> details (including screenshots) are available here:
> https://ccp.cloudera.com/display/CDHDOC/Beeswax
>
> Thanks.
>
> Carl
>
>
--
Bertrand Dechoux