Re: HiveServer can not handle concurrent requests from more than one client?
Hi Bertrand,

> According to the proposal for HiveServer2, the current Hive server provides
> no assurance about "session state in between calls".
> If that was all, it is something that can be lived with. It only means
> that for a JDBC client, all requests should be conceived as isolated.
>

In the HiveServer Thrift API Execute() and Fetch() are two separate calls
and require two separate RPCs. In between these calls HiveServer has to
maintain session state so that when the Fetch() call is made it knows which
result set to look at. The current HiveServer Thrift API assumes that
Thrift will consistently map the same physical connection to the same
Thrift worker thread, and consequently it stores the session state in a
thread local variable. Unfortunately, this assumption is false. It's
possible to live with this limitation if you're ok with sometimes fetching
other people's result sets instead of your own.
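
The failure mode described above can be reproduced in a few lines. The following is only a minimal sketch, not HiveServer code: the execute() and fetch() functions, the table names, and the two single-thread pools standing in for Thrift worker threads are all hypothetical. It assumes nothing more than session state kept in a thread-local variable and consecutive calls from one client landing on different worker threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Per-thread storage, standing in for the thread-local session state.
_session = threading.local()


def execute(query):
    """Hypothetical Execute(): remember the 'result set' on the current thread."""
    _session.result = f"rows for: {query}"


def fetch():
    """Hypothetical Fetch(): return whatever the current thread stored last."""
    return getattr(_session, "result", None)


def run_demo():
    # Each single-thread pool models one server worker thread.
    worker_1 = ThreadPoolExecutor(max_workers=1)
    worker_2 = ThreadPoolExecutor(max_workers=1)
    try:
        # Alice's Execute() is served by worker 1, Bob's by worker 2.
        worker_1.submit(execute, "SELECT * FROM alice_table").result()
        worker_2.submit(execute, "SELECT * FROM bob_table").result()
        # The follow-up Fetch() calls are served by the *other* worker,
        # so each client reads the state left behind by the other one.
        alice_rows = worker_2.submit(fetch).result()
        bob_rows = worker_1.submit(fetch).result()
    finally:
        worker_1.shutdown()
        worker_2.shutdown()
    return alice_rows, bob_rows


if __name__ == "__main__":
    alice_rows, bob_rows = run_demo()
    print("Alice fetched:", alice_rows)
    print("Bob fetched:", bob_rows)
```

In this sketch Alice's fetch returns Bob's rows and vice versa, because the state follows the worker thread rather than the client connection.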
> The page of the Hive Server (1) says "HiveServer can not handle
> concurrent requests from more than one client."
> According to the JIRA, one may run into issues when multiple users are
> running it. Is that true regardless of the configuration?
> It should not be interpreted as "query will be executed one after the
> other", like Ranjiht said?
>

Yes, this is true regardless of the configuration. Ranjiht's statement is
incorrect.
> Eg what would be the impact of hive.exec.parallel or
> hive.support.concurrency?
>

These two configuration properties are actually completely orthogonal to
the HiveServer multi-client issue, though it's hard to know that since the
configuration property names were very poorly chosen. hive.exec.parallel
controls whether or not the MR jobs in the query plan DAG are executed
in parallel on the cluster (https://issues.apache.org/jira/browse/HIVE-549).
hive.support.concurrency controls whether or not Hive supports
coarse-grained locks on tables and partitions (see
https://cwiki.apache.org/confluence/display/Hive/Locking).
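
To make the hive.exec.parallel distinction concrete, here is a rough sketch of the idea only. It is not Hive's scheduler: the schedule() function, the stage names, and the dependency map are hypothetical. It assumes a query plan expressed as a DAG of stages and shows how independent stages can be launched together when a parallel flag is on, versus one at a time when it is off. None of this touches how many clients the server can serve.

```python
def schedule(deps, parallel):
    """Group DAG stages into batches; stages in one batch are launched together.

    deps maps each stage name to the stages it depends on (hypothetical names).
    """
    remaining = {stage: set(parents) for stage, parents in deps.items()}
    done = set()
    batches = []
    while remaining:
        # A stage is ready once all of its parents have finished.
        ready = sorted(s for s, parents in remaining.items() if parents <= done)
        if not ready:
            raise ValueError("cycle in stage dependencies")
        # Parallel mode launches every ready stage; serial mode launches one.
        batch = ready if parallel else ready[:1]
        batches.append(batch)
        for stage in batch:
            done.add(stage)
            del remaining[stage]
    return batches


# Two independent stages feeding a third one.
plan = {"stage1": [], "stage2": [], "stage3": ["stage1", "stage2"]}

if __name__ == "__main__":
    print(schedule(plan, parallel=True))
    print(schedule(plan, parallel=False))
```

With the flag on, stage1 and stage2 share a batch; with it off, every stage gets its own batch. Either way this is about jobs within a single query, which is why it has no bearing on the multi-client problem.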
> What would be the recommended way of providing Hive access to multiple
> users in a production environment which is tightly firewalled? SSH is
> not a viable solution in my context and the Hive web interface does not
> seem mature enough.
>

I recommend taking a look at the Beeswax web interface for Hive. More
details (including screenshots) are available here:
https://ccp.cloudera.com/display/CDHDOC/Beeswax

Thanks.

Carl