Bertrand Dechoux 2012-08-27, 16:00
Raghunath, Ranjith 2012-08-27, 16:03
Carl Steinbach 2012-08-27, 19:15
Bertrand Dechoux 2012-08-27, 21:24
Re: HiveServer can not handle concurrent requests from more than one client?
Carl Steinbach 2012-08-27, 22:04
Hi Bertrand,
> According to the proposal for HiveServer2, the current hive server provides
> no assurance about "session state in between calls".
> If that was all, it is something that can be lived with. It only means
> that for a JDBC client, all requests should be conceived as isolated.

In the HiveServer Thrift API, Execute() and Fetch() are two separate calls and require two separate RPCs. Between these calls HiveServer has to maintain session state so that when the Fetch() call is made it knows which result set to look at. The current HiveServer Thrift API assumes that Thrift will consistently map the same physical connection to the same Thrift worker thread, and consequently it stores the session state in a thread-local variable. Unfortunately, this assumption is false. It's possible to live with this limitation if you're OK with sometimes fetching other people's result sets instead of your own.

> The page of the Hive Server (1) says "HiveServer can not handle
> concurrent requests from more than one client."
> According to the jira, one may run into issues when multiple users are
> running it. Is that true regardless of the configuration?
> It should not be interpreted as "query will be executed one after the
> other", like Ranjith said?

Yes, this is true regardless of the configuration. Ranjith's statement is incorrect.

> Eg what would be the impact of hive.exec.parallel or
> hive.support.concurrency?

These two configuration properties are completely orthogonal to the HiveServer multi-client issue, though that is hard to know because the property names were very poorly chosen. hive.exec.parallel controls whether the MR jobs in the query plan DAG are executed in parallel on the cluster (https://issues.apache.org/jira/browse/HIVE-549). hive.support.concurrency controls whether Hive supports coarse-grained locks on tables and partitions (see https://cwiki.apache.org/confluence/display/Hive/Locking).
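
To make the thread-local problem described above concrete, here is a minimal Python sketch. It is not HiveServer's code: the names execute, fetch, Worker and demo are hypothetical, and the two workers stand in for Thrift worker threads. It only illustrates what can happen when session state is kept in a thread-local variable and a client's second call is served by a different thread than its first.

```python
import queue
import threading

# Hypothetical stand-in for per-thread session state on the server.
_session = threading.local()


def execute(query):
    """Run a 'query' and stash its result set in thread-local storage."""
    _session.result = f"rows for: {query}"


def fetch():
    """Return whatever result set the current thread stored last."""
    return getattr(_session, "result", None)


class Worker:
    """One worker thread that runs submitted calls one at a time."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            func, args, reply = self._tasks.get()
            reply.put(func(*args))

    def call(self, func, *args):
        """Run func(*args) on this worker's thread and wait for the result."""
        reply = queue.Queue()
        self._tasks.put((func, args, reply))
        return reply.get()


def demo():
    w1, w2 = Worker(), Worker()
    w1.call(execute, "SELECT * FROM a")  # client A's Execute() lands on worker 1
    w2.call(execute, "SELECT * FROM b")  # client B's Execute() lands on worker 2
    # Client A's Fetch() happens to be served by worker 2 this time,
    # so it sees the result set stored by client B.
    return w2.call(fetch)


print(demo())  # rows for: SELECT * FROM b
```

Keying the session state on an explicit session handle passed with every call, rather than on the serving thread, is one way to avoid this kind of mix-up.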
> What would be the recommended way of providing Hive access to multiple
> users in a production environment which is tightly firewalled? SSH is
> not a viable solution in my context and the hive web interface does not
> seem mature enough.

I recommend taking a look at the Beeswax web interface for Hive. More details (including screenshots) are available here: https://ccp.cloudera.com/display/CDHDOC/Beeswax

Thanks.

Carl
Bertrand Dechoux 2012-08-27, 22:27
Ranjith 2012-08-28, 01:41