Hive >> mail # user >> How to run multiple Hive queries in parallel


Re: How to run multiple Hive queries in parallel
Bejoy is right. I just want to say explicitly that the scheduler
configuration is orthogonal to the use of Hive (i.e., the same problem
arises with Pig or standard MapReduce jobs).

Regards

Bertrand

PS: There is also the capacity scheduler.
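
As a rough illustration of the scheduler change discussed in this thread, the
sketch below renders the properties that would select the fair scheduler. It
assumes a Hadoop 1.x (MRv1) JobTracker, where the scheduler class is set through
mapred.jobtracker.taskScheduler in mapred-site.xml. The helper
render_properties is hypothetical, and how the resulting properties are applied
on a given cluster (EMR included) is not covered here.

```python
import xml.etree.ElementTree as ET

# Hadoop 1.x (MRv1) property names. Treat the chosen values as assumptions
# to check against the Hadoop version actually running on the cluster.
SCHEDULER_PROPS = {
    # Replace the default FIFO scheduler with the fair scheduler.
    "mapred.jobtracker.taskScheduler": "org.apache.hadoop.mapred.FairScheduler",
    # Put each job in a pool named after the submitting user, so that
    # separate user accounts end up in separate pools.
    "mapred.fairscheduler.poolnameproperty": "user.name",
}


def render_properties(props):
    """Return a <configuration> XML string for the given name/value pairs."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(SCHEDULER_PROPS))
```

With pools keyed on user.name, users who all log in with the same credentials
would still share a single pool, which is why separate accounts matter. For the
capacity scheduler, the class would instead be
org.apache.hadoop.mapred.CapacityTaskScheduler, with queues defined separately.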

On Mon, Oct 22, 2012 at 2:18 PM, Bejoy KS <[EMAIL PROTECTED]> wrote:

> Hi
>
> Are your Hive queries waiting even though there are task slots
> available on your cluster?
>
> If task slots are getting exhausted and you need parallelism, then
> you may want to use the fair scheduler with a separate user account
> for each user, so that each user gets a fair share of task slots.
>
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> From: Chunky Gupta <[EMAIL PROTECTED]>
> Date: Mon, 22 Oct 2012 17:27:45 +0530
> To: <[EMAIL PROTECTED]>
> ReplyTo: [EMAIL PROTECTED]
> Subject: How to run multiple Hive queries in parallel
>
> Hi,
>
> I have one name node machine with 4 slave machines under it to run
> the jobs.
>
> Users run queries as follows:
> - They ssh into the name node machine
> - They start Hive and submit their queries
>
> Currently, multiple users log in with the same credentials and submit
> queries.
>
> Whenever 2 or more users try to run queries at the same time from different
> Hive consoles, only one query runs at a time; the next query starts
> executing only when the previous one has finished, and so on.
>
> In this scenario, if a large query is submitted first, then
> all the other queries have to wait for it to complete.
>
> I want to run multiple queries at the same time. Is there any way, or any
> configuration parameter, to do this?
>
> PS: The data is in S3 and I am running Hive on AWS EMR infrastructure in
> interactive mode.
>
> Thank You,
> Chunky.
>
>
--
Bertrand Dechoux