Hive >> mail # user >> Distributed execution for UNION ALL


Re: Distributed execution for UNION ALL
Hi Alexander
      Are you seeing the single-node execution issue only for this particular query that involves UNION ALL, or is it the same across all Hive queries?

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Alexander Goryunov <[EMAIL PROTECTED]>
Date: Fri, 4 May 2012 17:23:24
To: <[EMAIL PROTECTED]>; Bejoy Ks<[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Re: Distributed execution for UNION ALL

Hi Bejoy KS,

Thanks for your answer.

From job.xml:
*mapred.job.tracker* = full.namenode.hostname:8021
On Fri, May 4, 2012 at 5:07 PM, Bejoy Ks <[EMAIL PROTECTED]> wrote:

> Hi Alexander
>       Since the tasks are executing only on the local node, it looks like the
> Hive MapReduce jobs are running locally. What is the value of
> *mapred.job.tracker* in your job.xml or mapred-site.xml?
>
> Regards
> Bejoy KS
>
>   ------------------------------
> *From:* Alexander Goryunov <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]
> *Sent:* Friday, May 4, 2012 6:22 PM
> *Subject:* Distributed execution for UNION ALL
>
> Hello,
>
> I have a query like
>
> SELECT * FROM (
> SELECT 1, concat(1_timestamp, ', ', 2_account_id )
> FROM table_1 WHERE 2_account_id = 1132576 LIMIT 1000000000
> UNION ALL
> SELECT 2, concat(1_timestamp, ', ', 2_account_id )
> FROM table_2 WHERE 2_account_id = 1132576 LIMIT 1000000000
> UNION ALL
> SELECT 3, concat(1_timestamp, ', ', 2_account_id )
> FROM table_3 WHERE 2_account_id = 1132576 LIMIT 1000000000
> UNION ALL
> .... // some hundred tables here
> ) res;
>
> Parallel jobs are set to true in the Hive config, and Hive creates the
> maximum number of MapReduce map tasks on the requested node.
>
> What should be done to distribute those jobs over all the cluster nodes?
>
> Thanks.
>
>
>
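
For anyone following the thread, the settings discussed above can be inspected from the Hive CLI itself. This is a minimal sketch and not something posted by the participants: `SET` followed by a property name and no value prints the current setting. The thread only says "parallel jobs set to true", so `hive.exec.parallel` is assumed to be the property meant, and `hive.exec.mode.local.auto` is included as an assumption about one other setting that can lead to local execution.

```sql
-- Print the JobTracker address this Hive session submits jobs to.
-- A value of "local" means jobs run in-process on the client machine.
SET mapred.job.tracker;

-- Assumed to be the "parallel jobs" setting mentioned in the thread.
SET hive.exec.parallel;

-- Assumption: when true, Hive may pick local mode automatically for small inputs.
SET hive.exec.mode.local.auto;
```

If `mapred.job.tracker` prints a host:port pair, as in the job.xml value quoted above, that suggests the session is configured to submit jobs to the cluster rather than run them in local mode, so the cause of the single-node behaviour would likely lie elsewhere.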
