Hive >> mail # user >> Multi Table Inserts produces multiple jobs


Re: Multi Table Inserts produces multiple jobs
Hi Cristi,

In a multi-insert query the source_table is scanned only once, whereas if you run 2 separate queries it is scanned twice.

If you run 'explain extended' on the query, the plan will show you the flow of data through each stage.
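
As a rough sketch, this is how the statement could look for the query quoted below. The COUNT(1) columns are only a hypothetical stand-in for the field list that is elided in the original mail, so substitute your real select lists:

```sql
-- Sketch only: COUNT(1) is a placeholder assumption for the elided
-- "...fieldN" / "... fieldK" columns in the original query.
EXPLAIN EXTENDED
FROM source_table
INSERT OVERWRITE TABLE tablename1
  SELECT field1, field2, COUNT(1)
  GROUP BY field1, field2
INSERT OVERWRITE TABLE tablename2
  SELECT field1, field3, COUNT(1)
  GROUP BY field1, field3;
```

The STAGE DEPENDENCIES section of the output lists the stages and how they depend on each other, and STAGE PLANS shows the operators in each one, so you can check where source_table is read.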

You can find related information at http://www.slideshare.net/ragho/hive-user-meeting-august-2009-facebook (slides 51-53).

-Thiruvel

On Aug 24, 2010, at 9:18 PM, Cristi Cioriia wrote:

> Hi guys,
>
> I would like to use the Multi Insert feature of HIVE so that I could
> have fewer map-reduce jobs than running separate queries.
>
> I have some HIVE queries that use the Multi Insert feature as below:
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename1
> SELECT field1, field2 ...fieldN
> GROUP BY field1, field2
> INSERT OVERWRITE TABLE tablename2
> SELECT field1,  field3 ... fieldK
> GROUP BY field1, field3
>
> I was hoping that with this feature only 1 Map-Reduce job would be
> created, but what I found when running the query is that 2 jobs are
> created, just as if I had run 2 separate queries:
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename1
> SELECT field1, field2 ...fieldN
> GROUP BY field1, field2
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename2
> SELECT field1,  field3 ... fieldK
> GROUP BY field1, field3
>
> Is there any way that I can get only 1 MR job with the multi insert
> syntax?
>
> Thanks,
> Cristi