Re: Multi Table Inserts produces multiple jobs
Thiruvel Thirumoolan 2010-08-24, 16:50
Hi Cristi,
The source_table is scanned only once in a multi-insert scenario, whereas if you have 2 separate queries it will be scanned twice. If you run 'explain extended' on the query, you can see the flow of data.

You can find related info at http://www.slideshare.net/ragho/hive-user-meeting-august-2009-facebook (slides 51-53).

-Thiruvel

On Aug 24, 2010, at 9:18 PM, Cristi Cioriia wrote:

> Hi guys,
>
> I would like to use the Multi Insert feature of HIVE so that I could
> have fewer map-reduce jobs than running separate queries.
>
> I have some HIVE queries that use the Multi Insert feature as below:
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename1
> SELECT field1, field2 ...fieldN
> GROUP BY field1, field2
> INSERT OVERWRITE TABLE tablename2
> SELECT field1, field3 ... fieldK
> GROUP BY field1, field3
>
> I was hoping that by using this feature only 1 Map-Reduce job will be
> created, but what I found out when running the query is that 2 jobs are
> created, just as if I would have ran 2 separate queries:
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename1
> SELECT field1, field2 ...fieldN
> GROUP BY field1, field2
>
> FROM source_table
> INSERT OVERWRITE TABLE tablename1
> SELECT field1, field3 ... fieldK
> GROUP BY field1, field3
>
> Is there any way that I can get only 1 MR job with the multi insert
> syntax?
>
> Thanks,
> Cristi
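
The 'explain extended' check suggested in the reply could look roughly like the sketch below. It reuses the placeholder table and column names from the original question; the elided column lists (field1 ... fieldN, field1 ... fieldK) are not filled in, so only the grouped columns are selected here, which is an assumption made to keep the statement complete.

```sql
-- Sketch only: prefix the multi-insert statement with EXPLAIN EXTENDED
-- to print the query plan instead of running the query.
-- source_table, tablename1, tablename2 and the field names are the
-- placeholders from the question, not real objects.
EXPLAIN EXTENDED
FROM source_table
INSERT OVERWRITE TABLE tablename1
  SELECT field1, field2
  GROUP BY field1, field2
INSERT OVERWRITE TABLE tablename2
  SELECT field1, field3
  GROUP BY field1, field3;
```

The plan output lists the stages and their dependencies, so it should show how many map-reduce stages are generated and where source_table is read. Running the same command on each of the two separate queries gives a basis for comparing the number of scans.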