Raghunath, Ranjith 2012-05-23, 00:43
I have the parameter hive.map.aggr set to true. However, when I look at the counters for the map tasks, I see "Combine input records 0". I am interpreting this as a failure to perform the map-side aggregation. Is that accurate? Is this option not working in Hive 0.7.1?
Thanks, Ranjith
-
Re: Map side aggregations
Tucker, Matt 2012-05-23, 01:17
Try setting hive.auto.convert.join to true. The CLI will have a local task before it starts a map-reduce job on the cluster.
Matt
-
Re: Map side aggregations
Ranjith 2012-05-23, 01:44
Thanks, Matt. I am not performing a join, so does that matter? What does this local task do?
Thanks, Ranjith
-
Re: Map side aggregations
Philip Tromans 2012-05-23, 09:15
Hi Ranjith,
I haven't checked the code (so this might not be true), but I think that map-side aggregation uses its own hash map within the map phase to do the aggregation, instead of using a combiner, so you wouldn't expect to see any combine input records. Have a look for parameters like hive.groupby.mapaggr.checkinterval; the associated documentation will explain how it all works.
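To make the idea concrete, here is a rough Python sketch of in-mapper hash aggregation for a simple count-per-key query. It is only an illustration of the general technique, not Hive's actual code: the function names and the max_entries flush limit are made up for the example and do not correspond to any Hive parameter.

```python
from collections import defaultdict


def map_with_hash_aggregation(rows, max_entries=2):
    """Yield (key, partial_count) pairs, aggregating inside the map task.

    max_entries is a hypothetical stand-in for a memory limit; when the
    hash map grows past it, the partial results are flushed downstream.
    """
    partial = defaultdict(int)  # in-memory hash map owned by the mapper
    for key in rows:
        partial[key] += 1
        if len(partial) > max_entries:
            yield from partial.items()  # flush partial aggregates
            partial.clear()
    yield from partial.items()  # emit whatever is left at end of input


def reduce_counts(pairs):
    """Merge the partial counts from the mappers into final totals."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


rows = ["a", "b", "a", "c", "a", "b"]
map_output = list(map_with_hash_aggregation(rows))
totals = reduce_counts(map_output)
print(len(rows), "input rows ->", len(map_output), "map output records")
print(totals)
```

No combiner function is ever called here: the reduction in record count happens inside the map function itself, which is why a combiner counter would stay at 0 even though the aggregation is taking place.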
Cheers,
Phil.
-
Re: Map side aggregations
Ranjith 2012-05-24, 00:51
Thanks, Philip.
Thanks, Ranjith