Hive >> mail # user >> Map side aggregations


Re: Map side aggregations
Hi Ranjith,

I haven't checked the code (so this might not be true), but I think that
the map-side aggregation uses its own hash map within the map phase
to do the aggregation, instead of using a combiner, so you wouldn't expect
to see any combine input records. Have a look at parameters
like hive.groupby.mapaggr.checkinterval; the associated documentation
will explain how it all works.
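
As a rough illustration of the idea (a sketch only, not Hive's actual
code), in-mapper hash aggregation for a COUNT ... GROUP BY could look
something like the following. The function name, the check_interval and
min_reduction arguments, and the bail-out rule are all assumptions made
for the example; they only loosely mirror what the Hive parameters
control. The point is that the partial counts come straight out of the
mapper, so no combiner ever runs and the combine counters stay at zero.

```python
from collections import defaultdict


def map_side_count(keys, check_interval=4, min_reduction=0.5):
    """Yield (key, partial_count) pairs from a single mapper.

    Hypothetical sketch: partial counts are held in an in-memory hash map
    and emitted by the mapper itself, so no combiner is involved.
    """
    partial = defaultdict(int)  # in-memory hash map of partial aggregates
    hashing = True              # whether we are still aggregating in memory
    seen = 0                    # rows consumed while hashing

    for key in keys:
        if not hashing:
            # Aggregation was switched off: pass rows through unaggregated.
            yield key, 1
            continue

        partial[key] += 1
        seen += 1

        # Every check_interval rows, see whether hashing is paying off.
        # If distinct keys / rows seen is above min_reduction, the map is
        # barely shrinking the data, so flush it and stop hashing.
        if seen % check_interval == 0 and len(partial) / seen > min_reduction:
            yield from partial.items()
            partial.clear()
            hashing = False

    # End of input: emit whatever is still held in the hash map.
    yield from partial.items()


if __name__ == "__main__":
    for key, count in map_side_count(["a", "a", "b", "a", "b", "b", "a", "a"]):
        print(key, count)
```

The reducer would then sum the partial counts per key, exactly as it
would have summed combiner output.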

Cheers,

Phil.

On 23 May 2012 02:44, Ranjith <[EMAIL PROTECTED]> wrote:

> Thanks Matt. I am not performing a join so does that matter? What does
> this local task do?
>
> Thanks,
> Ranjith
>
> On May 22, 2012, at 8:17 PM, "Tucker, Matt" <[EMAIL PROTECTED]>
> wrote:
>
> Try setting hive.auto.convert.join to true.  The CLI will have a local
> task before it starts a map-reduce job on the cluster.
>
> Matt
>
>
>
> On May 22, 2012, at 8:43 PM, "Raghunath, Ranjith" <
> [EMAIL PROTECTED]> wrote:
>
> I have the parameter hive.map.aggr set to true. However, when I look at
> the counters associated with the map tasks I notice the following: “Combine
> input records 0”. I am interpreting this as a failure to perform the
> map-side aggregation. Is that accurate? Is this option not working in Hive
> 0.7.1?
>
> Thanks,
> Ranjith