Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hadoop >> mail # user >> fundamental doubt


+
jamal sasha 2012-11-21, 19:50
Copy link to this message
-
Re: fundamental doubt
Hi Jamal

It is performed at a frame work level map emits key value pairs and the framework collects and groups all the values corresponding to a key from all the map tasks. Now the reducer takes the input as a key and a collection of values only. The reduce method signature defines it.
Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: jamal sasha <[EMAIL PROTECTED]>
Date: Wed, 21 Nov 2012 14:50:51
To: [EMAIL PROTECTED]<[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: fundamental doubt

Hi..
I guess i am asking alot of fundamental questions but i thank you guys for
taking out time to explain my doubts.
So i am able to write map reduce jobs but here is my mydoubt
As of now i am writing mappers which emit key and a value
This key value is then captured at reducer end and then i process the key
and value there.
Let's say i want to calculate the average...
Key1 value1
Key2 value 2
Key 1 value 3

So the output is something like
Key1 average of value  1 and value 3
Key2 average 2 = value 2

Right now in reducer i have to create a dictionary with key as original
keys and value is a list.
Data = defaultdict(list) == // python usrr
But i thought that
Mapper takes in the key value pairs and outputs key: ( v1,v2....)and
Reducer takes in this key and list of values and returns
Key , new value..

So why is the input of reducer the simple output of mapper and not the list
of all the values to a particular key or did i  understood something.
Am i making any sense ??

+
jamal sasha 2012-11-21, 20:24
+
Mohammad Tariq 2012-11-21, 19:58
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB