Hadoop >> mail # user >> fundamental doubt


Re: fundamental doubt
Hello Jamal,

     For efficient processing, all the values associated with the same key
are sorted and sent to the same reducer. As a result, the reducer gets a key
and a list of values as its input. To me, your assumption seems correct.
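
Since the question mentions Python, the job is presumably run through Hadoop Streaming. In that case the reducer script does not receive a ready-made list. It reads "key<TAB>value" lines from standard input, already sorted by key, so all lines for one key arrive consecutively. Below is a minimal sketch of how the averaging could be done without building a dictionary. It assumes tab-separated lines and numeric values; the names parse and averages and the sample data are only illustrative, not taken from the original job.

```python
from itertools import groupby
from operator import itemgetter


def parse(lines, sep="\t"):
    """Yield (key, value) pairs from 'key<sep>value' lines, skipping blanks."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        key, value = line.split(sep, 1)
        yield key, float(value)


def averages(lines, sep="\t"):
    """Yield (key, mean) for each run of consecutive lines sharing a key.

    Assumes the lines are already sorted by key, as the shuffle/sort
    phase provides them to a streaming reducer.
    """
    for key, group in groupby(parse(lines, sep), key=itemgetter(0)):
        total = 0.0
        count = 0
        for _, value in group:
            total += value
            count += 1
        yield key, total / count


# Hypothetical sorted input, standing in for what the reducer would read.
sample = ["Key1\t1\n", "Key1\t3\n", "Key2\t2\n"]
for key, mean in averages(sample):
    print(f"{key}\t{mean}")
```

In an actual streaming reducer, sys.stdin would be passed to averages in place of the sample list. Because only a running total and count are kept per key, memory use stays constant no matter how many values a key has.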

Regards,
    Mohammad Tariq

On Thu, Nov 22, 2012 at 1:20 AM, jamal sasha <[EMAIL PROTECTED]> wrote:

> Hi,
> I guess I am asking a lot of fundamental questions, but I thank you guys for
> taking the time to explain my doubts.
> So I am able to write MapReduce jobs, but here is my doubt.
> As of now I am writing mappers which emit a key and a value.
> This key-value pair is then captured at the reducer end, and I process the
> key and value there.
> Let's say I want to calculate the average...
> Key1 value1
> Key2 value2
> Key1 value3
>
> So the output is something like
> Key1 average of value1 and value3
> Key2 average of value2 = value2
>
> Right now in the reducer I have to create a dictionary with the original
> keys as keys and a list as each value:
> data = defaultdict(list)  # I am a Python user
> But I thought that the
> mapper takes in the key-value pairs and outputs key: (v1, v2, ...), and the
> reducer takes in this key and list of values and returns
> key, new value.
>
> So why is the input of the reducer the plain output of the mapper and not the
> list of all the values for a particular key? Or did I misunderstand something?
> Am I making any sense?