Hadoop >> mail # user >> Do I have to sort?


Re: Do I have to sort?
John,

That sounds very interesting, and I may implement such a workflow. But can
I write back to HDFS in the mapper? In the reducer it is a standard
context.write(), but the mapper has a different context.

Thank you,
Mark

On Mon, Jun 18, 2012 at 9:24 AM, John Armstrong <[EMAIL PROTECTED]> wrote:

> On 06/18/2012 10:19 AM, Mark Kerzner wrote:
>
>> If only reducers could be told to start their work on the first
>> maps that they see, my processing would begin to show results much
>> earlier, before all the mappers are done.
>>
>
> The sort/shuffle phase isn't just about ordering the keys; it's about
> collecting all the results of the map phase that share a key together for
> the reducers to work on.  If your reducer can operate on mapper outputs
> independently of each other, then it sounds like it's really another mapper
> and should be either factored into the mapper or rewritten as a mapper on
> its own and both mappers thrown into the ChainMapper (if you're using the
> older API).
>
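
To make the distinction above concrete, here is a minimal plain-Java sketch. It does not use the Hadoop API: shuffle(), sumPerKey(), formatEach() and the other names are hypothetical stand-ins invented for illustration. It contrasts a reduce step that needs every value for a key (a sum) with per-record work that gives the same results whether or not the records were grouped first, which is the case where the "reducer" is really another mapper.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: none of these names come from Hadoop.
class ShuffleSketch {

    // Helper to build one (key, value) pair of simulated map output.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Small made-up sample of map output.
    static List<Map.Entry<String, Integer>> sample() {
        return Arrays.asList(pair("b", 1), pair("a", 2), pair("b", 3));
    }

    // Stand-in for sort/shuffle: collect all values that share a key,
    // with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : mapOutput) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    // A reduce step that genuinely needs the grouping: it cannot emit the
    // sum for a key until it has seen every value for that key.
    static Map<String, Integer> sumPerKey(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> sums = new TreeMap<>();
        shuffle(mapOutput).forEach((key, values) ->
                sums.put(key, values.stream().mapToInt(Integer::intValue).sum()));
        return sums;
    }

    // Per-record work done directly on map output, one pair at a time.
    static List<String> formatEach(List<Map.Entry<String, Integer>> mapOutput) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> kv : mapOutput) {
            out.add(kv.getKey() + "=" + (kv.getValue() * 2));
        }
        return out;
    }

    // The same per-record work written as a "reducer" after the shuffle.
    // Only the order of the results changes, not their content.
    static List<String> formatEachAfterShuffle(List<Map.Entry<String, Integer>> mapOutput) {
        List<String> out = new ArrayList<>();
        shuffle(mapOutput).forEach((key, values) -> {
            for (int value : values) {
                out.add(key + "=" + (value * 2));
            }
        });
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> in = sample();
        System.out.println(sumPerKey(in));              // prints {a=2, b=4}
        System.out.println(formatEach(in));             // prints [b=2, a=4, b=6]
        System.out.println(formatEachAfterShuffle(in)); // prints [a=4, b=2, b=6]
    }
}
```

On the question of writing from the mapper: in Hadoop itself, per-record work like formatEach() can run as a map-only job by setting the number of reduce tasks to zero (for example Job.setNumReduceTasks(0) in the newer API). With no reducers, there is no sort/shuffle, and what the mapper passes to context.write() goes through the job's OutputFormat directly to the output directory.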