MapReduce >> mail # dev >> is it possible to ignore http Mapoutput get by feed mapoutput file and index file directly to reducer?


Re: is it possible to ignore http Mapoutput get by feed mapoutput file and index file directly to reducer?
After searching the Hadoop mailing list again, I found this thread, which
tries to optimize Hadoop on Lustre by using hard links instead of HTTP
( http://search-hadoop.com/m/JkHSa17oHp12 ).
Any other suggestions?

Thanks all

yours,
Ling Kun
On Thu, Feb 28, 2013 at 4:57 PM, Ling Kun <[EMAIL PROTECTED]> wrote:

> Dear Arun C Murthy, Pavan Kulkarni and all.
>      Hello!
>      I am currently working on optimizing a Hadoop cluster that runs on the
> Lustre FS. According to the TeraSort benchmark, the remote map output copy
> seems to take up a large part of the total runtime.
>
>
>    After searching, I found your discussion from half a year ago (
> http://search-hadoop.com/m/jj3y46KUwC1 ).
>
>      I am writing to ask whether we can make each reducer directly read
> its part of each map output file based on the index file, and merge the
> parts together, instead of making each map task generate a separate output
> for each reduce task.
>
>     This way, it seems that not too many inodes would be needed.
>
>
> @Pavan Kulkarni: no email was sent by you after Sep. 2012. Could you please
> kindly share some experience on how to optimize for a file system
> like Lustre?
>
>   Does anyone have similar work experience?
>
>
>   Any comments and replies are welcome and appreciated!
>
> yours,
> Ling Kun.
> --
> http://www.lingcc.com
>

--
http://www.lingcc.com
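
The idea in the quoted message (each reducer reads its own part of every map output file, located through the index file, straight from the shared file system) can be illustrated with a small sketch. This is not Hadoop code. It assumes a simplified index layout of one 24-byte record per reduce partition, holding three big-endian 64-bit integers (start offset, raw length, stored length), with no checksum and no compression. The function names write_map_output, read_partition and fetch_for_reducer are hypothetical and exist only for this example.

```python
import struct

# Assumed, simplified index layout (not the exact Hadoop on-disk format):
# one record per reduce partition = (start_offset, raw_length, part_length),
# each a big-endian signed 64-bit integer, 24 bytes in total.
RECORD = struct.Struct(">qqq")


def write_map_output(segments, data_path, index_path):
    """Write one segment per reduce partition back to back, plus the index.

    A single data file and a single index file per map task means only
    two inodes per map, regardless of the number of reducers.
    """
    offset = 0
    with open(data_path, "wb") as data, open(index_path, "wb") as index:
        for seg in segments:
            data.write(seg)
            # Uncompressed in this sketch, so raw and stored lengths match.
            index.write(RECORD.pack(offset, len(seg), len(seg)))
            offset += len(seg)


def read_partition(data_path, index_path, partition):
    """Return the bytes of one partition, located via its index record."""
    with open(index_path, "rb") as index:
        index.seek(partition * RECORD.size)
        raw = index.read(RECORD.size)
        if len(raw) != RECORD.size:
            raise IndexError("no index record for partition %d" % partition)
        start, _raw_len, part_len = RECORD.unpack(raw)
    with open(data_path, "rb") as data:
        data.seek(start)
        return data.read(part_len)


def fetch_for_reducer(map_outputs, partition):
    """Collect this reducer's segment from every map output.

    map_outputs is a list of (data_path, index_path) pairs that are
    assumed to be visible to the reducer on the shared file system,
    so no HTTP copy is involved.
    """
    return [read_partition(d, i, partition) for d, i in map_outputs]
```

A reducer for partition p would call fetch_for_reducer with the paths of all finished map outputs and then merge the returned segments. In a real job the segments are sorted runs that still need a merge, and the reads go through Lustre, so the actual gain depends on the file system rather than on this sketch.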