MapReduce, mail # dev - Is it possible to skip the HTTP map output fetch by feeding the map output file and index file directly to the reducer?


Ling Kun 2013-02-28, 08:57
Re: Is it possible to skip the HTTP map output fetch by feeding the map output file and index file directly to the reducer?
Ling Kun 2013-02-28, 09:46
After searching the Hadoop mailing list again, I found this thread, which
tries to optimize Hadoop on Lustre by using hard links instead of HTTP (
http://search-hadoop.com/m/JkHSa17oHp12 ).
 Any other suggestions?
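
For reference, the direct-read idea from the quoted message below can be
sketched roughly as follows. This is only a minimal illustration, not Hadoop
code: it assumes every map output file and its index file are visible to the
reducer on the shared file system, and that the index holds one record of
three big-endian 64-bit integers per partition (start offset, raw length,
on-disk length). Any trailing checksum, the record format inside a segment,
compression, and the merge step are all left out. The function names are
hypothetical.

```python
import struct

# Assumed index layout: one record per reduce partition, three big-endian
# signed 64-bit integers (start offset, raw length, on-disk length).
RECORD = struct.Struct(">qqq")


def read_index_record(index_path, reduce_id):
    """Return (start_offset, raw_length, part_length) for one partition."""
    with open(index_path, "rb") as f:
        f.seek(reduce_id * RECORD.size)
        data = f.read(RECORD.size)
    if len(data) != RECORD.size:
        raise ValueError("no index record for partition %d" % reduce_id)
    return RECORD.unpack(data)


def read_partition(map_output_path, index_path, reduce_id):
    """Read the raw bytes of one reducer's segment straight from the file."""
    start, _raw_len, part_len = read_index_record(index_path, reduce_id)
    with open(map_output_path, "rb") as f:
        f.seek(start)
        return f.read(part_len)


def fetch_all(map_outputs, reduce_id):
    """Collect this reducer's segment from every (output, index) path pair.

    Merging the sorted segments is intentionally not shown here.
    """
    return [read_partition(out, idx, reduce_id) for out, idx in map_outputs]
```

With this layout each map task still writes a single output file plus one
index file, so the inode count grows with the number of map tasks rather than
with maps times reduces.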

Thanks all

yours,
Ling Kun
On Thu, Feb 28, 2013 at 4:57 PM, Ling Kun <[EMAIL PROTECTED]> wrote:

> Dear Arun C Murthy, Pavan Kulkarni and all.
>      Hello!
>      I am currently working on optimizing a Hadoop cluster based on Lustre FS.
> According to the TeraSort benchmark, the remote map output copy seems to
> take a large part of the total runtime.
>
>
>    After searching, I saw your discussion from half a year ago (
> http://search-hadoop.com/m/jj3y46KUwC1 ).
>
>      I am writing to ask whether we can make each reducer directly read
> its part of each map output file based on the index file, and merge them
> together, instead of making each map task generate a separate output file
> for each reduce task.
>
>     In this way, it seems that not too many inodes are needed.
>
>
> @Pavan Kulkarni: no email was sent by you after Sep. 2012. Could you please
> kindly share some experience on how to optimize for a file system like
> Lustre?
>
>   Does anyone have similar work experience?
>
>
>   Any comments and replies are welcome and appreciated!
>
> yours,
> Ling Kun.
> --
> http://www.lingcc.com
>

--
http://www.lingcc.com