Hadoop >> mail # user >> Quick Question: LineSplit or BlockSplit


Re: Quick Question: LineSplit or BlockSplit
Option (1) isn't the way that things normally work.  Besides, the map
function is called many times for each mapper that is constructed.
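
To illustrate the point above, here is a minimal sketch in plain Java. It
does not use the Hadoop API: ToyMapper, run, and linesPerSplit are
hypothetical names invented for this example. It only counts how many
mapper objects get constructed and how many times map() is called when the
same 1000-line document is split per line versus per block of 100 lines.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Hypothetical stand-in for a mapper task; this is NOT the Hadoop Mapper class.
    static class ToyMapper {
        int mapCalls = 0;

        // Called once per input line handed to this mapper.
        void map(String line) {
            mapCalls++;
        }
    }

    // Builds one ToyMapper per split of linesPerSplit lines and feeds it its lines.
    // Returns {number of mappers constructed, total number of map() calls}.
    static int[] run(List<String> lines, int linesPerSplit) {
        int mappers = 0;
        int calls = 0;
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            ToyMapper mapper = new ToyMapper();
            mappers++;
            int end = Math.min(start + linesPerSplit, lines.size());
            for (String line : lines.subList(start, end)) {
                mapper.map(line);
            }
            calls += mapper.mapCalls;
        }
        return new int[] {mappers, calls};
    }

    public static void main(String[] args) {
        List<String> doc = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            doc.add("line " + i);
        }
        int[] perLine = run(doc, 1);
        int[] perBlock = run(doc, 100);
        System.out.println("per line:  mappers=" + perLine[0] + " map calls=" + perLine[1]);
        System.out.println("per block: mappers=" + perBlock[0] + " map calls=" + perBlock[1]);
    }
}
```

In both cases map() runs 1000 times in total. What changes is the number of
mapper objects: 1000 for per-line splits versus 10 for blocks of 100 lines.
In a real job, each extra mapper would also carry its own task startup and
scheduling cost, which this sketch does not model.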

On Mon, Feb 7, 2011 at 3:38 PM, maha <[EMAIL PROTECTED]> wrote:

> Hi,
>
>  I would appreciate your thoughts on whether there is an effect on
> efficiency if:
>
>  1) Mappers were per line in a document
>
>  or
>
>  2) Mappers were per block of lines in a document.
>
>
>  The obvious difference I can see is that (1) has more mappers. Does
> that mean (1) will be slower because of scheduling time?
>
> Thank you,
> Maha
>