Hadoop >> mail # user >> Quick Question: LineSplit or BlockSplit


Re: Quick Question: LineSplit or BlockSplit
Option (1) isn't the way that things normally work.  Besides, a mapper's
map() method is called many times for each mapper that is constructed.
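
To make the point about construction versus map() calls concrete, here is a minimal sketch in plain Python. It is only a toy simulation, not Hadoop's API: `CountingMapper`, `run_split`, and `run_job` are hypothetical names made up for illustration, and it assumes one mapper is built per split with map() called once per line in that split.

```python
class CountingMapper:
    """Hypothetical stand-in for a mapper; this is not Hadoop's Mapper class."""

    def __init__(self):
        # Constructed once per split in this sketch.
        self.map_calls = 0

    def map(self, offset, line):
        # Called once per line; emits (offset, word count) as a toy output.
        self.map_calls += 1
        return (offset, len(line.split()))


def run_split(lines):
    """Construct one mapper for a split and call map() once per line."""
    mapper = CountingMapper()
    output = []
    offset = 0
    for line in lines:
        output.append(mapper.map(offset, line))
        offset += len(line) + 1  # +1 for the newline that ended the line
    return mapper, output


def run_job(lines, lines_per_split):
    """Cut the document into splits and return the mapper built for each one."""
    splits = [lines[i:i + lines_per_split]
              for i in range(0, len(lines), lines_per_split)]
    return [run_split(split)[0] for split in splits]


if __name__ == "__main__":
    document = ["a b", "c d e", "f", "g h", "i", "j k l m"]
    for size in (1, 3):
        mappers = run_job(document, size)
        print(size, "line(s) per split:", len(mappers), "mappers,",
              [m.map_calls for m in mappers], "map() calls each")
```

With one line per split the six-line document needs six mappers that each handle a single map() call; with three lines per split it needs two mappers that each handle three calls. The number of map() calls is the same either way, so the extra cost of option (1) in this sketch is the per-mapper setup.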

On Mon, Feb 7, 2011 at 3:38 PM, maha <[EMAIL PROTECTED]> wrote:

> Hi,
>
>  I would appreciate your thoughts on whether there is an
> effect on efficiency if:
>
>  1) Mappers were per line in a document
>
>  or
>
>  2) Mappers were per block of lines in a document.
>
>
>  The obvious difference I can see is that (1) has more mappers. Does
> that mean (1) will be slower because of scheduling time?
>
> Thank you,
> Maha
>