

Re: Mapper basic question
Thanks All!
 On 11 Jul 2012 19:07, "Bejoy KS" <[EMAIL PROTECTED]> wrote:

> Hi Manoj
>
> Block size is an HDFS storage-level setting, whereas split size is the
> amount of data processed by each mapper in a MapReduce job (one split is
> the data processed by one mapper). One or more HDFS blocks can make up a
> split. Splits are determined by the InputFormat together with the min and
> max split size properties.
>
> As Arun mentioned, use CombineFileInputFormat and adjust the min and max
> split size properties to control/limit the number of mappers.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> From: Manoj Babu <[EMAIL PROTECTED]>
> Date: Wed, 11 Jul 2012 18:17:41 +0530
> To: <[EMAIL PROTECTED]>
> ReplyTo: [EMAIL PROTECTED]
> Subject: Re: Mapper basic question
>
> Hi Tariq/Arun,
>
> The number of blocks (splits) = total file size / HDFS block size *
> replication value.
> The number of splits is again nothing but the number of blocks here.
>
> Other than increasing the block size (input splits), is it possible to
> limit the number of mappers?
>
>
> Cheers!
> Manoj.
>
>
>
> On Wed, Jul 11, 2012 at 6:06 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>
>> Take a look at CombineFileInputFormat - this will create 'meta splits'
>> which include multiple small splits, thus reducing the number of maps
>> that are run.
>>
>> Arun
>>
>> On Jul 11, 2012, at 5:29 AM, Manoj Babu wrote:
>>
>> Hi,
>>
>> The number of mappers depends on the number of blocks. Is it possible to
>> limit the number of mappers without increasing the HDFS block size?
>>
>> Thanks in advance.
>>
>> Cheers!
>> Manoj.
>>
>>
>>  --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>
>
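
For anyone reading this thread later, the arithmetic behind the answers above can be sketched in a few lines of Python. This is an illustration only, not Hadoop code, and every function name in it is hypothetical. It assumes the split size is chosen as max(min size, min(max size, block size)), which as far as I know is what the newer FileInputFormat API does, and it ignores the small slop Hadoop allows on the last split. The packing function is a plain greedy simplification of the 'meta split' idea. The real CombineFileInputFormat also takes node and rack locality into account. Note that the replication factor does not appear anywhere: splits are computed from the logical file length, so replicas do not add mappers.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Split size bounded by the min/max split size properties (assumed formula)."""
    return max(min_size, min(max_size, block_size))


def estimate_mappers(file_size, block_size, min_size=1, max_size=2**63 - 1):
    """Rough mapper count for ONE splittable file: ceil(file_size / split_size)."""
    split_size = compute_split_size(block_size, min_size, max_size)
    return -(-file_size // split_size)  # integer ceiling division


def pack_small_files(file_sizes, max_split_size):
    """Greedily group small files into combined splits of at most max_split_size.

    A simplified stand-in for what CombineFileInputFormat does; a file larger
    than max_split_size simply ends up in a split of its own here.
    """
    splits, current, current_size = [], [], 0
    for size in file_sizes:
        if current and current_size + size > max_split_size:
            splits.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        splits.append(current)
    return splits


if __name__ == "__main__":
    # 1 GB file, 64 MB blocks, default min/max: one mapper per block.
    print(estimate_mappers(1024 * MB, 64 * MB))
    # Raising the min split size to 256 MB cuts the mapper count
    # without touching the HDFS block size.
    print(estimate_mappers(1024 * MB, 64 * MB, min_size=256 * MB))
    # Five 10 MB files packed into combined splits of at most 25 MB.
    print(len(pack_small_files([10 * MB] * 5, 25 * MB)))
```

In a Hadoop 1.x job the two bounds are normally set through the mapred.min.split.size and mapred.max.split.size properties. Check the defaults shipped with your own version before relying on these names.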