Sai Sai 2013-09-27, 05:12
Re: Input Split vs Task vs attempt vs computation
Answers inline.

Best Regards,
Sonal
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>
On Fri, Sep 27, 2013 at 10:42 AM, Sai Sai <[EMAIL PROTECTED]> wrote:

> Hi,
> I have a few questions I am trying to understand:
>
> 1. Is each input split the same as a record? (A record can be a single line or
> multiple lines.)
>

An InputSplit is a chunk of the input that is handled by a single map task. It will
generally contain multiple records. The RecordReader reads the split and provides the
key-value pairs to the map task. See
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/InputSplit.html
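
To make the split/record distinction concrete, here is a minimal sketch in plain Python. It is a simulation, not the Hadoop API: make_splits, read_records and the sample DATA are hypothetical names, and the rule "a line belongs to the split that contains its first byte" is a simplifying assumption rather than Hadoop's exact line-reader logic.

```python
from typing import Iterator, List, Tuple

# Hypothetical 5-line input file, held in memory for the illustration.
DATA = b"line one\nline two\nline three\nline four\nline five\n"


def make_splits(data: bytes, split_size: int) -> List[Tuple[int, int]]:
    """Cut the input into (start, length) byte ranges; each range is one split."""
    return [(start, min(split_size, len(data) - start))
            for start in range(0, len(data), split_size)]


def read_records(data: bytes, start: int, length: int) -> Iterator[Tuple[int, str]]:
    """Yield (byte_offset, line) for every line whose first byte lies in the split.

    Assumption: a line is owned by the split containing its first byte, so a
    line that straddles a boundary is read in full by the earlier split.
    """
    end = start + length
    pos = start
    # If the split begins in the middle of a line, skip ahead to the next line.
    if start > 0 and data[start - 1:start] != b"\n":
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1
    while pos < end and pos < len(data):
        nl = data.find(b"\n", pos)
        line_end = len(data) if nl == -1 else nl
        yield pos, data[pos:line_end].decode("utf-8")
        pos = line_end + 1


if __name__ == "__main__":
    # A small file fits in one split, which still carries five records.
    for split in make_splits(DATA, 64):
        records = list(read_records(DATA, *split))
        print(split, len(records), "records")
```

With a split size larger than the file there is one split and therefore one map task, but the reader still hands that task five separate key-value pairs (byte offset, line text).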

>
> 2. Is each Task a collection of a few computations or attempts?
>
> For example, if I have a small file with 5 lines:
> By default, each map computation is performed on 1 line,
> so in total 5 computations are done on 1 node.
>
> This means the JT will spawn 1 JVM for 1 TaskTracker on a node
> and another JVM for the map task, which will instantiate 5 map objects, 1 for
> each line.
>

I am not sure what you mean by 5 map objects. But yes, the mapper's map()
method will be invoked 5 times, once for each line.

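
As a rough illustration of "invoked 5 times", the sketch below uses plain Python rather than the Hadoop API. CountingMapper and run_task are hypothetical names; the setup / map-per-record / cleanup order is a simplified imitation of the lifecycle of Hadoop's Mapper class. One mapper object is created for the task, and its map() method is called once per record.

```python
from typing import Iterable, List, Tuple


class CountingMapper:
    """Hypothetical stand-in for a mapper; counts how often each hook runs."""

    def __init__(self) -> None:
        self.setup_calls = 0
        self.map_calls = 0
        self.cleanup_calls = 0

    def setup(self) -> None:
        self.setup_calls += 1

    def map(self, key: int, value: str) -> Tuple[str, int]:
        # Called once per record; emits (line, 1) as a toy output pair.
        self.map_calls += 1
        return value, 1

    def cleanup(self) -> None:
        self.cleanup_calls += 1


def run_task(mapper: CountingMapper,
             records: Iterable[Tuple[int, str]]) -> List[Tuple[str, int]]:
    """Drive one mapper object over all records of one split."""
    mapper.setup()
    output = [mapper.map(key, value) for key, value in records]
    mapper.cleanup()
    return output


if __name__ == "__main__":
    # Five (offset, line) records, as a line-oriented reader would supply them.
    records = [(0, "a"), (2, "b"), (4, "c"), (6, "d"), (8, "e")]
    mapper = CountingMapper()
    run_task(mapper, records)
    print(mapper.setup_calls, mapper.map_calls, mapper.cleanup_calls)
```

In this sketch a 5-line input produces a single mapper object with five map() calls, not five mapper objects.
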
> The MT JVM is called the task, which will have 5 attempts for each line.
> This means an attempt is the same as a computation.
>
> Please let me know if anything is incorrect.
> Thanks
> Sai
>
>