

Re: Input Split vs Task vs attempt vs computation
Hi
I have a few questions I am trying to understand:

1. Is each input split the same as a record? (A record can be a single line or multiple lines.)

2. Is each task a collection of a few computations or attempts?

For example, suppose I have a small file with 5 lines.

By default, each map computation is performed on 1 line,
so in total 5 computations are done on 1 node.

This means the JobTracker (JT) will spawn 1 JVM for the TaskTracker on that node
and another JVM for the map task, which will instantiate 5 map objects, 1 for each line.

The map task (MT) JVM is called the task, and it will have 5 attempts, 1 for each line.
This means an attempt is the same as a computation.

Please let me know if anything is incorrect.
Thanks
Sai
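
For readers comparing the terms in the question, here is a minimal sketch of how they are usually related in Hadoop MapReduce. It is a plain Python simulation, not Hadoop code: InputSplit, record_reader, LineLengthMapper, run_attempt and run_task are hypothetical names made up for illustration. It assumes the default text input handling, where a small 5-line file becomes one split, the split is read as one record per line, a single mapper object has map() called once per record, and an attempt is one try at running the whole task (retried only on failure).

```python
from dataclasses import dataclass
from typing import List


# Hypothetical stand-ins for illustration only; these are not Hadoop classes.
@dataclass
class InputSplit:
    """A chunk of the input. One map task is created per split."""
    lines: List[str]


def record_reader(split):
    """Break a split into (byte_offset, line) records, one record per line."""
    offset = 0
    for line in split.lines:
        yield offset, line
        offset += len(line) + 1  # +1 for the newline that ended the line


class LineLengthMapper:
    """One mapper object serves the whole attempt; map() runs once per record."""

    def __init__(self):
        self.map_calls = 0

    def map(self, key, value):
        self.map_calls += 1
        return key, len(value)


def run_attempt(split, fail=False):
    """One attempt: a single try at processing every record in the split."""
    mapper = LineLengthMapper()
    output = [mapper.map(k, v) for k, v in record_reader(split)]
    if fail:
        raise RuntimeError("simulated attempt failure")
    return mapper.map_calls, output


def run_task(split, max_attempts=4, failures_before_success=0):
    """One task per split; a new attempt starts only if the previous one failed.

    max_attempts=4 is an assumption chosen to mirror Hadoop's usual default
    for map task retries.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            calls, output = run_attempt(split, fail=attempt <= failures_before_success)
            return attempt, calls, output
        except RuntimeError:
            continue
    raise RuntimeError("task failed after all attempts")


if __name__ == "__main__":
    split = InputSplit(lines=["a", "bb", "ccc", "dddd", "eeeee"])
    attempts_used, map_calls, output = run_task(split)
    print(attempts_used, map_calls, output)
```

In this sketch the 5-line file yields 1 split, 1 task, 1 attempt (when nothing fails) and 5 map() calls on a single mapper object, so a record, a split, a task and an attempt are four different things.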
Sonal Goyal 2013-09-27, 08:31