Re: Mappers getting killed
Hi,

On Thu, Oct 27, 2011 at 3:22 AM, Arko Provo Mukherjee
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a situation where I have to read a large file into every mapper.
>
> Since it's a large HDFS file that is needed to process each input to the
> mapper, it is taking a lot of time to read the data into memory from
> HDFS.
>
> Thus the system is killing all my Mappers with the following message:
>
> 11/10/26 22:54:52 INFO mapred.JobClient: Task Id :
> attempt_201106271322_12504_m_000000_0, Status : FAILED
> Task attempt_201106271322_12504_m_000000_0 failed to report status for 601
> seconds. Killing!
>
> The cluster is not entirely owned by me, and hence I cannot change
> mapred.task.timeout so as to allow enough time to read the entire file.
> Any suggestions?
> Also, is there a way such that a Mapper instance reads the file once for all
> the inputs that it receives?
> Currently, since the file-reading code is in the map method, I guess it's
> reading the entire file for each and every input, leading to a lot of
> overhead.
The file should be read in the configure() method (old API) or the setup()
method (new API), so each mapper task loads it once instead of once per
record.
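
Something along these lines (a rough sketch against the new API; the
"side.data.path" configuration key and the input/output types are just
placeholders for whatever your job actually uses):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SideFileMapper extends Mapper<LongWritable, Text, Text, Text> {

  // Held in memory for the lifetime of this mapper task.
  private final List<String> sideData = new ArrayList<String>();

  @Override
  protected void setup(Context context)
      throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    // Hypothetical config key -- set it to your large file's HDFS path.
    Path sidePath = new Path(conf.get("side.data.path"));
    FileSystem fs = sidePath.getFileSystem(conf);
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(fs.open(sidePath)));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        sideData.add(line);
        // Report liveness while loading, so the long read alone
        // doesn't trip the status-report timeout.
        context.progress();
      }
    } finally {
      reader.close();
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // The side data is already in memory here; no per-record HDFS reads.
    // ... combine sideData with the input record as needed ...
    context.write(value, new Text(String.valueOf(sideData.size())));
  }
}

With that, map() works purely from in-memory data, and the progress() calls
during loading keep the task alive without touching mapred.task.timeout.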

Brock