MapReduce, mail # user - Please help on providing correct answers


Re: Please help on providing correct answers
Ramasubramanian Narayanan... 2012-11-07, 18:54
Hi,

I have given my explanation for each choice and why I believe the answer
given on the site is wrong...

You are running a job that will process a single InputSplit on a cluster
which has no other jobs
currently running. Each node has an equal number of open Map slots. On
which node will Hadoop
first attempt to run the Map task?
A. The node with the most memory
B. The node with the lowest system load
C. The node on which this InputSplit is stored
D. The node with the most free local disk space

My Answer            : C [The mapper runs on the node where the data
resides, so Hadoop will first attempt to run the map task on the node on
which the InputSplit is stored.]
Answer Given in site : A [I suppose the site assumes the Map task will go
to the node which has the most memory]
*******************************************************************************
What is a Writable?
A. Writable is an interface that all keys and values in MapReduce must
implement. Classes implementing this interface must implement methods
for serializing and deserializing themselves.
B. Writable is an abstract class that all keys and values in MapReduce must
extend. Classes extending this abstract base class must implement methods
for serializing and deserializing themselves.
C. Writable is an interface that all keys, but not values, in MapReduce
must implement. Classes implementing this interface must implement methods
for serializing and deserializing themselves.
D. Writable is an abstract class that all keys, but not values, in
MapReduce must extend. Classes extending this abstract base class must
implement methods for serializing and deserializing themselves.

My Answer            : A [Writable is an interface]
Answer Given in site : B [Writable is not an abstract class]
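To make the contract in answer A concrete, here is a minimal, self-contained
sketch. The local Writable interface below is only a stand-in mirroring the
method shapes of Hadoop's org.apache.hadoop.io.Writable, so the example runs
without Hadoop on the classpath; PointWritable is a hypothetical key type.

```java
import java.io.*;

// Stand-in mirroring org.apache.hadoop.io.Writable (same method shapes).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical key type that serializes itself, as answer A describes.
class PointWritable implements Writable {
    int x, y;
    PointWritable() {}                                   // needed for deserialization
    PointWritable(int x, int y) { this.x = x; this.y = y; }

    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    public void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}

public class WritableDemo {
    public static void main(String[] args) throws IOException {
        // Round-trip: serialize, then deserialize into a fresh instance.
        PointWritable p = new PointWritable(3, 7);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        p.write(new DataOutputStream(buf));

        PointWritable q = new PointWritable();
        q.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(q.x + "," + q.y); // prints 3,7
    }
}
```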
******************************************************************************

You write a MapReduce job to process 100 files in HDFS. Your MapReduce
algorithm uses TextInputFormat and the IdentityReducer: the mapper applies
a regular expression over input values and emits key-value pairs with the
key consisting of the matching text, and the value containing the filename
and byte offset. Determine the difference between setting the number of
reducers to zero and setting it to one.
A. There is no difference in output between the two settings.
B. With zero reducers, no reducer runs and the job throws an exception.
With one reducer, instances of matching patterns are stored in a single
file on HDFS.
C. With zero reducers, all instances of matching patterns are gathered
together in one file on HDFS. With one reducer, instances of matching
patterns are stored in multiple files on HDFS.
D. With zero reducers, instances of matching patterns are stored in
multiple files on HDFS. With one reducer, all instances of matching
patterns are gathered together in one file on HDFS.

My Answer            : D [With no reducers, all the output of the mappers
is written directly to HDFS, so multiple files will be created]
Answer Given in site : C [If you have one reducer then you will get only
one output file, not many]
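The two settings come down to one driver call. A hedged fragment of the job
setup (new MapReduce API; "grep-offsets" is a made-up job name): with zero
reducers the map output goes straight to HDFS, one part-m-* file per map
task, while a single reducer collects everything into one part-r-00000.

```java
// Driver-side configuration sketch, not a complete runnable job.
Job job = Job.getInstance(new Configuration(), "grep-offsets");
job.setInputFormatClass(TextInputFormat.class);

job.setNumReduceTasks(0);   // map-only: part-m-00000, part-m-00001, ...
// job.setNumReduceTasks(1); // one reducer: a single part-r-00000 file
```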

*******************************************************************************

During the standard sort and shuffle phase of MapReduce, keys and values
are passed to
reducers. Which of the following is true?
A. Keys are presented to a reducer in sorted order; values for a given key
are not sorted.
B. Keys are presented to a reducer in sorted order; values for a given key
are sorted in ascending order.
C. Keys are presented to a reducer in random order; values for a given key
are not sorted.
D. Keys are presented to a reducer in random order; values for a given key
are sorted in ascending order.

My Answer            : A [Keys are passed to the reducer in sorted order,
but the values are not. To get the values in sorted order we need to use a
secondary sort]
Answer Given in site : D [Keys are passed to the reducer only in sorted
order, not in random order]
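The default shuffle semantics in answer A can be mimicked in plain Java: a
TreeMap hands keys back in sorted order, while each key's value list stays
in arrival order, i.e. unsorted. This is only an illustration, not Hadoop's
actual shuffle implementation.

```java
import java.util.*;

public class ShuffleSketch {
    public static void main(String[] args) {
        // Map output arrives in arbitrary order: (key, value) pairs.
        String[][] pairs = {{"b","2"},{"a","9"},{"b","1"},{"a","3"}};

        // TreeMap iterates keys in sorted order; the value list for each
        // key keeps insertion order -- values are NOT sorted.
        TreeMap<String, List<String>> grouped = new TreeMap<>();
        for (String[] kv : pairs)
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);

        for (Map.Entry<String, List<String>> e : grouped.entrySet())
            System.out.println(e.getKey() + " -> " + e.getValue());
        // a -> [9, 3]
        // b -> [2, 1]
    }
}
```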

*******************************************************************************

Which statement best describes the data path of intermediate key-value
pairs (i.e., output of the
mappers)?
A. Intermediate key-value pairs are written to HDFS. Reducers read the
intermediate data from
HDFS.
B. Intermediate key-value pairs are written to HDFS. Reducers copy the
intermediate data to the local disks of the machines running the reduce
tasks.
C. Intermediate key-value pairs are written to the local disks of the
machines running the map tasks, and then copied to the machines running
the reduce tasks.
D. Intermediate key-value pairs are written to the local disks of the
machines running the map tasks, and are then copied to HDFS. Reducers read
the intermediate data from HDFS.

My Answer            : C [Intermediate key-value pairs are written to
local disk and transferred over the network to the reducers. Once the job
completes, the intermediate data is deleted from the data nodes]
Answer Given in site : B [Intermediate key-values will not be written to
HDFS, and the reducer will not read them from HDFS]
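The local directories used for those intermediate spills are configurable.
A hedged fragment, assuming the Hadoop 2.x property name (earlier releases
used mapred.local.dir) and made-up example paths:

```java
// Configuration sketch: map outputs spill under these node-local
// directories, never into HDFS.
Configuration conf = new Configuration();
conf.set("mapreduce.cluster.local.dir",
         "/data/1/mr-local,/data/2/mr-local");
```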

*******************************************************************************

You are developing a combiner that takes as input Text keys, IntWritable
values, and emits Text keys, IntWritable values. Which interface should
your class implement?
A. Mapper <Text, IntWritable, Text, IntWritable>
B. Reducer <Text, Text, IntWritable, IntWritable>
C. Reducer <Text, IntWritable, Text, IntWritable>
D. Combiner <Text, IntWritable, Text, IntWritable>
E. Combiner <Text, Text, IntWritable, IntWritable>

My Answer            : D [To develop a combiner we need to use the
Combiner interface]
Answer Given in site : C [To develop a combiner we need to use the
Combiner interface only, not the Reducer interface]
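For what it is worth, the MapReduce API itself has no separate Combiner
type: a combiner is registered with Job.setCombinerClass and must be a
Reducer whose input and output types both match the map output types. A
hedged fragment, where MyReducer is a hypothetical
Reducer&lt;Text, IntWritable, Text, IntWritable&gt;:

```java
// Configuration sketch: the combiner class is just a Reducer (option C).
job.setCombinerClass(MyReducer.class);
// Often the same class doubles as the real reducer:
job.setReducerClass(MyReducer.class);
```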

*******************************************************************************

What happens in a MapReduce job when you set the number of reducers to one?
A. A single reducer gathers and processes all the output from all the
mappers. The output is
written in as many separate files as there are mappers.
B. A single reducer gathers and pr