-
Please help on providing correct answers
Ramasubramanian Narayanan... 2012-11-07, 17:21
Hi,
I came across the following questions on some sites, and the answers they give seem wrong to me, though I might be mistaken. Could someone help confirm the right answers for these 11 questions, please? I would appreciate an explanation if you are able to provide one.
*******************************************************************************
You are running a job that will process a single InputSplit on a cluster which has no other jobs currently running. Each node has an equal number of open Map slots. On which node will Hadoop first attempt to run the Map task?
A. The node with the most memory
B. The node with the lowest system load
C. The node on which this InputSplit is stored
D. The node with the most free local disk space
My Answer: C
Answer given on the site: A
*******************************************************************************
What is a Writable?
A. Writable is an interface that all keys and values in MapReduce must implement. Classes implementing this interface must implement methods for serializing and deserializing themselves.
B. Writable is an abstract class that all keys and values in MapReduce must extend. Classes extending this abstract base class must implement methods for serializing and deserializing themselves.
C. Writable is an interface that all keys, but not values, in MapReduce must implement. Classes implementing this interface must implement methods for serializing and deserializing themselves.
D. Writable is an abstract class that all keys, but not values, in MapReduce must extend. Classes extending this abstract base class must implement methods for serializing and deserializing themselves.
My Answer: A
Answer given on the site: B
*******************************************************************************
You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat and the IdentityReducer: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.
A. There is no difference in output between the two settings.
B. With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.
C. With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.
D. With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.
My Answer: D
Answer given on the site: C
*******************************************************************************
During the standard sort and shuffle phase of MapReduce, keys and values are passed to reducers. Which of the following is true?
A. Keys are presented to a reducer in sorted order; values for a given key are not sorted.
B. Keys are presented to a reducer in sorted order; values for a given key are sorted in ascending order.
C. Keys are presented to a reducer in random order; values for a given key are not sorted.
D. Keys are presented to a reducer in random order; values for a given key are sorted in ascending order.
My Answer: A
Answer given on the site: D
*******************************************************************************
Which statement best describes the data path of intermediate key-value pairs (i.e., output of the mappers)?
A. Intermediate key-value pairs are written to HDFS. Reducers read the intermediate data from HDFS.
B. Intermediate key-value pairs are written to HDFS. Reducers copy the intermediate data to the local disks of the machines running the reduce tasks.
C. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and then copied to the machine running the reduce tasks.
D. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and are then copied to HDFS. Reducers read the intermediate data from HDFS.
My Answer: C
Answer given on the site: B
*******************************************************************************
You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values. Which interface should your class implement?
A. Mapper <Text, IntWritable, Text, IntWritable>
B. Reducer <Text, Text, IntWritable, IntWritable>
C. Reducer <Text, IntWritable, Text, IntWritable>
D. Combiner <Text, IntWritable, Text, IntWritable>
E. Combiner <Text, Text, IntWritable, IntWritable>
My Answer: D
Answer given on the site: C
*******************************************************************************
What happens in a MapReduce job when you set the number of reducers to one?
A. A single reducer gathers and processes all the output from all the mappers. The output is written in as many separate files as there are mappers.
B. A single reducer gathers and processes all the output from all the mappers. The output is written to a single file in HDFS.
C. Setting the number of reducers to one creates a processing bottleneck, and since the number of reducers as specified by the programmer is used as a reference value only, the MapReduce runtime provides a default setting for the number of reducers.
D. Setting the number of reducers to one is invalid, and an exception is thrown.
My Answer: B
Answer given on the site: C
*******************************************************************************
In the standard word count MapReduce algorithm, why might using a combiner reduce the overall job running time?
A. Because combiners perform local aggregation of word counts, thereby allowing the mappers to process input data faster.
B
-
Re: Please help on providing correct answers
Michael Segel 2012-11-07, 17:27
Ok... Where are you pulling these questions from?
Seriously.
-
Re: Please help on providing correct answers
Ramasubramanian Narayanan... 2012-11-07, 17:30
There is no single consolidated source. I have been collecting these for the past month: a few from printouts, a few from emails, a few from googling, a few from sites, and a few from some of my friends.
regards, Rams
-
Re: Please help on providing correct answers
Michael Segel 2012-11-07, 18:37
Sorry, I think I had better explain why I am curious...
First, there are a couple of sites that have study questions to help pass Cloudera's certification. (I don't know if Hortonworks has cert tests, but both MapR and Cloudera do.)
It's just that, looking first at the questions, they are not really good questions, and neither is the selection of answers. Then there is the 'correct' answer.
I can understand if you don't want to reveal your sources publicly, but you have to understand that misinformation found in these sites makes it harder to teach the right answers.
As Harsh says, you should be able to look at the questions and then go back to Tom White's book and others to verify why you think your answer is right.
HTH
-Mike
-
Re: Please help on providing correct answers
Ramasubramanian Narayanan... 2012-11-07, 18:54
Hi,
I have given my reasons for each choice below, along with why I think the given answer is wrong.
You are running a job that will process a single InputSplit on a cluster which has no other jobs currently running. Each node has an equal number of open Map slots. On which node will Hadoop first attempt to run the Map task?
A. The node with the most memory
B. The node with the lowest system load
C. The node on which this InputSplit is stored
D. The node with the most free local disk space
My Answer: C [The mapper runs on the data node that holds the data, so Hadoop will first try to run the map task on the node where the InputSplit is stored.]
Answer given on the site: A [I hope the Map task will go and check which nodes have the most memory]
*******************************************************************************
What is a Writable?
A. Writable is an interface that all keys and values in MapReduce must implement. Classes implementing this interface must implement methods for serializing and deserializing themselves.
B. Writable is an abstract class that all keys and values in MapReduce must extend. Classes extending this abstract base class must implement methods for serializing and deserializing themselves.
C. Writable is an interface that all keys, but not values, in MapReduce must implement. Classes implementing this interface must implement methods for serializing and deserializing themselves.
D. Writable is an abstract class that all keys, but not values, in MapReduce must extend. Classes extending this abstract base class must implement methods for serializing and deserializing themselves.
My Answer: A [Writable is an interface]
Answer given on the site: B [Writable is not an abstract class]
*******************************************************************************
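To make the Writable question above concrete: in Hadoop, `Writable` is a Java interface whose two methods are `write(DataOutput)` and `readFields(DataInput)`. The sketch below only mirrors that shape in Python for illustration; the `Writable` base class, the `IntPair` type and the `round_trip` helper here are hypothetical stand-ins, not Hadoop code.

```python
import io
import struct
from abc import ABC, abstractmethod


class Writable(ABC):
    """Python stand-in for the shape of Hadoop's Writable interface."""

    @abstractmethod
    def write(self, out):
        """Serialize this object's fields to a binary stream."""

    @abstractmethod
    def read_fields(self, inp):
        """Overwrite this object's fields from a binary stream."""


class IntPair(Writable):
    """Hypothetical value type holding two 32-bit integers."""

    def __init__(self, first=0, second=0):
        self.first = first
        self.second = second

    def write(self, out):
        # Big-endian 32-bit ints, the same byte order DataOutput.writeInt uses.
        out.write(struct.pack(">ii", self.first, self.second))

    def read_fields(self, inp):
        self.first, self.second = struct.unpack(">ii", inp.read(8))


def round_trip(pair):
    """Serialize a pair into a buffer and read it back into a fresh object."""
    buffer = io.BytesIO()
    pair.write(buffer)
    buffer.seek(0)
    copy = IntPair()
    copy.read_fields(buffer)
    return copy
```

The point of the sketch is only that the type itself knows how to write and re-read its own fields, which is what option A describes.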
You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat and the IdentityReducer: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.
A. There is no difference in output between the two settings.
B. With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.
C. With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.
D. With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.
My Answer: D [With no reducers, all the output of the mappers is written directly to HDFS, so multiple files will be created.]
Answer given on the site: C [If you have one reducer then you will get only one output file, not many.]
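The difference between zero reducers and one reducer can be sketched with a small simulation. This is plain Python, not Hadoop: `run_job` is a hypothetical helper, and the file names follow the `part-m-NNNNN` / `part-r-NNNNN` pattern of the newer MapReduce API (the older API names its files `part-NNNNN`).

```python
import zlib
from collections import defaultdict


def run_job(map_outputs, num_reducers):
    """Return {output_file_name: [(key, value), ...]} for a simulated job.

    map_outputs holds one list of (key, value) pairs per map task.
    """
    if num_reducers == 0:
        # Map-only job: every map task writes its own output file.
        return {
            f"part-m-{i:05d}": list(pairs)
            for i, pairs in enumerate(map_outputs)
        }

    # Otherwise pairs are partitioned by key, giving one file per reducer.
    files = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            partition = zlib.crc32(key.encode("utf-8")) % num_reducers
            files[f"part-r-{partition:05d}"].append((key, value))

    # Identity reducer: emit pairs unchanged, ordered by key only.
    return {
        name: sorted(pairs, key=lambda kv: kv[0])
        for name, pairs in files.items()
    }
```

With three simulated map tasks, `run_job(outputs, 0)` yields three `part-m-*` files, while `run_job(outputs, 1)` yields a single `part-r-00000`, which matches option D.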
*******************************************************************************
During the standard sort and shuffle phase of MapReduce, keys and values are passed to reducers. Which of the following is true?
A. Keys are presented to a reducer in sorted order; values for a given key are not sorted.
B. Keys are presented to a reducer in sorted order; values for a given key are sorted in ascending order.
C. Keys are presented to a reducer in random order; values for a given key are not sorted.
D. Keys are presented to a reducer in random order; values for a given key are sorted in ascending order.
My Answer: A [For the reducer, keys are passed in sorted order but values are not. To get the values in sorted order we need to use a secondary sort.]
Answer given on the site: D [For the reducer, keys are passed only in sorted order, not in random order.]
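A minimal sketch of what "keys sorted, values not sorted" means for a single reducer. `shuffle_for_reducer` is a hypothetical helper; the simulation keeps values in arrival order as a stand-in, whereas real Hadoop gives no ordering guarantee for the values of a key unless a secondary sort is configured.

```python
from itertools import groupby


def shuffle_for_reducer(pairs):
    """Group (key, value) pairs the way one reducer would see them."""
    # sorted() is stable, so values for equal keys keep their arrival order.
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=lambda kv: kv[0])
    ]
```

For the input `[("b", 3), ("a", 9), ("b", 1), ("a", 2)]`, the keys come out as `a` then `b`, but the values for `a` stay `[9, 2]` rather than being sorted.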
*******************************************************************************
Which statement best describes the data path of intermediate key-value pairs (i.e., output of the mappers)?
A. Intermediate key-value pairs are written to HDFS. Reducers read the intermediate data from HDFS.
B. Intermediate key-value pairs are written to HDFS. Reducers copy the intermediate data to the local disks of the machines running the reduce tasks.
C. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and then copied to the machine running the reduce tasks.
D. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and are then copied to HDFS. Reducers read the intermediate data from HDFS.
My Answer: C [Intermediate key-value pairs are written to local disk and transferred over the network to the reducer. Once the job is completed, the intermediate data is deleted from the data node.]
Answer given on the site: B [Intermediate key-value pairs are not written to HDFS, and the reducer does not read them from HDFS.]
*******************************************************************************
You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values. Which interface should your class implement?
A. Mapper <Text, IntWritable, Text, IntWritable>
B. Reducer <Text, Text, IntWritable, IntWritable>
C. Reducer <Text, IntWritable, Text, IntWritable>
D. Combiner <Text, IntWritable, Text, IntWritable>
E. Combiner <Text, Text, IntWritable, IntWritable>
My Answer: D [For developing a combiner we need to use the combiner method]
Answer given on the site: C [For developing a combiner we need to use only the combiner method, not the Reducer method]
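One point worth checking against the Hadoop API for the combiner question above: as far as I know there is no separate `Combiner` interface in Hadoop. A combiner is registered with `setCombinerClass` and is written against the `Reducer` type, which would point to option C rather than D. The Python sketch below (hypothetical helpers `sum_counts`, `map_task` and `reduce_all`, not Hadoop code) illustrates why that works: the same reducer-shaped function serves as both combiner and reducer.

```python
def sum_counts(key, values):
    """Reducer-shaped function: (key, iterable of ints) -> (key, total)."""
    return key, sum(values)


def group_by_key(pairs):
    """Collect values per key, preserving first-seen key order."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


def map_task(words, combiner=None):
    """Emit (word, 1) pairs; optionally pre-aggregate them locally."""
    pairs = [(word, 1) for word in words]
    if combiner is None:
        return pairs
    return [combiner(k, v) for k, v in group_by_key(pairs).items()]


def reduce_all(map_outputs, reducer):
    """Merge the output of all map tasks and apply the reducer per key."""
    merged = [pair for pairs in map_outputs for pair in pairs]
    grouped = group_by_key(merged)
    return dict(reducer(k, grouped[k]) for k in sorted(grouped))
```

Passing `sum_counts` as the combiner shrinks the number of pairs each map task hands to the shuffle, while the final counts stay the same because summing is associative and commutative.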
*******************************************************************************
What happens in a MapReduce job when you set the number of reducers to one?
A. A single reducer gathers and processes all the output from all the mappers. The output is written in as many separate files as there are mappers.
B. A single reducer gathers and pr