Re: Shuffle phase replication factor - MapReduce - [mail # user]
...The map output doesn't get written to HDFS. The map task writes its output to its local disk, and the reduce tasks pull the data over HTTP for further processing.  On 21.05.2013 at ...
   Author: Kai Voigt, 2013-05-21, 18:58
Re: Sorting Values sent to reducer NOT based on KEY (Depending on part of VALUE) - MapReduce - [mail # user]
...Hello,  the design pattern here is to emit the component you want to sort by (second field of your value in your case) as the key in the map phase.  If you also want to keep the so...
   Author: Kai Voigt, 2013-04-23, 05:54
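The pattern described in this reply can be sketched without a cluster. The snippet below is a minimal, framework-free simulation in Python, not Hadoop code: `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical names, and the two-field value layout is an assumption based on "second field of your value". Carrying the original key along inside the value is also an assumption made here for illustration, since the reply is truncated at that point.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Emit the field to sort by (second field of the value) as the new key.

    The original key travels inside the value so it is not lost
    (an assumption for this sketch).
    """
    for original_key, (first, second) in records:
        yield second, (original_key, first)


def shuffle(pairs):
    """Stand-in for the framework's shuffle: sort by key, then group by key."""
    ordered = sorted(pairs, key=itemgetter(0))  # stable sort on the emitted key
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_phase(grouped):
    """Restore the original record layout; keys arrive in sorted order."""
    for sort_field, values in grouped:
        for original_key, first in values:
            yield original_key, (first, sort_field)


def run_job(records):
    return list(reduce_phase(shuffle(map_phase(records))))


if __name__ == "__main__":
    sample = [("a", ("x", 3)), ("b", ("y", 1)), ("c", ("z", 2))]
    for record in run_job(sample):
        print(record)
```

Because the sort field becomes the key, the ordering comes from the shuffle's sort by key rather than from any sorting logic in the reducer.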
Re: 100K Maps scenario - MapReduce - [mail # user]
...No, only one copy of each block will be processed.  If a task fails, it will be retried on another copy. Also, if speculative execution is enabled, slow tasks might be executed twice in...
   Author: Kai Voigt, 2013-04-13, 01:48
Re: Reduce starts before map completes (at 23%) - MapReduce - [mail # user]
...It's the reduce JVMs that get started while the map phase is still active. After the first blocks have been processed by the mappers, that output gets pulled by the reduce JVMs while the map...
   Author: Kai Voigt, 2013-04-12, 04:18
Re: distributed cache - MapReduce - [mail # user]
...Hi,  simple math. Assume you have n TaskTrackers in your cluster that need to access the files in the distributed cache, and r is the replication level of those files.  Copy...
   Author: Kai Voigt, 2012-12-22, 12:51
Re: distributed cache - MapReduce - [mail # user]
...Hi,  On 22.12.2012 at 13:03, Lin Ma wrote:    Yes, you are correct. The JobTracker will put files for the distributed cache into HDFS with a higher replication count (10 by ...
   Author: Kai Voigt, 2012-12-22, 12:44
Re: block size - MapReduce - [mail # user]
...Hi,  On 20.11.2012 at 17:31, "Kartashov, Andy" wrote:   The block size affects new files only; existing files will not be modified. As you said, you need to re-import those old fi...
   Author: Kai Voigt, 2012-11-20, 16:33
Re: a question on NameNode - MapReduce - [mail # user]
...Hi,  On 19.11.2012 at 16:14, "Kartashov, Andy" wrote:   The JobTracker will initially schedule the map task on one node only. There's no need to launch the task on all nodes that...
   Author: Kai Voigt, 2012-11-19, 15:19
Re: a question on NameNode - MapReduce - [mail # user]
...On 19.11.2012 at 15:43, "Kartashov, Andy" wrote:   One major feature of HDFS is its redundancy. Blocks are stored more than once (three times by default), so chances are good that ano...
   Author: Kai Voigt, 2012-11-19, 15:01
Re: What's the basic idea of pseudo-distributed Hadoop ? - MapReduce - [mail # user]
...Hello.  On 14.09.2012 at 08:03, Jason Yang wrote:  work: cluster, does Hadoop run each mapper in sequence? Or does it run these mappers in different threads or somethi...
   Author: Kai Voigt, 2012-09-14, 06:08