Re: Object in mapreduce - MapReduce - [mail # user]
...Check out the Distributed Cache feature: http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/filecache/DistributedCache.html  Kai  Written on 28.12.2013 at 10:27 by unmesha sre...
   Author: Kai Voigt, 2013-12-28, 09:44
Re: ALL HDFS Blocks on the Same Machine if Replication factor = 1 - MapReduce - [mail # user]
...Hello,  On 10.06.2013 at 15:36, Razen Al Harbi wrote:  Yes, this is normal behavior. When an HDFS client happens to run on a host that is also a DataNode (always the case when a ...
   Author: Kai Voigt, 2013-06-10, 13:47
Re: Shuffle phase replication factor - MapReduce - [mail # user]
...The map output doesn't get written to HDFS. The map task writes its output to its local disk; the reduce tasks then pull the data over HTTP for further processing.  On 21.05.2013 at ...
   Author: Kai Voigt, 2013-05-21, 18:58
Re: Sorting Values sent to reducer NOT based on KEY (Depending on part of VALUE) - MapReduce - [mail # user]
...Hello,  the design pattern here is to emit the component you want to sort by (second field of your value in your case) as the key in the map phase.  If you also want to keep the so...
   Author: Kai Voigt, 2013-04-23, 05:54
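
The pattern described in the snippet above can be sketched without a cluster. This is a minimal, hedged illustration in plain Python, not Hadoop code: the function names (map_phase, shuffle_and_sort, reduce_phase), the comma-separated value layout, and the sample records are all assumptions made for the example. It only shows the part the snippet states, namely that the mapper emits the second field of the value as the key so that the framework's sort orders records by that field.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Emit (sort_field, original_record) pairs.

    Assumption: each value is a comma-separated string and the field to
    sort by is the second one, as in the thread title.
    """
    for key, value in records:
        fields = value.split(",")
        yield fields[1], (key, value)


def shuffle_and_sort(pairs):
    """Stand-in for the framework's shuffle: sort by emitted key, then group."""
    ordered = sorted(pairs, key=itemgetter(0))
    for sort_key, group in groupby(ordered, key=itemgetter(0)):
        yield sort_key, [record for _, record in group]


def reduce_phase(grouped):
    """Pass records through; they arrive ordered by the emitted sort key."""
    for _sort_key, records in grouped:
        for original_key, value in records:
            yield original_key, value


# Hypothetical input: (key, "name,score") pairs, unordered by score.
records = [("a", "x,30"), ("b", "y,10"), ("c", "z,20")]
result = list(reduce_phase(shuffle_and_sort(map_phase(records))))
print(result)
```

Note that the emitted keys here are strings, so they compare lexicographically; numeric fields of differing widths would need padding or a numeric key type to sort as numbers.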
Re: 100K Maps scenario - MapReduce - [mail # user]
...No, only one copy of each block will be processed.  If a task fails, it will be retried on another copy. Also, if speculative execution is enabled, slow tasks might be executed twice in...
   Author: Kai Voigt, 2013-04-13, 01:48
Re: Reduce starts before map completes (at 23%) - MapReduce - [mail # user]
...It's the reduce JVMs that get started while the map phase is still active. After the first blocks have been processed by the mappers, that output gets pulled by the reduce JVMs while the map...
   Author: Kai Voigt, 2013-04-12, 04:18
Re: distributed cache - MapReduce - [mail # user]
...Hi,  simple math. Assume you have n TaskTrackers in your cluster that will need to access the files in the distributed cache, and r is the replication level of those files.  Copy...
   Author: Kai Voigt, 2012-12-22, 12:51
Re: block size - MapReduce - [mail # user]
...Hi,  On 20.11.2012 at 17:31, "Kartashov, Andy" wrote:  The block size affects new files only; existing files will not be modified. As you said, you need to re-import those old fi...
   Author: Kai Voigt, 2012-11-20, 16:33
Re: a question on NameNode - MapReduce - [mail # user]
...Hi,  On 19.11.2012 at 16:14, "Kartashov, Andy" wrote:  The JobTracker will initially schedule the map task on one node only. There's no need to launch the task on all nodes that...
   Author: Kai Voigt, 2012-11-19, 15:19
Re: What's the basic idea of pseudo-distributed Hadoop ? - MapReduce - [mail # user]
...Hello.  On 14.09.2012 at 08:03, Jason Yang wrote:  work: cluster, does the hadoop run each mapper in sequence? or does it run these mappers in different threads or somethi...
   Author: Kai Voigt, 2012-09-14, 06:08