How do _you_ document your hadoop jobs? - MapReduce - [mail # user]
...We've taken to documenting our Hadoop jobs in a simple visual manner using PPT (attached example). I wonder how others document their jobs?     We often add notes to the text secti...
   Author: David Parks, 2013-02-25, 09:11
RE: How can I limit reducers to one-per-node? - MapReduce - [mail # user]
...Looking at the Job File for my job I see that this property is set to 1, however I have 3 reducers per node (I’m not clear what configuration is causing this behavior).     My prob...
   Author: David Parks, 2013-02-09, 04:46
RE: Tricks to upgrading Sequence Files? - MapReduce - [mail # user]
...I'll consider a patch to the SequenceFile, if we could manually override the sequence file input Key and Value that's read from the sequence file headers we'd have a clean solution.  I ...
   Author: David Parks, 2013-01-30, 02:17
Symbolic links available in 1.0.3? - MapReduce - [mail # user]
...Is it possible to use symbolic links in 1.0.3?     If yes: can I use symbolic links to create a single, final directory structure of files from many locations; then use DistCp/S3Di...
   Author: David Parks, 2013-01-29, 03:31
RE: Fastest way to transfer files - MapReduce - [mail # user]
...Here’s an example of running distcp (actually in this case s3distcp, but it’s about the same, just new DistCp()) from java:     ToolRunner.run(getConf(), new S3DistCp(), new String...
   Author: David Parks, 2012-12-29, 10:29
What does mapred.map.tasksperslot do? - MapReduce - [mail # user]
...I didn't come up with much in a google search.     In particular, what are the side effects of changing this setting? Memory? Sort process?     I'm guessing it means that...
   Author: David Parks, 2012-12-27, 08:21
How to troubleshoot OutOfMemoryError - MapReduce - [mail # user]
...I'm pretty consistently seeing a few reduce tasks fail with OutOfMemoryError (below). It doesn't kill the job, but it slows it down.     In my current case the reducer is pretty da...
   Author: David Parks, 2012-12-22, 04:33
OutOfMemory in ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory - MapReduce - [mail # user]
...I've got 15 boxes in a cluster, 7.5 GB of RAM each on AWS (m1.large), 1 reducer per node.     I'm seeing this exception sometimes. It's not stopping the job from completing, it's ju...
   Author: David Parks, 2012-12-17, 05:36
How to submit Tool jobs programatically in parallel? - MapReduce - [mail # user]
...I'm submitting unrelated jobs programmatically (using AWS EMR) so they run in parallel.  I'd like to run an s3distcp job in parallel as well, but the interface to that job is a Tool, e....
   Author: David Parks, 2012-12-14, 04:39
RE: Shuffle's getMapOutput() fails with EofException, followed by IllegalStateException - MapReduce - [mail # user]
...If anyone follows this thread in the future, it turns out that I was being led astray by these errors; they weren't the cause of the problem. This was the resolution:  http://stackover...
   Author: David Parks, 2012-12-14, 04:25