Search results: 1 to 10 of 20.
Re: Hadoop distcp from CDH4 to Amazon S3 - Improve Throughput - HDFS - [mail # user]
...CDH4 can be either 1.x or 2.x hadoop, are you using the 2.x line? I've used it primarily with 1.0.3, which is what AWS uses, so I presume that's what it's tested on. Himanish Kushary ...
   Author: David Parks, 2013-03-29, 14:34
RE: Hadoop distcp from CDH4 to Amazon S3 - Improve Throughput - HDFS - [mail # user]
...None of that complexity, they distribute the jar publicly (not the source, but the jar). You can just add this to your libjars: s3n://region.elasticmapreduce/libs/s3distcp/latest/s3distcp.ja...
   Author: David Parks, 2013-03-29, 05:41
RE: Hadoop distcp from CDH4 to Amazon S3 - Improve Throughput - HDFS - [mail # user]
...Have you tried using s3distcp from amazon? I used it many times to transfer 1.5TB between S3 and Hadoop instances. The process took 45 min, well over the 10min timeout period you're running ...
   Author: David Parks, 2013-03-28, 07:56
Which hadoop installation should I use on ubuntu server? - HDFS - [mail # user]
...I'm moving off AWS MapReduce to our own cluster, I'm installing Hadoop on Ubuntu Server 12.10. I see a .deb installer and installed that, but it seems like files are all over t...
   Author: David Parks, 2013-03-28, 06:07
For a new installation: use the BackupNode or the CheckPointNode? - HDFS - [mail # user]
...For a new installation of the current stable build (1.1.2), is there any reason to use the CheckPointNode over the BackupNode? It seems that we need to choose one or the...
   Author: David Parks, 2013-03-23, 06:59
RE: On a small cluster can we double up namenode/master with tasktrackers? - HDFS - [mail # user]
...Good points all, The mapreduce jobs are, well, intensive. We've got a whole variety, but typically I see them use a lot of CPU, a lot of disk, and upon occasion a whole bunch o...
   Author: David Parks, 2013-03-20, 09:27
On a small cluster can we double up namenode/master with tasktrackers? - HDFS - [mail # user]
...I want 20 servers, I got 7, so I want to make the most of the 7 I have. Each of the 7 servers has 24GB of RAM, 4TB, and 8 cores. Would it be terribly unwise of me to run such...
   Author: David Parks, 2013-03-18, 10:24
How "Alpha" is "alpha"? - HDFS - [mail # user]
...   "This release, like previous releases in hadoop-2.x series is still considered alpha primarily since some of APIs aren't fully-baked and we expect some churn in future."   ...
   Author: David Parks, 2013-03-12, 07:56
RE: How can I limit reducers to one-per-node? - HDFS - [mail # user]
...I tried that approach at first, one domain to one reducer, but it failed me because my data set has many domains with just a few thousand images, trivial, but we also have reasonably many ma...
   Author: David Parks, 2013-02-11, 06:29
RE: Question related to Decompressor interface - HDFS - [mail # user]
...In the EncryptedWritableWrapper idea you would create an object that takes any Writable object as its parameter. Your EncryptedWritableWrapper would naturally implement ...
   Author: David Parks, 2013-02-11, 03:24