Re: Configuring SSH - is it required? for a psedo distriburted mode? - MapReduce - [mail # user]
...I never configure the ssh feature. Not for running on a single node and not for a full size cluster. I simply start all the required daemons (name/data/job/task) and configure them on which ...
   Author: Niels Basjes, 2013-05-19, 09:03
Re: How to process only input files containing 100% valid rows - MapReduce - [mail # user]
...How about a different approach: If you use the multiple output option you can process the valid lines in a normal way and put the invalid lines in a special separate output file. On Apr 18, ...
   Author: Niels Basjes, 2013-04-19, 08:21
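The idea in the reply above can be sketched in plain Python, without Hadoop. This is only an illustration, not the poster's code: the validity rule `has_three_fields` and the sample rows are hypothetical. In a real job the two lists would correspond to two named outputs (Hadoop provides a `MultipleOutputs` class for that purpose).

```python
def split_rows(lines, is_valid):
    """Route each line to a 'valid' or an 'invalid' output, preserving order."""
    valid, invalid = [], []
    for line in lines:
        # Valid lines go to the normal output, the rest to a separate one.
        (valid if is_valid(line) else invalid).append(line)
    return valid, invalid


def has_three_fields(line):
    # Hypothetical validity rule: a row is valid if it has exactly 3 CSV fields.
    return len(line.split(",")) == 3


rows = ["a,1,x", "broken", "b,2,y"]
valid_rows, invalid_rows = split_rows(rows, has_three_fields)
```

After the job, a non-empty invalid output signals that the input file was not 100% valid.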
Re: how to find top N values using map-reduce ? - MapReduce - [mail # user]
...My suggestion is to use secondary sort with a single reducer. That way you can easily extract the top N. If you want to get the top N% you'll need an additional phase to determine how many ...
   Author: Niels Basjes, 2013-02-02, 12:44
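A rough sketch of the suggestion above, simulated in plain Python rather than as an actual MapReduce job. It assumes the secondary sort delivers all records to the single reducer ordered by value, largest first; `top_n` and the sample records are hypothetical names used only for illustration.

```python
from itertools import islice


def top_n(records, n):
    """Simulate 'secondary sort + single reducer' for extracting the top N."""
    # Stand-in for the shuffle phase: order all (key, value) records by
    # value, descending, as a secondary sort would before the reduce call.
    ordered = sorted(records, key=lambda kv: kv[1], reverse=True)
    # The single reducer reads the first n records and can then stop.
    return list(islice(ordered, n))


records = [("a", 3), ("b", 9), ("c", 1), ("d", 7)]
top_two = top_n(records, 2)
```

Because the records arrive already sorted, the reducer never has to hold more than N of them.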
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer - MapReduce - [mail # user]
...F. put a mongodb replica set on all hadoop workernodes and let the tasks query the mongodb at localhost.  (this is what I did recently with a multi GiB dataset)  Met vriendelijke g...
   Author: Niels Basjes, 2012-12-30, 19:38
Re: Doubts on compressed file - MapReduce - [mail # user]
...Hi,   Yes.   Yes, and then the mapper will read the other parts of the file over the network. So what I do is I upload such files with a bigger HDFS blocksize so the mapper has "th...
   Author: Niels Basjes, 2012-11-07, 12:47
Re: Hadoop Real time help - MapReduce - [mail # user]
...Thanks for the pointers, I have stuff to read now :)  On Mon, Aug 20, 2012 at 9:37 AM, Bertrand Dechoux  wrote:    Best regards / Met vriendelijke groeten,  Niels Ba...
   Author: Niels Basjes, 2012-08-22, 18:21
Re: output/input ratio > 1 for map tasks? - MapReduce - [mail # user]
...Hi,  On Mon, Jul 30, 2012 at 8:47 PM, brisk  wrote:  For a simple example: Have a look at the WordCount example.  Input of a single map call is 1 record: "This is a line"...
   Author: Niels Basjes, 2012-07-30, 20:15
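To illustrate why a map task can emit more records than it reads, here is a minimal plain-Python stand-in for a WordCount-style map function. It is a sketch under the assumption that the mapper emits one `(word, 1)` pair per whitespace-separated word; `wordcount_map` is a hypothetical name, not Hadoop's API.

```python
def wordcount_map(line):
    """Emit one (word, 1) pair for every whitespace-separated word in the line."""
    return [(word, 1) for word in line.split()]


# One input record goes in ...
pairs = wordcount_map("This is a line")
# ... and one output record per word comes out.
output_input_ratio = len(pairs) / 1
```

For this one-line input the map call produces four output records, so the output/input ratio for records is above 1.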
Making gzip splittable for Hadoop - MapReduce - [mail # user]
...Hi,  In many Hadoop production environments you get gzipped files as the raw input. Usually these are Apache HTTPD logfiles. When putting these gzipped files into Hadoop you are stuck w...
   Author: Niels Basjes, 2012-03-30, 14:07
Re: Merge sorting reduce output files - MapReduce - [mail # user]
...Hi,  On Thu, Mar 1, 2012 at 00:07, Robert Evans  wrote:   No worries.    What we have has a lot more features. Yet the basic idea of what we have is similar enough t...
   Author: Niels Basjes, 2012-03-01, 14:23
Should splittable Gzip be a "core" hadoop feature? - MapReduce - [mail # user]
...Hi,  Some time ago I had an idea and implemented it.  Normally you can only run a single gzipped input file through a single mapper and thus only on a single CPU core. What I creat...
   Author: Niels Basjes, 2012-02-28, 15:50