Results 11 to 20 of 35 (0.156s).
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer - MapReduce - [mail # user]
...F. Put a MongoDB replica set on all Hadoop worker nodes and let the tasks query MongoDB at localhost. (This is what I did recently with a multi-GiB dataset.)  Met vriendelijke g...
   Author: Niels Basjes, 2012-12-30, 19:38
Re: Doubts on compressed file - MapReduce - [mail # user]
...Hi,   Yes.   Yes, and then the mapper will read the other parts of the file over the network. So what I do is I upload such files with a bigger HDFS blocksize so the mapper has "th...
   Author: Niels Basjes, 2012-11-07, 12:47
Re: Hadoop Real time help - MapReduce - [mail # user]
...Thanks for the pointers, I have stuff to read now :)  On Mon, Aug 20, 2012 at 9:37 AM, Bertrand Dechoux  wrote:    Best regards / Met vriendelijke groeten,  Niels Ba...
   Author: Niels Basjes, 2012-08-22, 18:21
Re: output/input ratio > 1 for map tasks? - MapReduce - [mail # user]
...Hi,  On Mon, Jul 30, 2012 at 8:47 PM, brisk  wrote:  For a simple example: Have a look at the WordCount example.  Input of a single map call is 1 record: "This is a line"...
   Author: Niels Basjes, 2012-07-30, 20:15
Making gzip splittable for Hadoop - MapReduce - [mail # user]
...Hi,  In many Hadoop production environments you get gzipped files as the raw input. Usually these are Apache HTTPD logfiles. When putting these gzipped files into Hadoop you are stuck w...
   Author: Niels Basjes, 2012-03-30, 14:07
Re: Merge sorting reduce output files - MapReduce - [mail # user]
...Hi,  On Thu, Mar 1, 2012 at 00:07, Robert Evans  wrote:   No worries.    What we have has a lot more features. Yet the basic idea of what we have is similar enough t...
   Author: Niels Basjes, 2012-03-01, 14:23
Should splittable Gzip be a "core" hadoop feature? - MapReduce - [mail # user]
...Hi,  Some time ago I had an idea and implemented it.  Normally you can only run a single gzipped input file through a single mapper and thus only on a single CPU core. What I creat...
   Author: Niels Basjes, 2012-02-28, 15:50
Re: unsort algorithmus in map/reduce - MapReduce - [mail # user]
...Why not do something very simple: use the MD5 of the URL as the key you sort by. This scales very easily and gives a highly randomized order. Maybe not the optimal maximum distance, but cert...
   Author: Niels Basjes, 2011-10-25, 12:21
Re: output from one map reduce job as the input to another map reduce job? - MapReduce - [mail # user]
...To me it sounds like the asker should check out tools like Storm and S4 instead of Hadoop.  http://www.infoq.com/news/2011/09/twitter-storm-real-time-hadoop  Kind regards,...
   Author: Niels Basjes, 2011-09-28, 07:21
Re: How to Create an effective chained MapReduce program. - MapReduce - [mail # user]
...Hi,  In the past I've had the same situation where I needed the data for debugging. Back then I chose to create a second job with simply SequenceFileInputFormat, IdentityMapper, Identit...
   Author: Niels Basjes, 2011-09-06, 05:57