Re: Executing a shell script inside the HDFS - MapReduce - [mail # user]
...Yes, that way it could work. I'm just wondering ... Why would you want to have a script like this in HDFS?  With kind regards,  Niels Basjes On 16 Aug 2011 06:49, "Fris...
   Author: Niels Basjes, 2011-08-16, 19:00
Re: How to select random n records using mapreduce ? - MapReduce - [mail # user]
...The only solution I can think of is by creating a counter in Hadoop that is incremented each time a mapper lets a record through. As soon as the value reaches a preselected value the mappers...
   Author: Niels Basjes, 2011-06-27, 19:28
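A minimal single-process sketch of the counter idea in the snippet above. The snippet is cut off before it says what the mappers do once the counter reaches the preselected value; this sketch assumes they simply stop emitting. `SharedCounter`, `mapper`, `select_records`, and the pass probability are hypothetical names and parameters for illustration, not Hadoop APIs, and a real cluster would not necessarily give every mapper an instantly consistent view of a job-wide counter.

```python
import random


class SharedCounter:
    """Stand-in for a job-wide counter (hypothetical helper, not a Hadoop API)."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


def mapper(records, counter, limit, pass_probability, rng):
    """Let records through at random until the shared counter reaches limit."""
    for record in records:
        if counter.value >= limit:
            return  # assumed behaviour: stop once enough records have passed
        if rng.random() < pass_probability:
            counter.increment()
            yield record


def select_records(splits, limit, pass_probability=0.1, seed=0):
    """Run one simulated mapper per split, all sharing the same counter."""
    rng = random.Random(seed)
    counter = SharedCounter()
    selected = []
    for split in splits:
        selected.extend(mapper(split, counter, limit, pass_probability, rng))
    return selected


# Four input splits of 1000 records each; ask for 50 records in total.
splits = [list(range(start, start + 1000)) for start in range(0, 4000, 1000)]
sample = select_records(splits, limit=50)
print(len(sample))
```

Because the simulated mappers run one after another here, the earliest split supplies most of the records, so the result is capped at the requested size but is not a uniform sample of the whole input.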
Re: AW: How to split a big file in HDFS by size - MapReduce - [mail # user]
...Hi,  On Tue, Jun 21, 2011 at 16:14, Mapred Learn  wrote: kes FS.  Have a look at this:  http://stackoverflow.com/questions/3960651/splitting-gzipped-logfiles-without-sto...
   Author: Niels Basjes, 2011-06-21, 20:03
Re: How to merge several SequenceFile into one? - MapReduce - [mail # user]
...Hi,   The simplest way to do that is to create a job that - input format = sequence file - map = identity mapper - reduce = identity reduce - output = sequence file and  job.setNum...
   Author: Niels Basjes, 2011-05-25, 19:25
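The recipe in the snippet above (identity map, identity reduce, same input and output format) can be sketched as a small in-memory simulation. This is not Hadoop code: each "file" is just a list of key/value pairs, and the single-reducer setup is an assumption, since the snippet is truncated at `job.setNum...`. All function names are hypothetical.

```python
from itertools import groupby


def identity_map(key, value):
    """Emit the input pair unchanged."""
    yield key, value


def identity_reduce(key, values):
    """Emit every value for the key unchanged."""
    for value in values:
        yield key, value


def merge_files(input_files):
    """Merge several lists of (key, value) pairs into one output list."""
    mapped = [
        pair
        for records in input_files
        for key, value in records
        for pair in identity_map(key, value)
    ]
    # Simulated shuffle/sort: order by key (stable for equal keys).
    mapped.sort(key=lambda pair: pair[0])
    output = []
    # Assumption: one reducer sees all keys, so there is one output file.
    for key, group in groupby(mapped, key=lambda pair: pair[0]):
        output.extend(identity_reduce(key, (value for _, value in group)))
    return output


merged = merge_files([[("b", 2), ("a", 1)], [("c", 3), ("a", 4)]])
print(merged)
```

With a single reducer every record lands in one output, sorted by key, which is what makes the identity job behave like a merge.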
Including external libraries in my job. - MapReduce - [mail # user]
...Hi,  I've written my first very simple job that does something with hbase.  Now when I try to submit my jar in my cluster I get this:  [nbasjes@master ~/src/catalogloader/run]...
   Author: Niels Basjes, 2011-05-03, 13:42
Re: hadoop mr cluster mode on my laptop? - MapReduce - [mail # user]
...Hi,  You should be doing the setup for what is called "Pseudo-distributed" mode. Have a look at this: http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html#PseudoDistributed ...
   Author: Niels Basjes, 2011-04-18, 13:20
Re: Small linux distros to run hadoop ? - MapReduce - [mail # user]
...Hi,  2011/4/15 web service : want to  I usually use a fully stripped CentOS 5 to run cluster nodes. Works perfectly and can be fully automated using the kickstart scripting for ana...
   Author: Niels Basjes, 2011-04-15, 14:49
Re: When use hadoop mapreduce? - MapReduce - [mail # user]
...Hi,  2011/2/17 Pedro Costa :  The summary I usually give goes something like this: IF your computation takes too long on a single system AND you can split the work up into a lot of...
   Author: Niels Basjes, 2011-02-18, 21:48
Re: Is a Block compressed (GZIP) SequenceFile splittable in MR operation? - MapReduce - [mail # user]
...Hi,  2011/1/31 Sean Bigdatafun :  Correct, gzip is a stream compression system which effectively means you can only start at the beginning of the data with decompressing.   AF...
   Author: Niels Basjes, 2011-01-31, 08:36
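A small sketch of the point made in the snippet above: a gzip stream can be decoded from its first byte, but not from an arbitrary offset in the middle. It uses only Python's standard `gzip` and `zlib` modules; the helper name `can_decompress_from` and the sample payload are made up for illustration and have nothing to do with SequenceFile internals.

```python
import gzip
import zlib


def can_decompress_from(blob, offset):
    """Return True if the gzip data can be decoded starting at offset."""
    try:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        zlib.decompress(blob[offset:], wbits=31)
        return True
    except zlib.error:
        return False


# Build a compressible payload and gzip it as one stream.
payload = b"".join(b"record %d\n" % i for i in range(10000))
blob = gzip.compress(payload)

print(can_decompress_from(blob, 0))
print(can_decompress_from(blob, len(blob) // 2))
```

A reader that is handed only the second half of such a stream has no header and no decoder state to start from, which is why a plain gzip file ends up being processed as a single split.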
Re: FILE_BYTES_WRITTEN and HDFS_BYTES_WRITTEN - MapReduce - [mail # user]
...For some parts of a task the system stores information on the local (non-HDFS) file system of the node that is actually running the job. That is the FILE_.. Stuff written to HDFS is the HDFS...
   Author: Niels Basjes, 2010-11-30, 20:43
