Re: InputSplits, Serializers in Hadoop 0.20 - Hadoop - [mail # user]
...Fixed. InputSplits in 0.20 should implement Writable  On Mon, Aug 10, 2009 at 11:49 AM, Saptarshi Guha wrote: onFactory.getSerializer(SerializationFactory.java:73) Splits(JobClient.java...
   Author: Saptarshi Guha, 2009-08-10, 16:10
Re: LineReader, Buffering for FileInputFormat - Hadoop - [mail # user]
...Thank you. Is 64KB a good choice? From experience, there is a trade-off between large chunks and the time taken to read a chunk. I wonder if a larger value would be better.  On Sun, Aug 9, 2...
   Author: Saptarshi Guha, 2009-08-09, 23:43
Re: Running 145K maps, zero reduces- does Hadoop scale? - Hadoop - [mail # user]
...Simulation trials. Let N be the number of trials and T be the number of Map Tasks (==splits). Also assume there is much variation in the running time per trial. If there are ~ K=N/T (assume K is an int...
   Author: Saptarshi Guha, 2009-07-31, 14:17
Re: PCAP file format support - Hadoop - [mail # user]
...Quite true. In fact there is no such record as number of packets in a PCAP file. One has to get the filesize and divide by cumulative (plus some other things) bytes to find out what % one is...
   Author: Saptarshi Guha, 2009-07-31, 06:04
Re: Hadoop in a Heterogeneous Environment - taking advantage of different processor specs - Hadoop - [mail # user]
...Tsk tsk, silly of me. Of course I could do that.  Thanks for the confirmation Regards Saptarshi   On Tue, Jul 28, 2009 at 11:25 AM, Harish Mallipeddi  wrote:  ...
   Author: Saptarshi Guha, 2009-07-28, 15:46
Mapfileoutput format: reading in the results? - Hadoop - [mail # user]
...Hello, Not sure if I sent this to the right email address, so here it goes again.  I am using Hadoop 0.19.2 and am experimenting with the MapFileOutputFormat. The job is complete, th...
   Author: Saptarshi Guha, 2009-07-02, 22:46
Re: Pregel - Hadoop - [mail # user]
...Hello, I don't have a  background in CS, but does MS's Dryad ( http://research.microsoft.com/en-us/projects/Dryad/ ) fit in anywhere here? Regards Saptarshi   On Fri, Jun 26, 2009 ...
   Author: Saptarshi Guha, 2009-06-26, 20:36
Re: EC2, Hadoop, copy file from CLUSTER_MASTER to CLUSTER, failing - Hadoop - [mail # user]
...Hello, Thank you. This is quite useful.  Regards Saptarshi   On Wed, Jun 24, 2009 at 6:16 AM, Tom White wrote:...
   Author: Saptarshi Guha, 2009-06-24, 18:07
Re: When is configure and close run - Hadoop - [mail # user]
...Thank you! Just to confirm: consider a JVM (that is being reused) that has to reduce K1,{V11,V12,V13..} and K2,{V21,V22,V23,....}. Then the configure and close methods are called once each for b...
   Author: Saptarshi Guha, 2009-06-24, 17:00
Re: EC2, Max tasks, under utilized? - Hadoop - [mail # user]
...Hello, I should also point out that I'm using a SequenceFileInputFormat.  Regards Saptarshi Guha   On Tue, Jun 23, 2009 at 10:43 AM, Saptarshi Guha wrote:  ...
   Author: Saptarshi Guha, 2009-06-23, 14:50
