Re: how to write custom object using M/R
I assumed you were already doing this, but yes, Alain is correct: you
need to set the output format too.

I initialize writing to sequence files like so:

// Write the job output as a SequenceFile.
job.setOutputFormatClass(SequenceFileOutputFormat.class);
// Base name for the output files (instead of the default "part").
FileOutputFormat.setOutputName(job, dataSourceName);
FileOutputFormat.setOutputPath(job, hdfsJobOutputPath);
// Compress the output with the default codec, one compressed block per
// group of records rather than per record.
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);
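
Reading those files back in the second job is just the mirror image; a minimal
sketch, assuming a separate Job for the second step and the same
hdfsJobOutputPath as above:

// Second job: consume the compressed sequence files written by the first job.
job.setInputFormatClass(SequenceFileInputFormat.class);
SequenceFileInputFormat.addInputPath(job, hdfsJobOutputPath);
// The mapper then receives the same types that were written, e.g.
// Mapper<Text, CustomObject, ...>; the reader detects the compression codec
// from the file header, so no extra decompression setup is needed.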

DR
On 01/14/2011 01:27 PM, MONTMORY Alain wrote:
> Hi,
>
> I think you have to add:
>              job.setOutputFormatClass(SequenceFileOutputFormat.class);
> to make it work.
> Hope this helps.
>
> Alain
>
> From: Joan [mailto:[EMAIL PROTECTED]]
> Sent: Friday, January 14, 2011 13:58
> To: mapreduce-user
> Subject: how to write custom object using M/R
>
> Hi,
>
> I'm trying to write (K,V) pairs where K is a Text object and V is a CustomObject, but it doesn't work.
>
> I'm configuring the job's output so that it can be read later with SequenceFileInputFormat, so I have:
>
>          job.setMapOutputKeyClass(Text.class);
>          job.setMapOutputValueClass(CustomObject.class);
>          job.setOutputKeyClass(Text.class);
>          job.setOutputValueClass(CustomObject.class);
>
>          SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
>
> And I get the following output (this is the file part-r-00000):
>
> K  CustomObject@2b237512
> K  CustomObject@24db06de
> ...
>
> When this job finishes, I run another job whose input is a SequenceFileInputFormat, but it doesn't work:
>
> The second job's configuration is:
>
>          job.setInputFormatClass(SequenceFileInputFormat.class);
>          SequenceFileInputFormat.addInputPath(job, new Path("myPath"));
>
> But I get an error:
>
> java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
>          at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
>          at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
>
>
> Can someone help me? I don't understand it. I don't know how to save my object in the first M/R job or how to read it back in the second one.
>
> Thanks
>
> Joan
>
>
>
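
The CustomObject@2b237512 lines in part-r-00000 above are the value's default
toString() written by TextOutputFormat, which is what the first job falls back
to when no output format class is set; that is also why the second job's reader
complains that the file is not a SequenceFile. The value class itself also has
to implement Writable so the SequenceFile can serialize it. A minimal sketch of
what that could look like (the fields here are hypothetical, not from the
original post):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical minimal value class: SequenceFiles store values through
// write()/readFields(), not through toString().
public class CustomObject implements Writable {

    private int id;
    private String name;

    // Hadoop needs a no-argument constructor to recreate instances on read.
    public CustomObject() {}

    public CustomObject(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(name);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        name = in.readUTF();
    }
}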