MapReduce >> mail # user >> how to write custom object using M/R


Re: how to write custom object using M/R
Sounds to me like your custom object isn't serializing properly.

You might want to read up on how to do it correctly here:
http://developer.yahoo.com/hadoop/tutorial/module5.html#types
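The core contract from that tutorial is that write() and readFields() must be exact mirrors of each other: every field written, in the same order, must be read back. You can sanity-check that idea without a cluster. Here's a tiny, Hadoop-free sketch of the round trip (class names SimpleRecord/RoundTrip are made up for illustration; it uses only java.io, not the real Writable interface, but exercises the same DataOutput/DataInput contract):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Mimics the Writable read/write contract with plain java.io streams.
class SimpleRecord {
    String recordType;
    byte[] recordData;

    void write(DataOutput out) throws IOException {
        out.writeUTF(recordType);          // length-prefixed string
        out.writeInt(recordData.length);   // length prefix for the raw bytes
        out.write(recordData);
    }

    void readFields(DataInput in) throws IOException {
        // Read back in exactly the order written above.
        recordType = in.readUTF();
        recordData = new byte[in.readInt()];
        in.readFully(recordData);
    }
}

public class RoundTrip {
    public static void main(String[] args) throws IOException {
        SimpleRecord outRec = new SimpleRecord();
        outRec.recordType = "click";
        outRec.recordData = new byte[] {1, 2, 3};

        // Serialize to a byte buffer, much as Hadoop does between stages.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        outRec.write(new DataOutputStream(buf));

        // Deserialize into a fresh object and check the fields survived.
        SimpleRecord inRec = new SimpleRecord();
        inRec.readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(inRec.recordType + ":" + inRec.recordData.length);
    }
}
```

If the two methods ever get out of sync (wrong order, or a field written but not read), the deserialized object comes back corrupted, which is the usual symptom of a Writable bug.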

FYI - here's an example of a custom type I wrote, which I'm able to
read/write successfully to/from a sequence file:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class UserStateRecordWritable implements Writable {

    private Text recordType;
    private BytesWritable recordData;

    // A no-arg constructor is required: Hadoop instantiates the class
    // reflectively and then calls readFields() to populate it.
    public UserStateRecordWritable() {
        recordType = new Text();
        recordData = new BytesWritable();
    }

    // Deserialize the fields, in exactly the order write() emits them.
    public void readFields(DataInput in) throws IOException {
        recordType.readFields(in);
        recordData.readFields(in);
    }

    // Serialize the fields.
    public void write(DataOutput out) throws IOException {
        recordType.write(out);
        recordData.write(out);
    }

    public void set(Text newRecordType, BytesWritable newRecordData) {
        recordType.set(newRecordType);
        recordData.set(newRecordData);
    }

    public Text getRecordType() {
        return recordType;
    }

    public BytesWritable getRecordData() {
        return recordData;
    }

    public String copyRecordType() {
        return recordType.toString();
    }

    public byte[] copyRecordData() {
        // TraitWeightUtils is a project-local helper, not part of Hadoop.
        return TraitWeightUtils.getBytes(recordData);
    }
}
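One more thing worth checking (a guess on my part, since your snippet never sets it): the output you pasted looks like what TextOutputFormat writes. It calls toString() on the value, and the default Object.toString() is exactly "CustomObject@2b237512". TextOutputFormat is the default output format, so if the first job never declares SequenceFileOutputFormat, it writes a plain text file, and the second job's SequenceFile reader rejects it with the "not a SequenceFile" error you're seeing. Something like this in the first job's setup (a sketch against the new org.apache.hadoop.mapreduce API your snippets already use):

```java
// Declare the output format explicitly -- otherwise TextOutputFormat is used.
job.setOutputFormatClass(SequenceFileOutputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(CustomObject.class);
SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
```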
HTH,

DR

On 01/14/2011 07:57 AM, Joan wrote:
> Hi,
>
> I'm trying to write (K, V) pairs where K is a Text object and V is a
> CustomObject, but it doesn't work.
>
> I'm configuring the job's output as a SequenceFileOutputFormat, so my job has:
>
>          job.setMapOutputKeyClass(Text.class);
>          job.setMapOutputValueClass(CustomObject.class);
>          job.setOutputKeyClass(Text.class);
>          job.setOutputValueClass(CustomObject.class);
>
>          SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
>
> And I get the following output (in the file part-r-00000):
>
> K  CustomObject@2b237512
> K  CustomObject@24db06de
> ...
>
> When this job finishes I run another job whose input is a
> SequenceFileInputFormat, but it fails:
>
> The second job's configuration is:
>
>          job.setInputFormatClass(SequenceFileInputFormat.class);
>          SequenceFileInputFormat.addInputPath(job, new Path("myPath"));
>
> But I get an error:
>
> java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
>          at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
>          at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
>
>
> Can someone help me? I don't understand it. I don't know how to save my
> object in the first M/R job or how to read it back in the second one.
>
> Thanks
>
> Joan
>