-
how to write custom object using M/R
Joan 2011-01-14, 12:57
Hi,

I'm trying to write (K, V) pairs where K is a Text object and V is a CustomObject, but it doesn't work.

I want the job's output to be readable with SequenceFileInputFormat, so I configure the job with:

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(CustomObject.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(CustomObject.class);

    SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));

This is the output I get (in the file part-r-00000):

    K CustomObject@2b237512
    K CustomObject@24db06de
    ...

When this job finishes I run a second job whose input format is SequenceFileInputFormat, but it fails. The second job's configuration is:

    job.setInputFormatClass(SequenceFileInputFormat.class);
    SequenceFileInputFormat.addInputPath(job, new Path("myPath"));

But I get an error:

    java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)

Can someone help me? I don't understand it. I don't know how to save my object in the first M/R job or how to read it in the second.

Thanks,
Joan
-
RE: how to write custom object using M/R
MONTMORY Alain 2011-01-14, 18:27
Hi,

I think you have to add:

    job.setOutputFormatClass(SequenceFileOutputFormat.class);

to make it work. Hope this helps.

Alain
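For context on the error itself: a SequenceFile starts with the three ASCII bytes `SEQ` followed by a version byte, and the reader reports "not a SequenceFile" when the file does not begin that way. A part file whose contents look like `K CustomObject@2b237512` is plain text, so it would fail that check. The following is a minimal plain-Java sketch of the idea, not Hadoop's own code; `SeqHeaderCheck` and `looksLikeSequenceFile` are hypothetical names, and the version byte `6` in the sample is only a placeholder.

```java
import java.nio.charset.StandardCharsets;

final class SeqHeaderCheck {
    // A SequenceFile header begins with the ASCII bytes 'S', 'E', 'Q'.
    private static final byte[] MAGIC = {'S', 'E', 'Q'};

    /** Returns true if the given leading bytes of a file start with the SEQ magic. */
    static boolean looksLikeSequenceFile(byte[] firstBytes) {
        if (firstBytes.length < MAGIC.length) {
            return false; // too short to contain the magic bytes
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (firstBytes[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // What a text-format part file starts with (key, tab, value.toString()).
        byte[] textOutput = "K\tCustomObject@2b237512\n".getBytes(StandardCharsets.US_ASCII);
        // What a SequenceFile starts with; the 6 is a placeholder version byte.
        byte[] seqOutput = {'S', 'E', 'Q', 6};

        System.out.println(looksLikeSequenceFile(textOutput)); // false
        System.out.println(looksLikeSequenceFile(seqOutput));  // true
    }
}
```

Looking at the first few bytes of part-r-00000 is therefore a quick way to tell whether the first job really wrote a SequenceFile or fell back to text output.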
-
Re: how to write custom object using M/R
Joan 2011-01-17, 08:19
Hi Alain,

I added it, but it didn't work.

Joan
-
Re: how to write custom object using M/R
Harsh J 2011-01-17, 09:13
1. Your first job's OutputFormat must be set to SequenceFileOutputFormat.
2. Your "custom" object must implement the Writable interface properly (that is, the readFields() and write() methods must work as the framework and your requirements expect).

The fact that your output looks like "K CustomObject@2b237512" shows that the custom object isn't being serialized properly (toString() is probably being called, and the class doesn't override it?).

--
Harsh J
www.harshj.com
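As background to the point about toString(): Hadoop's TextOutputFormat, which is the default when no output format is set, writes each key and value by calling toString() on it. A class that doesn't override toString() inherits Java's default, which prints the class name, `@`, and a hex hash code, so the field values never reach the file. The sketch below is plain Java with no Hadoop classes; `ToStringDemo`, `PrintableObject`, and `textLine` are hypothetical names used only for illustration, and `textLine` merely imitates the key, tab, value layout of a text writer.

```java
final class ToStringDemo {
    /** Imitates a text-based record writer: key, a tab, then value, both via toString(). */
    static String textLine(Object key, Object value) {
        return key + "\t" + value;
    }

    public static void main(String[] args) {
        // Prints K, a tab, then something like CustomObject@1b6d3586 (the hash varies per run).
        System.out.println(textLine("K", new CustomObject()));
        // Prints K, a tab, then 7,hello
        System.out.println(textLine("K", new PrintableObject()));
    }
}

/** No toString() override, so Object.toString() is used. */
class CustomObject {
    int id = 7;
    String str = "hello";
}

/** Same fields, with a toString() that exposes them. */
class PrintableObject {
    int id = 7;
    String str = "hello";

    @Override
    public String toString() {
        return id + "," + str;
    }
}
```

Overriding toString() only makes text output readable. It does not turn the file into a SequenceFile, which still requires the output format to be set as described in point 1.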
-
Re: how to write custom object using M/R
Lance Norskog 2011-01-17, 09:51
Does your custom object implement Writable? Also, does it implement toString()? I think this output means the Writable code does not work:

    K CustomObject@2b237512
    K CustomObject@24db06de

This is the output of Java's default toString() method.

--
Lance Norskog
-
Re: how to write custom object using M/R
David Rosenstrauch 2011-01-18, 18:49
Sounds to me like your custom object isn't serializing properly.

You might want to read up on how to do it correctly here:

http://developer.yahoo.com/hadoop/tutorial/module5.html#types

FYI, here's an example of a custom type I wrote, which I'm able to read/write successfully to/from a sequence file:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;

    public class UserStateRecordWritable implements Writable {
        // The framework creates instances reflectively, so a no-arg constructor is required.
        public UserStateRecordWritable() {
            recordType = new Text();
            recordData = new BytesWritable();
        }

        // Fields must be read back in the same order that write() emits them.
        public void readFields(DataInput in) throws IOException {
            recordType.readFields(in);
            recordData.readFields(in);
        }

        public void write(DataOutput out) throws IOException {
            recordType.write(out);
            recordData.write(out);
        }

        public void set(Text newRecordType, BytesWritable newRecordData) {
            recordType.set(newRecordType);
            recordData.set(newRecordData);
        }

        public Text getRecordType() {
            return recordType;
        }

        public BytesWritable getRecordData() {
            return recordData;
        }

        public String copyRecordType() {
            return recordType.toString();
        }

        public byte[] copyRecordData() {
            // TraitWeightUtils is my own helper class (not shown here).
            return TraitWeightUtils.getBytes(recordData);
        }

        private Text recordType;
        private BytesWritable recordData;
    }

HTH,

DR
-
Re: how to write custom object using M/R
David Rosenstrauch 2011-01-18, 18:56
I assumed you were already doing this, but yes, Alain is correct: you need to set the output format too.

I initialize writing to sequence files like so:

    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileOutputFormat.setOutputName(job, dataSourceName);
    FileOutputFormat.setOutputPath(job, hdfsJobOutputPath);
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
    SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);

DR
-
Re: how to write custom object using M/R
Joan 2011-01-19, 14:36
Hi Lance,

My custom object implements Writable, but I don't override the toString() method:

    public class MyWritable implements DBWritable, Writable, Cloneable {

        int id;
        String str;

        @Override
        public void readFields(ResultSet rs) throws SQLException {
            id = rs.getInt(1);
            str = rs.getString(2);
        }

        @Override
        public void write(PreparedStatement pstmt) throws SQLException {
            // do nothing
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            id = in.readInt();
            str = Text.readString(in);
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            Text.writeString(out, str);
        }
    }

But I don't understand why the object isn't serialized.

Thanks,
Joan
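One way to check the write()/readFields() pair in isolation, without running a job, is to serialize an instance to a byte array and read it back. The sketch below does this in plain Java under stated assumptions: `SimpleWritable` is a local stand-in that mirrors the two methods of Hadoop's Writable, `MyRecord` is a simplified copy of the class above without the DBWritable part, and `writeUTF`/`readUTF` stand in for `Text.writeString`/`Text.readString`. All of these names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WritableRoundTrip {
    /** Local stand-in for Hadoop's Writable interface (same two methods). */
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Simplified version of the posted class: one int and one String. */
    static final class MyRecord implements SimpleWritable {
        int id;
        String str = "";

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(str); // stand-in for Text.writeString(out, str)
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Read fields in exactly the order write() emitted them.
            id = in.readInt();
            str = in.readUTF(); // stand-in for Text.readString(in)
        }
    }

    /** Serializes the record to bytes and deserializes it into a fresh instance. */
    static MyRecord roundTrip(MyRecord original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));

            MyRecord copy = new MyRecord();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MyRecord record = new MyRecord();
        record.id = 42;
        record.str = "abc";

        MyRecord copy = roundTrip(record);
        System.out.println(copy.id + " " + copy.str); // 42 abc
    }
}
```

If a round trip like this succeeds for the real class, the Writable methods themselves are probably not the cause, and the job's output format configuration is the next thing to check.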
-
Re: how to write custom object using M/R
Joan 2011-01-19, 14:42
-
Re: how to write custom object using M/R
Joan 2011-01-19, 14:50
Hi,

I tried it, but it didn't work, and I don't understand why. I only want the first reducer to write my object to HDFS and the second mapper to read that object from HDFS.

I'm trying to write the object with SequenceFileOutputFormat, and my class implements Writable, but it still doesn't work. I also call job.setOutputFormatClass(SequenceFileOutputFormat.class) and SequenceFileOutputFormat.setOutputPath(conf, outputDir). However, I'm not setting any of the output compression options.

Joan
-
Re: how to write custom object using M/R
David Rosenstrauch 2011-01-19, 20:04
Maybe change "id" to be an IntWritable, and "str" to be a Text?

HTH,

DR
>>>> >>>> I'm configuring output job like: SequenceFileInputFormat so I have job >>>> with: >>>> >>>> job.setMapOutputKeyClass(Text.class); >>>> job.setMapOutputValueClass(CustomObject.class); >>>> job.setOutputKeyClass(Text.class); >>>> job.setOutputValueClass(CustomObject.class); >>>> >>>> SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"); >>>> >>>> And I obtain the next output (this is a file: part-r-00000): >>>> >>>> K CustomObject@2b237512 >>>> K CustomObject@24db06de >>>> ... >>>> >>>> When this job finished I run other job which input is >>>> SequenceFileInputFormat but It doesn't run: >>>> >>>> The configuration's second job is: >>>> >>>> job.setInputFormatClass(SequenceFileInputFormat.class); >>>> SequenceFileInputFormat.addInputPath(job, new Path("myPath")); >>>> >>>> But I get an error: >>>> >>>> java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 >>>> not a SequenceFile >>>> at >>>> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523) >>>> at >>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483) >>>> at >>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451) >>>> at >>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432) >>>> at >>>> >> org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60) >>>> >>>> >>>> Can someone help me? Because I don't understand it. I don't know to save >>>> my object in first M/R and how to use it in second M/R >>>> >>>> Thanks >>>> >>>> Joan >>>> >>>> >>> >>> >> >> >> >> -- >> Lance Norskog >> [EMAIL PROTECTED] >> > |