-
Secondary Sort example error
Ravi Chandran 2013-02-07, 18:25
Hi,
I am trying to sort names using a secondary sort. I have a working example that I am using as a reference, but I am getting a NullPointerException in the MapTask class. I am not able to locate the reason, since the logic that creates the custom object from a given file has been tested through a plain Java class. I am getting this error:

13/02/07 12:23:42 WARN snappy.LoadSnappy: Snappy native library is available
13/02/07 12:23:42 INFO snappy.LoadSnappy: Snappy native library loaded
13/02/07 12:23:42 INFO mapred.FileInputFormat: Total input paths to process : 1
13/02/07 12:23:43 INFO mapred.JobClient: Running job: job_201301301056_0014
13/02/07 12:23:44 INFO mapred.JobClient: map 0% reduce 0%
13/02/07 12:23:56 INFO mapred.JobClient: Task Id : attempt_201301301056_0014_m_000000_0, Status : FAILED
java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:814)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
13/02/07 12:23:57 INFO mapred.JobClient: Task Id : attempt_201301301056_0014_m_000001_0, Status : FAILED
java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:814)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
I am giving the Mapper code below:
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.log4j.Logger;
import com.pom.Name;

public class StubMapper extends MapReduceBase implements Mapper<LongWritable, Text, Name, Text> {

    private static Logger logger = Logger.getLogger(StubMapper.class.getName());

    StringBuffer readLine = new StringBuffer();
    private final Name name = new Name();

    @Override
    public void map(LongWritable key, Text value,
            OutputCollector<Name, Text> output, Reporter reporter)
            throws IOException {
        String line = value.toString();
        String[] packer = null;

        packer = line.split(" ");

        // create the object
        if (packer.length > 2) {
            // take everything except last name
            for (int i = 0; i < packer.length - 1; i++) {
                readLine.append(packer[i] + " ");
            }

            name.setfName(readLine.toString());
            name.setlName(packer[packer.length - 1]);

            // clear the variable
            readLine.delete(0, readLine.length());
        } else if (packer.length > 0) {
            name.setfName(packer[0]);
            name.setlName(packer[1]);
        }

        output.collect(name, new Text(name.getlName()));
    }
}
I am not able to figure out the possible cause.
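For reference, the name-splitting logic from the mapper can be exercised on its own, outside Hadoop. The sketch below is a minimal standalone version under stated assumptions: the names NameSplitSketch and splitName are made up for illustration, it splits on runs of whitespace rather than on a single space, and it does not leave a trailing space on the first name the way the mapper's loop does. It also covers the single-token case, where the mapper's second branch would read packer[1] and throw an ArrayIndexOutOfBoundsException. That is a separate issue from the NullPointerException reported above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the posted job: it isolates the
// "everything except the last token is the first name" rule.
final class NameSplitSketch {

    /** Returns {firstName, lastName}; the last token is treated as the last name. */
    static String[] splitName(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length == 1) {
            // Single token (or empty line): there is no last name to read.
            return new String[] { tokens[0], "" };
        }
        // Join every token except the last one into the first name.
        String first = String.join(" ", Arrays.copyOfRange(tokens, 0, tokens.length - 1));
        return new String[] { first, tokens[tokens.length - 1] };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitName("Mary Ann Smith")));
        System.out.println(Arrays.toString(splitName("John Doe")));
        System.out.println(Arrays.toString(splitName("Cher")));
    }
}
```

Returning an empty last name for a one-word line is only one possible choice; skipping such lines in the mapper would work as well.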
-- Thanks & Regards Ravi
-
Re: Secondary Sort example error
Harsh J 2013-02-07, 18:34
Hey Ravi,
What version of Hadoop is this exactly? (Type and send output of "hadoop version" if unsure)
-- Harsh J
-
Re: Secondary Sort example error
Harsh J 2013-02-07, 18:46
Thanks, I managed to correlate proper line numbers.
Are you using some form of custom serialization in your job code? That is, are your keys of some non-Writable type? The specific NPE arises because the SerializationFactory cannot find a serializer for your map-output key class. You may want to look in that direction, or share your code so the list can spot the problem.
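To illustrate the failure mode, here is a simplified model in plain Java. It is not Hadoop's actual code: every type in it (SerializerLookupSketch, Serialization, WritableOnlySerialization, GoodKey, PlainKey) is made up for the example, and it only assumes the behavior described above, namely that the lookup yields nothing when no registered serialization accepts the key class and that the caller then dereferences the missing serializer.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a serializer lookup. None of these
// types are Hadoop classes; they only mimic the shape of the problem.
final class SerializerLookupSketch {

    interface Writable {}                        // stand-in marker interface
    interface Serializer { String describe(); }

    interface Serialization {
        boolean accept(Class<?> c);
        Serializer getSerializer(Class<?> c);
    }

    // Accepts only classes that implement the stand-in Writable marker.
    static final class WritableOnlySerialization implements Serialization {
        public boolean accept(Class<?> c) {
            return Writable.class.isAssignableFrom(c);
        }
        public Serializer getSerializer(Class<?> c) {
            return () -> "serializer for " + c.getSimpleName();
        }
    }

    // Returns null when no registered serialization accepts the class.
    static Serializer findSerializer(List<Serialization> registered, Class<?> keyClass) {
        for (Serialization s : registered) {
            if (s.accept(keyClass)) {
                return s.getSerializer(keyClass);
            }
        }
        return null;
    }

    static final class GoodKey implements Writable {}
    static final class PlainKey {}               // does not implement Writable

    // Mimics a buffer constructor that uses the serializer without a null check.
    static boolean initThrowsNpe(Class<?> keyClass) {
        List<Serialization> registered = new ArrayList<>();
        registered.add(new WritableOnlySerialization());
        Serializer serializer = findSerializer(registered, keyClass);
        try {
            serializer.describe();               // NPE here if nothing accepted the class
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("GoodKey fails:  " + initThrowsNpe(GoodKey.class));
        System.out.println("PlainKey fails: " + initThrowsNpe(PlainKey.class));
    }
}
```

If the real job behaves like this model, two things are worth checking: which class the job configuration actually declares as the map-output key class (for example via JobConf.setMapOutputKeyClass), and whether the io.serializations property has been overridden so that Writable types are no longer covered. Both are suggestions rather than a confirmed diagnosis.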
On Fri, Feb 8, 2013 at 12:11 AM, Ravi Chandran <[EMAIL PROTECTED]> wrote:
> hi,
>
> it is Hadoop 2.0.0-cdh4.1.1. the whole output is given below:
>
> Hadoop 2.0.0-cdh4.1.1
> Subversion file:///data/1/jenkins/workspace/generic-package-centos32-6/topdir/BUILD/hadoop-2.0.0-cdh4.1.1/src/hadoop-common-project/hadoop-common -r 581959ba23e4af85afd8db98b7687662fe9c5f20
-- Harsh J
-
Re: Secondary Sort example error
Ravi Chandran 2013-02-07, 19:00
Thanks for replying. I believe Writable and WritableComparable handle the serialization. Here is the required class:
public class Name implements WritableComparable<Name> {

    private String fName;
    private String lName;

    static { // register this comparator
        WritableComparator.define(Name.class, new NameSorterComparator());
    }

    public Name() {
    }

    public Name(String first, String last) {
        set(first, last);
    }

    public void set(String first, String last) {
        this.fName = first;
        this.lName = last;
    }

    public String getfName() {
        return fName;
    }

    public void setfName(String fName) {
        this.fName = fName;
    }

    public String getlName() {
        return lName;
    }

    public void setlName(String lName) {
        this.lName = lName;
    }

    public String toString() {
        return this.getfName() + " " + this.getlName();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // TODO Auto-generated method stub
        out.writeUTF(fName);
        out.writeUTF(lName);
    }

    public boolean equals(Name o) {
        Name other = o;
        if (this.fName.toString().equalsIgnoreCase(other.fName.toString())) {
            if (this.lName.toString().equalsIgnoreCase(other.lName.toString())) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // TODO Auto-generated method stub
        this.fName = in.readUTF();
        this.lName = in.readUTF();
    }

    @Override
    public int hashCode() {
        return fName.hashCode() * 514 + lName.hashCode();
    }

    @Override
    public int compareTo(Name tp) {
        int cmp = fName.compareTo(tp.fName);
        if (cmp != 0) {
            return cmp;
        }
        return lName.compareTo(tp.lName);
    }
}
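The write/readFields pair in the Name class can be checked without a cluster by round-tripping an object through an in-memory byte stream. The sketch below is a minimal version under stated assumptions: SimpleName and NameRoundTripSketch are stand-ins invented for the example, they do not implement Hadoop's WritableComparable, and the fields default to empty strings because DataOutputStream.writeUTF throws a NullPointerException when handed a null string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the Name key; it has no Hadoop dependency.
final class NameRoundTripSketch {

    static final class SimpleName implements Comparable<SimpleName> {
        String fName = "";   // non-null defaults keep writeUTF safe
        String lName = "";

        SimpleName() {}

        SimpleName(String first, String last) {
            this.fName = first;
            this.lName = last;
        }

        void write(DataOutput out) throws IOException {
            out.writeUTF(fName);
            out.writeUTF(lName);
        }

        void readFields(DataInput in) throws IOException {
            fName = in.readUTF();
            lName = in.readUTF();
        }

        // Same ordering as the posted class: first name, then last name.
        @Override
        public int compareTo(SimpleName other) {
            int cmp = fName.compareTo(other.fName);
            return cmp != 0 ? cmp : lName.compareTo(other.lName);
        }
    }

    /** Serializes the name to bytes and reads it back into a fresh object. */
    static SimpleName roundTrip(SimpleName original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            SimpleName copy = new SimpleName();
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleName copy = roundTrip(new SimpleName("Mary Ann", "Smith"));
        System.out.println(copy.fName + " | " + copy.lName);
    }
}
```

A successful round trip only shows that the two methods agree with each other. It says nothing about whether the job finds a serializer for the key class, which is the question raised earlier in the thread.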
Thanks & Regards Ravi
-
Re: Secondary Sort example error
Ravi Chandran 2013-02-08, 17:04
Hi,
I am still not able to find a fix for this issue. I have tried changing the attribute data types from String to Text, but I still get the same error. Is there anything else that I can try?
Thanks
Thanks & Regards Ravi