MapReduce >> mail # user >> NullPointerException when trying to write mapper output


NullPointerException when trying to write mapper output
I am using Hadoop 1.0.3 on Amazon EMR. I have a MapReduce job configured
like this:

private static final String TEMP_PATH_PREFIX = System.getProperty("java.io.tmpdir") + "/dmp_processor_tmp";
...
private Job setupProcessorJobS3() throws IOException, DataGrinderException {
    String inputFiles = System.getProperty(DGConfig.INPUT_FILES);
    Job processorJob = new Job(getConf(), PROCESSOR_JOBNAME);
    processorJob.setJarByClass(DgRunner.class);
    processorJob.setMapperClass(EntityMapperS3.class);
    processorJob.setReducerClass(SelectorReducer.class);
    processorJob.setOutputKeyClass(Text.class);
    processorJob.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(processorJob, new Path(TEMP_PATH_PREFIX));
    processorJob.setOutputFormatClass(TextOutputFormat.class);
    processorJob.setInputFormatClass(NLineInputFormat.class);
    FileInputFormat.setInputPaths(processorJob, inputFiles);
    NLineInputFormat.setNumLinesPerSplit(processorJob, 10000);
    return processorJob;
}

In my mapper class, I have:

private Text outkey = new Text();
private Text outvalue = new Text();
...
outkey.set(entity.getEntityId().toString());
outvalue.set(input.getId().toString());
printLog("context write");
context.write(outkey, outvalue);

This last line (`context.write(outkey, outvalue);`) throws the exception below.
Neither `outkey` nor `outvalue` is null.

    2013-10-24 05:48:48,422 INFO com.s1mbi0se.grinder.core.mapred.EntityMapperCassandra (main): Current Thread: Thread[main,5,main]Current timestamp: 1382593728422 context write
    2013-10-24 05:48:48,422 ERROR com.s1mbi0se.grinder.core.mapred.EntityMapperCassandra (main): Error on entitymapper for input: 03a07858-4196-46dd-8a2c-23dd824d6e6e
    java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1293)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1210)
    at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
    at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:264)
    at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:244)
    at org.apache.hadoop.io.Text.write(Text.java:281)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1077)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:698)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at com.s1mbi0se.grinder.core.mapred.EntityMapper.map(EntityMapper.java:78)
    at com.s1mbi0se.grinder.core.mapred.EntityMapperS3.map(EntityMapperS3.java:34)
    at com.s1mbi0se.grinder.core.mapred.EntityMapperS3.map(EntityMapperS3.java:14)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    2013-10-24 05:48:48,422 INFO com.s1mbi0se.grinder.core.mapred.EntityMapperS3 (main): Current Thread: Thread[main,5,main]Current timestamp: 1382593728422 Entity Mapper end
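
For reference, the top frame of the trace is `System.arraycopy`, which throws a NullPointerException when either of its array arguments is null. The plain-Java sketch below uses no Hadoop classes; `OutputBuffer`, `release` and `tryWrite` are hypothetical names made up for illustration. It only shows that a write can fail this way even when the data being written is non-null, if the destination buffer itself has become null. Whether that is what happens inside `MapOutputBuffer` in this job is an assumption; the trace alone does not confirm it.

```java
public class ArraycopyNullDemo {

    /** Hypothetical stand-in for an in-memory output buffer (not a Hadoop class). */
    public static final class OutputBuffer {
        private byte[] buffer;
        private int position = 0;

        public OutputBuffer(int capacity) {
            this.buffer = new byte[capacity];
        }

        /** Copies len bytes from src into the internal buffer. */
        public void write(byte[] src, int off, int len) {
            // Throws NullPointerException if src OR buffer is null.
            System.arraycopy(src, off, buffer, position, len);
            position += len;
        }

        /** Assumption for the demo: something drops the internal buffer. */
        public void release() {
            buffer = null;
        }
    }

    /** Attempts a write and reports the outcome as a string. */
    public static String tryWrite(OutputBuffer out, byte[] payload) {
        try {
            out.write(payload, 0, payload.length);
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        OutputBuffer out = new OutputBuffer(16);
        byte[] payload = {1, 2, 3};           // never null

        System.out.println(tryWrite(out, payload)); // prints "ok"
        out.release();
        System.out.println(tryWrite(out, payload)); // prints "NullPointerException"
    }
}
```

In the sketch the payload stays valid throughout; only the state of the buffer changes between the two calls, which matches the pattern of early records succeeding and later ones failing.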

The first records in each task are processed without problems. At some point
during the task, this exception starts occurring on every write, and from then
on the task does not process a single further record.

I tried setting `TEMP_PATH_PREFIX` to `"s3://mybucket/dmp_processor_tmp"`,
but the same thing happened.

Any idea why this is happening? What could be preventing Hadoop from
writing its output?