Avro user mailing list: Avro and Oozie Map Reduce action


M, Paul 2013-03-12, 22:35
Harsh J 2013-03-19, 05:26
Re: Avro and Oozie Map Reduce action
Thanks. That worked!

On Mar 18, 2013, at 10:26 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> The value you're specifying for io.serializations below is incorrect:
>
> <property>
> <name>io.serializations</name>
> <value>org.apache.avro.mapred.AvroSerialization,
> avro.serialization.key.reader.schema,
> avro.serialization.value.reader.schema,
> avro.serialization.key.writer.schema,avro.serialization.value.writer.schema
> </value>
> </property>
>
> If the goal is to include org.apache.avro.mapred.AvroSerialization,
> then it should look more like:
>
> <property>
>  <name>io.serializations</name>
>  <value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,org.apache.hadoop.io.serializer.avro.AvroReflectSerialization,org.apache.avro.mapred.AvroSerialization</value>
> </property>
>
> That is, it must be an extension of the default values, and not a
> replacement of them.
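
To see why replacing the defaults fails rather than merely misconfiguring Avro: Hadoop treats every comma-separated entry in io.serializations as the name of a class implementing Serialization, and its SerializationFactory asks each one in turn whether it accepts the type being (de)serialized. Property keys such as avro.serialization.key.reader.schema are not classes and can never match, and with WritableSerialization gone the factory has nothing that accepts the Writable input split, so the lookup returns null and MapTask.getSplitDetails throws the NullPointerException seen above. Here is a minimal stand-alone sketch of that lookup logic; the ACCEPTS map and the "kind" strings are illustrative stand-ins, not Hadoop's real API:

```java
import java.util.Map;

public class IoSerializationsSketch {
    // Stand-in registry: which kind of object each Serialization
    // implementation accepts (simplified for illustration).
    static final Map<String, String> ACCEPTS = Map.of(
            "org.apache.hadoop.io.serializer.WritableSerialization", "writable",
            "org.apache.avro.mapred.AvroSerialization", "avro");

    /** First listed Serialization that accepts the given kind, else null. */
    static String serializerFor(String ioSerializations, String kind) {
        for (String token : ioSerializations.split(",")) {
            String name = token.trim();
            // Entries that are property keys rather than class names
            // (e.g. avro.serialization.key.reader.schema) match nothing.
            if (kind.equals(ACCEPTS.get(name))) {
                return name;
            }
        }
        return null; // a null serializer is what later surfaces as the NPE
    }

    public static void main(String[] args) {
        // The broken value replaced the defaults, dropping WritableSerialization.
        String broken = "org.apache.avro.mapred.AvroSerialization,"
                + "avro.serialization.key.reader.schema";
        String fixed = "org.apache.hadoop.io.serializer.WritableSerialization,"
                + "org.apache.avro.mapred.AvroSerialization";
        // The input split that MapTask.getSplitDetails deserializes is a Writable:
        System.out.println(serializerFor(broken, "writable")); // prints null
        System.out.println(serializerFor(fixed, "writable"));
    }
}
```

With the extended value, the Writable lookup finds WritableSerialization while Avro keys and values still find AvroSerialization, which is why appending rather than replacing resolves the error.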
>
> On Wed, Mar 13, 2013 at 4:05 AM, M, Paul <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I am trying to run an M/R job with Avro serialization via Oozie. I've made
>> some progress in the workflow.xml; however, I am still running into the
>> following error. Any thoughts? I believe it may have to do with the
>> io.serializations property below. FYI, I am using CDH 4.2.0 mr1.
>>
>> 2013-03-12 15:24:32,334 INFO org.apache.hadoop.mapred.TaskInProgress: Error
>> from attempt_201303111118_0080_m_000000_3: java.lang.NullPointerException
>> at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:389)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1407)
>> at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>
>>
>> <action name="mr-node">
>> <map-reduce>
>> <job-tracker>${jobTracker}</job-tracker>
>> <name-node>${nameNode}</name-node>
>> <prepare>
>> <delete path="${nameNode}/user/${wf:user()}/${outputDir}" />
>> </prepare>
>> <configuration>
>> <property>
>> <name>mapred.job.queue.name</name>
>> <value>${queueName}</value>
>> </property>
>>
>> <property>
>> <name>mapreduce.reduce.class</name>
>> <value>org.apache.avro.mapred.HadoopReducer</value>
>> </property>
>> <property>
>> <name>mapreduce.map.class</name>
>> <value>org.apache.avro.mapred.HadoopMapper</value>
>> </property>
>>
>>
>> <property>
>> <name>avro.reducer</name>
>> <value>org.my.project.mapreduce.CombineAvroRecordsByHourReducer</value>
>> </property>
>>
>> <property>
>> <name>avro.mapper</name>
>> <value>org.my.project.mapreduce.ParseMetadataAsTextIntoAvroMapper</value>
>> </property>
>>
>>
>> <property>
>> <name>mapreduce.inputformat.class</name>
>> <value>org.my.project.mapreduce.NonSplitableInputFormat</value>
>> </property>
>>
>> <!-- Key Value Mapper -->
>> <property>
>> <name>avro.output.schema</name>
>> <value>{"type":"record","name":"Pair","namespace":"org.apache.avro.mapred","fields":..."}]}</value>
>> </property>
>> <property>
>> <name>mapred.mapoutput.key.class</name>
>> <value>org.apache.avro.mapred.AvroKey</value>
>> </property>
>> <property>
>> <name>mapred.mapoutput.value.class</name>
>> <value>org.apache.avro.mapred.AvroValue</value>
>> </property>
>>
>>
>> <property>
>> <name>avro.schema.output.key</name>
>> <value>{"type":"record","name":"DataRecord","namespace":...]}]}</value>
>> </property>
>>
>> <property>
>> <name>mapreduce.outputformat.class</name>
>> <value>org.apache.hadoop.mapreduce.lib.output.TextOutputFormat</value>
>> </property>
>>
>> <property>
>> <name>mapred.output.key.comparator.class</name>
>> <value>org.apache.avro.mapred.AvroKeyComparator</value>
>> </property>
>>
>> <property>
>> <name>io.serializations</name>
>> <value>org.apache.avro.mapred.AvroSerialization,
>> avro.serialization.key.reader.schema,
>> avro.serialization.value.reader.schema,
>> avro.serialization.key.writer.schema,avro.serialization.value.writer.schema
>> </value>
>> </property>