Avro >> mail # user >> AvroMultipleOutput ignores schemas (other than default)


AvroMultipleOutput ignores schemas (other than default)
Hi!

Maybe I'm the only one who has ever used this :D.

Adding named outputs with AvroMultipleOutputs.addNamedOutput just puts them
into a static map, which of course is not available on the cluster during
reduce execution. The unit tests pass anyway, because in a local test the
Reducer sees the same AvroMultipleOutputs static state as the Job's main
class, so the schemas added there are still present.
The fix would be to add the named output schemas to the job configuration so
they can be parsed in the reducers. An example patch for the new mapreduce API
is here:

https://gist.github.com/4677875
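
To illustrate the idea (this is not the code from the gist, just a minimal sketch under a few assumptions): the schema is written into the configuration as a string on the driver side and read back on the task side. Here java.util.Properties stands in for the Hadoop job Configuration, the schema is kept as a plain JSON string, and the class name, key prefix and ship() helper are made up for the example. In real code you would probably use Schema.toString() with Configuration.set on the driver and new Schema.Parser().parse(...) with Configuration.get in the reducer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch only: Properties stands in for the Hadoop job Configuration,
// and schemas are handled as plain JSON strings instead of Avro Schema objects.
class NamedOutputSchemas {
    // Hypothetical key prefix, chosen for this example only.
    static final String KEY_PREFIX = "example.avro.namedoutput.";

    // Driver side: store the schema in the configuration, not in a static map.
    static void addNamedOutput(Properties conf, String name, String schemaJson) {
        conf.setProperty(KEY_PREFIX + name + ".schema", schemaJson);
    }

    // Task side: read the schema back from the configuration the task received.
    static String getNamedOutputSchema(Properties conf, String name) {
        String json = conf.getProperty(KEY_PREFIX + name + ".schema");
        if (json == null) {
            throw new IllegalArgumentException("Undefined named output: " + name);
        }
        return json;
    }

    // Simulates shipping the configuration to another JVM by serializing
    // it to text and loading it into a fresh Properties object.
    static Properties ship(Properties conf) {
        try {
            StringWriter out = new StringWriter();
            conf.store(out, null);
            Properties received = new Properties();
            received.load(new StringReader(out.toString()));
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties driverConf = new Properties();
        addNamedOutput(driverConf, "errors", "{\"type\":\"string\"}");

        // The "reducer" only ever sees the shipped copy.
        Properties taskConf = ship(driverConf);
        System.out.println(getNamedOutputSchema(taskConf, "errors"));
    }
}
```

Anything kept only in a static field would be lost in the ship() step, which is roughly what happens between the job's main class and the reducers on a real cluster.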

Have a nice evening,

Johannes