Avro, mail # user - AvroMultipleOutput ignores schemas (other than default)


AvroMultipleOutput ignores schemas (other than default)
Johannes Schulte 2013-01-30, 22:32
Hi!

Maybe I'm the only one who has ever used this :D.

Named outputs added with AvroMultipleOutputs.addNamedOutput are only stored
in a static map, which of course is not available on the cluster during
reduce execution. The unit tests pass anyway, since the instance of
AvroMultipleOutputs is the same in the Reducer as in the Job's main class,
so the schemas added there are present.
The fix would be to add the namedOutput schemas to the job configuration so
they can be parsed in the reducers. An example patch for the new mapreduce
API is here:

https://gist.github.com/4677875
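
For readers without access to the gist, here is a minimal sketch of the idea, not the patch itself. It assumes the schema is stored as its JSON string under a per-output configuration key. java.util.Properties stands in for Hadoop's Configuration so the example runs without Hadoop on the classpath; in a real job the calls would be Configuration.set(String, String) and Configuration.get(String), with Schema.toString() and new Schema.Parser().parse(String) on the Avro side. The class name and the key prefix are hypothetical.

```java
import java.util.Properties;

public class NamedOutputSchemas {
    // Hypothetical key prefix; not taken from the Avro source or the patch.
    private static final String SCHEMA_KEY_PREFIX = "example.avro.namedoutput.schema.";

    /** Driver side: store the schema JSON in the job configuration, not in a static map. */
    public static void addNamedOutput(Properties conf, String namedOutput, String schemaJson) {
        conf.setProperty(SCHEMA_KEY_PREFIX + namedOutput, schemaJson);
    }

    /** Task side: read the schema JSON back; returns null if the output was never registered. */
    public static String getNamedOutputSchema(Properties conf, String namedOutput) {
        return conf.getProperty(SCHEMA_KEY_PREFIX + namedOutput);
    }

    public static void main(String[] args) {
        Properties driverConf = new Properties();
        addNamedOutput(driverConf, "errors", "{\"type\":\"string\"}");

        // Simulate shipping the configuration to a separate task JVM:
        // only the configuration entries are copied, no static state is shared.
        Properties taskConf = new Properties();
        taskConf.putAll(driverConf);

        System.out.println(getNamedOutputSchema(taskConf, "errors"));
    }
}
```

Because the schema travels inside the configuration, the reducer no longer depends on anything that was set up in the driver's JVM.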

Have a nice evening,

Johannes