Re: Union of Records Issue
Doug Cutting 2012-01-10, 21:41
Hans,
It's hard for me to guess what's going on, whether you've found a bug in Avro or whether you have a bug in your code. Can you post a complete, runnable example of the problem, perhaps as a bug report in Jira?

Thanks,

Doug

On 01/10/2012 11:21 AM, Uhlig, Hans wrote:
> I am creating a dynamic union of records as seen below but keep
> receiving an exception org.apache.avro.UnresolvedUnionException: Not in
> union
>
> Any reason why it deems the same schemas that created the union invalid
> for collection? Avro throws this with each record it tries to collect.
> An example of this working would be appreciated.
>
> Also, is there such a thing as a null record? The records I am assembling
> fit into a set instead of a Map, but I could find no elegant way outside
> of defining a record with a single field of null.
>
> Inside ToolRunner:
> Schema.Parser p = new Schema.Parser();
>
> ArrayList<Schema> keySchemas = new ArrayList<Schema>();
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s1.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s2.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s3.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s4.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s5.avsc")));
>
> Schema keySchema = Schema.createUnion(keySchemas);
> Schema valSchema =
> p.parse(AvroConverter.class.getResourceAsStream("null.avsc"));
> AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(keySchema, valSchema));
>
> Inside Mapper Setup:
> private static HashMap<String, Schema> keySchemas = new HashMap<String,
> Schema>();
> private static Schema valSchema;
> Schema.Parser p = new Schema.Parser();
> keySchemas.put("s1", p.parse(Map.class.getResourceAsStream("s1.avsc")));
> keySchemas.put("s2", p.parse(Map.class.getResourceAsStream("s2.avsc")));
> keySchemas.put("s3", p.parse(Map.class.getResourceAsStream("s3.avsc")));
> keySchemas.put("s4", p.parse(Map.class.getResourceAsStream("s4.avsc")));
> keySchemas.put("s5", p.parse(Map.class.getResourceAsStream("s5.avsc")));
> valSchema = p.parse(Map.class.getResourceAsStream("null.avsc"));
>
> Inside Map function:
> GenericData.Record r;
> if(in.type=="s1") {
>   r = new GenericData.Record(keySchemas.get("s1");
> } else if(in.type=="s1") {
>   r = new GenericData.Record(keySchemas.get("s2");
> }
> oc.collect(new AvroKey<GenericRecord>(r), new
> AvroValue<GenericRecord>(new GenericData.Record(valSchema)));
>
> Avro throws a Union Exception every time I pass in a record. Any reason
> why it deems the same schemas that created the union invalid for
> collection?
>
> org.apache.avro.UnresolvedUnionException: Not in union