Avro >> mail # user >> Union of Records Issue


Re: Union of Records Issue
Can it be related to https://issues.apache.org/jira/browse/AVRO-966 ?
Does the patch help?
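
To narrow this down outside the MapReduce job, a minimal sketch like the following asks Avro directly which union branch a record resolves to. The inline schemas S1 and S2 and the helper branchOf are hypothetical stand-ins (the real s1.avsc and s2.avsc are not shown in this thread), and the exact behaviour may differ between Avro versions.

```java
import java.util.Arrays;
import org.apache.avro.Schema;
import org.apache.avro.UnresolvedUnionException;
import org.apache.avro.generic.GenericData;

public class UnionCheck {
    // Hypothetical record schemas standing in for s1.avsc and s2.avsc.
    static final String S1 =
        "{\"type\":\"record\",\"name\":\"S1\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}";
    static final String S2 =
        "{\"type\":\"record\",\"name\":\"S2\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}";

    /**
     * Returns the index of the union branch the datum resolves to,
     * or -1 if Avro reports that the datum is not in the union.
     */
    static int branchOf(Schema union, Object datum) {
        try {
            return GenericData.get().resolveUnion(union, datum);
        } catch (UnresolvedUnionException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Schema.Parser p = new Schema.Parser();
        Schema s1 = p.parse(S1);
        Schema s2 = p.parse(S2);
        Schema union = Schema.createUnion(Arrays.asList(s1, s2));

        // Build a record for the second branch and ask which branch it matches.
        GenericData.Record r = new GenericData.Record(s2);
        r.put("id", "x");
        System.out.println("branch index: " + branchOf(union, r));

        // A value of a type that is not part of the union yields -1 here.
        System.out.println("branch index: " + branchOf(union, 42));
    }
}
```

If the record resolves here but not inside the job, the difference is more likely in how the job's schemas are configured than in the union itself.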

Vyacheslav

On Jan 10, 2012, at 10:21 PM, Uhlig, Hans wrote:

> I am creating a dynamic union of records, as shown below, but keep receiving the exception org.apache.avro.UnresolvedUnionException: Not in union.
> Is there any reason why Avro rejects records built from the same schemas that were used to create the union? It throws this for every record I try to collect. A working example would be appreciated.
>
> Also, is there such a thing as a null record? The records I am assembling fit a set rather than a map, but I could find no elegant way to express that other than defining a record with a single null field.
>
> Inside ToolRunner:
> Schema.Parser p = new Schema.Parser();
>        
> ArrayList<Schema> keySchemas = new ArrayList<Schema>();
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s1.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s2.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s3.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s4.avsc")));
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s5.avsc")));
>        
> Schema keySchema = Schema.createUnion(keySchemas);
> Schema valSchema = p.parse(AvroConverter.class.getResourceAsStream("null.avsc"));
> AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(keySchema, valSchema));
>
> Inside Mapper Setup:
> private static HashMap<String, Schema> keySchemas = new HashMap<String, Schema>();
> private static Schema valSchema;
> Schema.Parser p = new Schema.Parser();
> keySchemas.put("s1", p.parse(Map.class.getResourceAsStream("s1.avsc")));
> keySchemas.put("s2", p.parse(Map.class.getResourceAsStream("s2.avsc")));
> keySchemas.put("s3", p.parse(Map.class.getResourceAsStream("s3.avsc")));
> keySchemas.put("s4", p.parse(Map.class.getResourceAsStream("s4.avsc")));
> keySchemas.put("s5", p.parse(Map.class.getResourceAsStream("s5.avsc")));
> valSchema = p.parse(Map.class.getResourceAsStream("null.avsc"));
>
> Inside Map function:
> GenericData.Record r;
> // Compare string contents with equals(); == only compares object references.
> if ("s1".equals(in.type)) {
>     r = new GenericData.Record(keySchemas.get("s1"));
> } else if ("s2".equals(in.type)) {
>     r = new GenericData.Record(keySchemas.get("s2"));
> }
> oc.collect(new AvroKey<GenericRecord>(r), new AvroValue<GenericRecord>(new GenericData.Record(valSchema)));
>
> Avro throws a union exception every time I pass in a record. Is there any reason why it treats the same schemas that created the union as invalid for collection?
>
> org.apache.avro.UnresolvedUnionException: Not in union

Best,
Vyacheslav
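
The quoted map function picks a schema by comparing in.type with ==, which in Java compares object references rather than string contents. The following is a minimal sketch of a value-based lookup, assuming the type tag is a String built at run time; the class TypeDispatch, the table KEY_SCHEMAS and the helper schemaFor are hypothetical, and plain strings stand in for Schema objects so the sketch runs without Avro.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeDispatch {
    // Hypothetical stand-in for the schema table built in the mapper setup.
    static final Map<String, String> KEY_SCHEMAS = new HashMap<>();
    static {
        KEY_SCHEMAS.put("s1", "schema-for-s1");
        KEY_SCHEMAS.put("s2", "schema-for-s2");
    }

    /** Looks up the schema for a type tag by value and fails loudly on an unknown tag. */
    static String schemaFor(String type) {
        String schema = KEY_SCHEMAS.get(type);
        if (schema == null) {
            throw new IllegalArgumentException("No schema registered for type: " + type);
        }
        return schema;
    }

    public static void main(String[] args) {
        // A tag created at run time is a different object from the literal "s1".
        String tag = new String("s1");
        System.out.println(tag == "s1");      // false: reference comparison
        System.out.println("s1".equals(tag)); // true: content comparison
        System.out.println(schemaFor(tag));   // schema-for-s1
    }
}
```

Looking the schema up in the map also removes the if/else chain, so an unknown type fails with a clear message instead of leaving the record unassigned.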