Avro, mail # user - Deserialize the attributes data using another schema give me wrong results


Deserialize the attributes data using another schema give me wrong results
Raihan Jamal 2013-09-26, 00:10
I am trying to serialize one of our attributes' data using an Apache Avro
schema. The attribute name is `e7`, and the schema I am using to
serialize it is `schema2.avsc`, shown below.

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "dob", "type": "string"},
            {"name": "gndr", "type": "string"},
            {"name": "occupation", "type": "string"},
            {"name": "mrtlStatus", "type": "string"},
            {"name": "numChldrn", "type": "int"},
            {"name": "estInc", "type": "string"},
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

Below is the code I am using to serialize the attribute `e7` with the
above Avro schema `schema2.avsc`. Serialization works fine.

    Schema schema = new Parser().parse((AvroExperiment.class.getResourceAsStream("/schema2.avsc")));
    GenericRecord record = new GenericData.Record(schema);
    record.put("dob", "161913600000");
    record.put("gndr", "f");
    record.put("occupation", "doctor");
    record.put("mrtlStatus", "single");
    record.put("numChldrn", 3);
    record.put("estInc", "50000");
    record.put("schemaId", 20001);
    record.put("lmd", 1379814280254L);

    GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
    ByteArrayOutputStream os = new ByteArrayOutputStream();

    Encoder e = EncoderFactory.get().binaryEncoder(os, null);

    writer.write(record, e);
    e.flush();
    byte[] byteData = os.toByteArray();
    os.close();

Next, I tried deserializing the same `e7` attribute data using the same
Avro schema, `schema2.avsc`. That also works fine, and I am able to
deserialize it properly.

    GenericDatumReader<GenericRecord> r = new GenericDatumReader<GenericRecord>(schema);
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(byteData, null);
    GenericRecord result = r.read(null, decoder);

    System.out.println(result);
    System.out.println(result.get("schemaId"));
    System.out.println(result.get("lmd"));

Then I thought: let's deserialize the same attribute data using another Avro
schema that I have, `schema1.avsc`, and extract only `schemaId` and `lmd`
from it. Below is that schema:

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

    /**
     * Deserialize the same byte data using another Avro Schema
     */
    Schema schema1 = new Parser().parse((AvroExperiment.class.getResourceAsStream("/schema1.avsc")));

    GenericDatumReader<GenericRecord> r1 = new GenericDatumReader<GenericRecord>(schema1);
    BinaryDecoder decoder1 = DecoderFactory.get().binaryDecoder(byteData, null);
    GenericRecord result1 = r1.read(null, decoder1);

    System.out.println(result1);
    System.out.println(result1.get("schemaId"));
    System.out.println(result1.get("lmd"));

But somehow the above code prints the following, which is wrong. I am not
sure what I did wrong.

    {"schemaId": 12, "lmd": -25}
    12
    -25

It should print this instead:

    {"schemaId": 20001, "lmd": 1379814280254L}
    20001
    1379814280254L

Can anyone help me figure out what I did wrong?
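
One possible explanation, offered as a sketch rather than a confirmed diagnosis: Avro's binary encoding writes field values in schema order with no field names or tags, so a reader given only `schema1.avsc` would interpret the first bytes of the record (the `dob` string) as `schemaId` and `lmd`. The plain-Java example below uses no Avro classes. `ZigZagDemo`, `readVarLong`, and `firstFieldBytes` are hypothetical names, and the byte layout is hand-built from the Avro specification's description of strings (a zigzag varint length followed by the UTF-8 bytes) rather than captured from the program above.

```java
import java.nio.charset.StandardCharsets;

class ZigZagDemo {
    // Decode one zigzag-encoded variable-length integer starting at pos[0];
    // pos[0] is advanced past the bytes that were consumed.
    static long readVarLong(byte[] buf, int[] pos) {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = buf[pos[0]++] & 0xFF;
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: last byte of this varint
            }
            shift += 7;
        }
        // Undo the zigzag mapping (0, -1, 1, -2, ... stored as 0, 1, 2, 3, ...).
        return (raw >>> 1) ^ -(raw & 1);
    }

    // Assumed layout of the first field written with schema2.avsc:
    // the "dob" string as a zigzag length byte followed by its UTF-8 bytes.
    static byte[] firstFieldBytes() {
        byte[] text = "161913600000".getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[text.length + 1];
        out[0] = (byte) (text.length << 1); // zigzag of 12 is 24, which fits in one byte
        System.arraycopy(text, 0, out, 1, text.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] data = firstFieldBytes();
        int[] pos = {0};
        long schemaId = readVarLong(data, pos); // actually the length of "dob"
        long lmd = readVarLong(data, pos);      // actually the character '1' (0x31)
        System.out.println(schemaId + " " + lmd); // prints "12 -25"
    }
}
```

Decoding those bytes as two integers yields 12 (the length of `"161913600000"`) and -25 (the byte 0x31 for `'1'`, zigzag-decoded), which matches the unexpected output. If that is the cause, giving the reader both schemas, for example `new GenericDatumReader<GenericRecord>(schema, schema1)` with the writer schema first and the reader schema second, should let Avro resolve `schemaId` and `lmd` by name.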