Avro >> mail # user >> Deserialize the attributes data using another schema give me wrong results


Deserialize the attributes data using another schema give me wrong results
I am trying to serialize the data for one of our attributes using an Apache
Avro schema. The attribute name is `e7`, and the schema I use to serialize
it, `schema2.avsc`, is below.

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "dob", "type": "string"},
            {"name": "gndr", "type": "string"},
            {"name": "occupation", "type": "string"},
            {"name": "mrtlStatus", "type": "string"},
            {"name": "numChldrn", "type": "int"},
            {"name": "estInc", "type": "string"},
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

Below is the code I use to serialize the attribute `e7` with the Avro schema
`schema2.avsc` above. Serialization works fine.

    // Load the writer schema and populate a record for attribute e7
    Schema schema = new Parser().parse(AvroExperiment.class.getResourceAsStream("/schema2.avsc"));
    GenericRecord record = new GenericData.Record(schema);
    record.put("dob", "161913600000");
    record.put("gndr", "f");
    record.put("occupation", "doctor");
    record.put("mrtlStatus", "single");
    record.put("numChldrn", 3);
    record.put("estInc", "50000");
    record.put("schemaId", 20001);
    record.put("lmd", 1379814280254L);

    // Binary-encode the record into a byte array
    GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    Encoder e = EncoderFactory.get().binaryEncoder(os, null);

    writer.write(record, e);
    e.flush();
    byte[] byteData = os.toByteArray();
    os.close();

Next, I deserialized the same `e7` data with the same schema, `schema2.avsc`.
That also works fine:

    GenericDatumReader<GenericRecord> r = new GenericDatumReader<GenericRecord>(schema);
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(byteData, null);
    GenericRecord result = r.read(null, decoder);

    System.out.println(result);
    System.out.println(result.get("schemaId"));
    System.out.println(result.get("lmd"));

Then I tried to deserialize the same data with another Avro schema,
`schema1.avsc`, to extract only `schemaId` and `lmd`. Here is that schema:

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

And here is the code:

    /**
     * Deserialize the same byte data using another Avro Schema
     */
    Schema schema1 = new Parser().parse(AvroExperiment.class.getResourceAsStream("/schema1.avsc"));

    GenericDatumReader<GenericRecord> r1 = new GenericDatumReader<GenericRecord>(schema1);
    BinaryDecoder decoder1 = DecoderFactory.get().binaryDecoder(byteData, null);
    GenericRecord result1 = r1.read(null, decoder1);

    System.out.println(result1);
    System.out.println(result1.get("schemaId"));
    System.out.println(result1.get("lmd"));

But the code above prints the following, which is wrong. I am not sure what I
did wrong:

    {"schemaId": 12, "lmd": -25}
    12
    -25
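
A likely explanation for the `12` and `-25`: Avro's binary encoding writes only field values, in the writer schema's field order, with no field names or tags. Strings are written as a length followed by the UTF-8 bytes, and ints and longs as zigzag variable-length integers. A reader given only `schema1.avsc` therefore treats the first bytes of `dob` as `schemaId` and `lmd`. The sketch below uses no Avro classes. `ZigZagPeek`, `readLong` and `misread` are hypothetical names, and the five bytes are what the start of `byteData` should look like under that encoding, assuming the record shown earlier.

```java
public class ZigZagPeek {

    /** Decodes one zigzag varint starting at pos[0] and advances pos[0] past it. */
    public static long readLong(byte[] buf, int[] pos) {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            raw |= (long) (b & 0x7F) << shift;   // low 7 bits carry data
            shift += 7;
        } while ((b & 0x80) != 0);               // high bit set means "more bytes follow"
        return (raw >>> 1) ^ -(raw & 1);         // undo zigzag
    }

    /** Reads the start of the record as if it were an int followed by a long. */
    public static long[] misread() {
        // Assumed start of byteData: length of "161913600000" (12, zigzag-encoded
        // as 0x18), then the first characters of that string.
        byte[] start = {0x18, '1', '6', '1', '9'};
        int[] pos = {0};
        long asSchemaId = readLong(start, pos);  // really the string length
        long asLmd = readLong(start, pos);       // really the character '1' (0x31)
        return new long[] {asSchemaId, asLmd};
    }

    public static void main(String[] args) {
        long[] v = misread();
        System.out.println(v[0]);  // prints 12
        System.out.println(v[1]);  // prints -25
    }
}
```

The two printed values match the wrong output above, which suggests the decoder is positioned at `dob` rather than at `schemaId`.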
It should print this instead:

    {"schemaId": 20001, "lmd": 1379814280254}
    20001
    1379814280254

Can anyone tell me what I did wrong?
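
To pull out `schemaId` and `lmd`, the reader has to know the layout the data was written with, so that it can skip the six fields that come first. In Avro's Java API this is normally done by giving the reader both schemas, for example with the `GenericDatumReader(Schema writer, Schema reader)` constructor, passing `schema` as the writer schema and `schema1` as the reader schema. The sketch below does not use Avro. It imitates that skipping by hand with hypothetical helpers (`WriterSchemaSkipSketch`, `sampleRecord`, `readSchemaIdAndLmd`) and a hand-rolled encoder, under the assumption that the record is laid out as the Avro binary encoding describes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class WriterSchemaSkipSketch {
    // Writer schema (schema2.avsc) field order: 's' = string, 'i' = int, 'l' = long.
    static final String[] NAMES =
        {"dob", "gndr", "occupation", "mrtlStatus", "numChldrn", "estInc", "schemaId", "lmd"};
    static final char[] TYPES = {'s', 's', 's', 's', 'i', 's', 'i', 'l'};

    /** Writes n as a zigzag varint (used here for ints, longs and string lengths). */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    /** Writes a string as its UTF-8 length followed by the UTF-8 bytes. */
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Decodes one zigzag varint at pos[0] and advances pos[0]. */
    static long readLong(byte[] buf, int[] pos) {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            raw |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Hand-encodes the same values as the record in the question, in writer order. */
    public static byte[] sampleRecord() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "161913600000");  // dob
        writeString(out, "f");             // gndr
        writeString(out, "doctor");        // occupation
        writeString(out, "single");        // mrtlStatus
        writeLong(out, 3);                 // numChldrn
        writeString(out, "50000");         // estInc
        writeLong(out, 20001);             // schemaId
        writeLong(out, 1379814280254L);    // lmd
        return out.toByteArray();
    }

    /** Walks the bytes in writer order, skipping everything except schemaId and lmd. */
    public static long[] readSchemaIdAndLmd(byte[] buf) {
        int[] pos = {0};
        long schemaId = 0;
        long lmd = 0;
        for (int i = 0; i < TYPES.length; i++) {
            if (TYPES[i] == 's') {
                long len = readLong(buf, pos);
                pos[0] += (int) len;       // skip the string body
            } else {
                long v = readLong(buf, pos);
                if (NAMES[i].equals("schemaId")) schemaId = v;
                if (NAMES[i].equals("lmd")) lmd = v;
            }
        }
        return new long[] {schemaId, lmd};
    }

    public static void main(String[] args) {
        long[] v = readSchemaIdAndLmd(sampleRecord());
        System.out.println(v[0]);  // prints 20001
        System.out.println(v[1]);  // prints 1379814280254
    }
}
```

With Avro itself, the change to the code in the question would be limited to how `r1` is constructed. The rest of the read path should stay the same.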
Eric Wasserman 2013-09-26, 00:30
Raihan Jamal 2013-09-26, 00:42
Raihan Jamal 2013-09-26, 07:33