Avro >> mail # user >> Deserialize the attributes data using another schema give me wrong results


RE: Deserialize the attributes data using another schema give me wrong results
Short answer: use this constructor instead:

  /** Construct given writer's and reader's schema. */
  public GenericDatumReader(Schema writer, Schema reader)

Longer answer:

You have to give the GenericDatumReader the EXACT schema that wrote the bytes that you are trying to parse ("writer's schema").
You can *also* give it another schema you'd like to use ("reader's schema") that can be different.
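
To see why the writer's schema matters, it may help to look at where the values 12 and -25 in your output come from. Avro's binary encoding carries no field names or type tags: ints and longs are zig-zag varints, and a string is its byte length (as a zig-zag varint) followed by the UTF-8 bytes. The sketch below is hand-written for illustration and does not use the Avro library; the class and method names (`ZigZagMisread`, `misread`, and so on) are my own. It encodes only the first field, `dob`, and then reads the bytes back as an int followed by a long, which is what a reader given only `schema1` would do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ZigZagMisread {
    // Zig-zag encode, then write as a base-128 varint (the layout Avro uses for int/long).
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string layout: byte length as a zig-zag varint, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Read one zig-zag varint starting at pos[0] and advance pos[0] past it.
    static long readLong(byte[] buf, int[] pos) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Write "dob" the way schema2 would, then read it the way schema1 would.
    static long[] misread() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "161913600000");     // first field of schema2
        byte[] bytes = out.toByteArray();
        int[] pos = {0};
        long schemaId = readLong(bytes, pos); // really the string length
        long lmd = readLong(bytes, pos);      // really the character '1' (0x31)
        return new long[] {schemaId, lmd};
    }

    public static void main(String[] args) {
        long[] r = misread();
        System.out.println(r[0] + " " + r[1]); // prints "12 -25"
    }
}
```

The "schemaId" of 12 is the length of the `dob` string, and the "lmd" of -25 is the zig-zag decoding of its first character. Nothing in the bytes tells the reader that it is looking at a string, so it needs the writer's schema to interpret them.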
Try changing this line of your code:

GenericDatumReader<GenericRecord> r1 = new GenericDatumReader<GenericRecord>(schema1);

To this:

GenericDatumReader<GenericRecord> r1 = new GenericDatumReader<GenericRecord>(schema2, schema1); // writer's schema is "schema2", reader's schema is "schema1"
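
Conceptually, with both schemas the reader can walk the bytes in the writer's field order, skip what the reader's schema does not ask for, and keep the rest. Here is a minimal sketch of that idea, assuming the Avro binary layout (zig-zag varints, length-prefixed strings). It is hand-rolled, does not call the Avro library, and covers only the skipping part; real schema resolution also handles defaults, aliases, and type promotion. The names `WriterSchemaWalk`, `WRITER_FIELDS`, and `project` are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class WriterSchemaWalk {
    // Field names and types in the order schema2.avsc (the writer's schema) declares them.
    static final String[][] WRITER_FIELDS = {
        {"dob", "string"}, {"gndr", "string"}, {"occupation", "string"},
        {"mrtlStatus", "string"}, {"numChldrn", "int"}, {"estInc", "string"},
        {"schemaId", "int"}, {"lmd", "long"}
    };

    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);            // zig-zag
        while ((z & ~0x7FL) != 0) {               // base-128 varint
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);              // length prefix
        out.write(utf8, 0, utf8.length);
    }

    static long readLong(byte[] buf, int[] pos) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // The record from the original post, encoded in writer field order.
    static byte[] encodeSample() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "161913600000");
        writeString(out, "f");
        writeString(out, "doctor");
        writeString(out, "single");
        writeLong(out, 3);
        writeString(out, "50000");
        writeLong(out, 20001);
        writeLong(out, 1379814280254L);
        return out.toByteArray();
    }

    // Walk the writer's fields in order; keep wanted numeric fields, skip the rest.
    // Simplification: string fields are always skipped, since schema1 asks for none.
    static Map<String, Long> project(byte[] buf, Set<String> wanted) {
        Map<String, Long> result = new LinkedHashMap<>();
        int[] pos = {0};
        for (String[] field : WRITER_FIELDS) {
            if (field[1].equals("string")) {
                int len = (int) readLong(buf, pos);
                pos[0] += len;                    // jump over the UTF-8 bytes
            } else {
                long value = readLong(buf, pos);
                if (wanted.contains(field[0])) {
                    result.put(field[0], value);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> wanted = new HashSet<>(Arrays.asList("schemaId", "lmd"));
        System.out.println(project(encodeSample(), wanted));
        // prints {schemaId=20001, lmd=1379814280254}
    }
}
```

The walk only works because the field order and types of `schema2` are known. That is the information the two-argument constructor gives the real reader.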
________________________________
From: Raihan Jamal <[EMAIL PROTECTED]>
Sent: Wednesday, September 25, 2013 5:10 PM
To: [EMAIL PROTECTED]
Subject: Deserialize the attributes data using another schema give me wrong results

I am trying to serialize one of our attributes' data using an Apache Avro schema. The attribute name is `e7`, and the schema I am using to serialize it is `schema2.avsc`, shown below.

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "dob", "type": "string"},
            {"name": "gndr", "type": "string"},
            {"name": "occupation", "type": "string"},
            {"name": "mrtlStatus", "type": "string"},
            {"name": "numChldrn", "type": "int"},
            {"name": "estInc", "type": "string"},
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

Below is the code that I am using to serialize the attribute `e7` with the above Avro schema, `schema2.avsc`. The serialization works fine:
Schema schema = new Parser().parse((AvroExperiment.class.getResourceAsStream("/schema2.avsc")));
GenericRecord record = new GenericData.Record(schema);
record.put("dob", "161913600000");
record.put("gndr", "f");
record.put("occupation", "doctor");
record.put("mrtlStatus", "single");
record.put("numChldrn", 3);
record.put("estInc", "50000");
record.put("schemaId", 20001);
record.put("lmd", 1379814280254L);

GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
ByteArrayOutputStream os = new ByteArrayOutputStream();

Encoder e = EncoderFactory.get().binaryEncoder(os, null);

writer.write(record, e);
e.flush();
byte[] byteData = os.toByteArray();
os.close();

Next, I tried deserializing the same `e7` attribute data using the same Avro schema, `schema2.avsc`. That also works fine, and the data deserializes properly:
GenericDatumReader<GenericRecord> r = new GenericDatumReader<GenericRecord>(schema);
BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(byteData, null);
GenericRecord result = r.read(null, decoder);

System.out.println(result);
System.out.println(result.get("schemaId"));
System.out.println(result.get("lmd"));
Then I thought, let's deserialize the same attribute data using another Avro schema that I have, `schema1.avsc`, and extract only `schemaId` and `lmd` from it. Below is the schema:

    {
        "namespace": "com.avro.test.AvroExperiment",
        "type": "record",
        "name": "DEMOGRAPHIC",
        "doc": "DEMOGRAPHIC data",
        "fields": [
            {"name": "schemaId", "type": "int"},
            {"name": "lmd", "type": "long"}
        ]
    }

/**
* Deserialize the same byte data using another Avro Schema
*/

Schema schema1 = new Parser().parse((AvroExperiment.class.getResourceAsStream("/schema1.avsc")));

GenericDatumReader<GenericRecord> r1 = new GenericDatumReader<GenericRecord>(schema1);
BinaryDecoder decoder1 = DecoderFactory.get().binaryDecoder(byteData, null);
GenericRecord result1 = r1.read(null, decoder1);

System.out.println(result1);
System.out.println(result1.get("schemaId"));
System.out.println(result1.get("lmd"));
But the above code prints the following, which is wrong. I am not sure what I did wrong:

    {"schemaId": 12, "lmd": -25}
    12
    -25

It should print this instead:

    {"schemaId": 20001, "lmd": 1379814280254}
    20001
    1379814280254

Can anyone help me figure out what I did wrong?