Avro >> mail # user >> Deserialize the attributes data using another schema give me wrong results


Re: Deserialize the attributes data using another schema give me wrong results
@Erin/Doug/Mika... Any thoughts on my previous question?
Thanks for the help....
*Raihan Jamal*
On Wed, Sep 25, 2013 at 5:42 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:

> Thanks, Eric. Now I have a couple of questions on this:
>
> 1) Does that mean we cannot deserialize an attribute's data using any other
> schema alone? Do we always need to pass the schema that was used for writing
> along with whatever other schema we want to use for reading? Is that
> right?
> 2) Is there any way I can deserialize an attribute's data using another
> schema without passing the actual schema that was used to serialize it?
>
> In my example, if you see, I am already storing a schemaId in the Avro schema
> that maps to some actual schema name. So while serializing an
> attribute's data, we also store the schemaId within that Avro binary-encoded
> value, and that schemaId indicates which schema we
> used to serialize it. Now while deserializing that attribute, we first
> grab the schemaId (by deserializing it with another schema), see
> which schema was actually used to serialize that attribute, and then
> deserialize the attribute again using that actual schema...
>
>
>
>
>
>
> *Raihan Jamal*
>
>
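One common way to implement the schemaId idea described above is to keep the id *outside* the Avro payload entirely: write a fixed-size id prefix, then the Avro-encoded bytes, and on the read side use the prefix to look up the exact writer's schema before decoding. The sketch below is not from this thread; the 4-byte big-endian framing and the in-memory `registry` map are assumptions standing in for whatever schema store you actually use.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.nio.ByteBuffer;
import java.util.Map;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaIdFraming {
    // Hypothetical registry mapping schemaId -> the schema used to write.
    private final Map<Integer, Schema> registry;

    public SchemaIdFraming(Map<Integer, Schema> registry) {
        this.registry = registry;
    }

    /** Frame: 4-byte big-endian schemaId, then the Avro binary payload. */
    public byte[] serialize(int schemaId, GenericRecord record) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new DataOutputStream(out).writeInt(schemaId);          // prefix lives outside Avro
        BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(record.getSchema()).write(record, enc);
        enc.flush();                                           // required before using the bytes
        return out.toByteArray();
    }

    /** Read the prefix, look up the exact writer's schema, then decode the payload. */
    public GenericRecord deserialize(byte[] bytes) throws Exception {
        int schemaId = ByteBuffer.wrap(bytes).getInt();        // first 4 bytes = schemaId
        Schema writerSchema = registry.get(schemaId);
        BinaryDecoder dec = DecoderFactory.get().binaryDecoder(
                bytes, 4, bytes.length - 4, null);
        return new GenericDatumReader<GenericRecord>(writerSchema).read(null, dec);
    }
}
```

The point of the prefix is that the reader never has to guess: the writer's schema is always recoverable before Avro sees a single payload byte, which sidesteps the problem Eric describes below.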
> On Wed, Sep 25, 2013 at 5:30 PM, Eric Wasserman <[EMAIL PROTECTED]> wrote:
>
>>  Short answer. Use this constructor instead:
>>
>>  /** Construct given writer's and reader's schema. */
>>
>>   public GenericDatumReader(Schema writer, Schema reader) {
>>
>>  Longer answer:
>>
>>  You have to give the GenericDatumReader the EXACT schema that wrote the
>> bytes that you are trying to parse ("writer's schema").
>> You can *also* give it another schema you'd like to use ("reader's
>> schema") that can be different.
>>
>>
>>  Try changing this line of your code:
>>
>>  GenericDatumReader<GenericRecord> r1 = new
>> GenericDatumReader<GenericRecord>(schema1);
>>
>>  To this:
>>
>>  GenericDatumReader<GenericRecord> r1 = new
>> GenericDatumReader<GenericRecord>(schema2, schema1); // writer's schema is
>> "schema2", reader's schema is "schema1"
>>
>>
>>  ------------------------------
>> *From:* Raihan Jamal <[EMAIL PROTECTED]>
>> *Sent:* Wednesday, September 25, 2013 5:10 PM
>> *To:* [EMAIL PROTECTED]
>> *Subject:* Deserialize the attributes data using another schema give me
>> wrong results
>>
>>   I am trying to serialize one of our attributes' data using an Apache Avro
>> schema. Here the attribute name is `e7` and the schema that I am using to
>> serialize it is `schema2.avsc`, which is below.
>>
>>     {
>>       "namespace": "com.avro.test.AvroExperiment",
>>       "type": "record",
>>       "name": "DEMOGRAPHIC",
>>       "doc": "DEMOGRAPHIC data",
>>       "fields": [
>>           {"name": "dob", "type": "string"},
>>           {"name": "gndr", "type": "string"},
>>           {"name": "occupation", "type": "string"},
>>           {"name": "mrtlStatus", "type": "string"},
>>           {"name": "numChldrn", "type": "int"},
>>           {"name": "estInc", "type": "string"},
>>           {"name": "schemaId", "type": "int"},
>>           {"name": "lmd", "type": "long"}
>>       ]
>>     }
>>
>>  Below is the code that I am using to serialize the attribute `e7` using
>> the above Avro schema `schema2.avsc`. I am able to serialize it properly
>> and it works fine...
>>  Schema schema = new
>> Schema.Parser().parse(AvroExperiment.class.getResourceAsStream("/schema2.avsc"));
>> GenericRecord record = new GenericData.Record(schema);
>> record.put("dob", "161913600000");
>> record.put("gndr", "f");
>> record.put("occupation", "doctor");
>> record.put("mrtlStatus", "single");
>> record.put("numChldrn", 3);
>> record.put("estInc", "50000");
>> record.put("schemaId", 20001);
>> record.put("lmd", 1379814280254L);
>>
>>  GenericDatumWriter<GenericRecord> writer = new
>> GenericDatumWriter<GenericRecord>(schema);
>> ByteArrayOutputStream os = new ByteArrayOutputStream();
>>
>>  Encoder e = EncoderFactory.get().binaryEncoder(os, null);
>>
>>  writer.write(record, e);
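Putting Eric's correction together with the code above: the snippet never flushes the encoder, and the read side must pass the writer's schema (`schema2` here) as the *first* constructor argument. Below is a minimal end-to-end sketch. The schemas are inlined and trimmed to two fields for brevity, and the one-field `schema1` (reading only `schemaId`) is a hypothetical reader's schema, not one from the thread.

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class ResolveDemo {
    /** Write with schema2, read with (schema2, schema1); returns the projected schemaId. */
    public static int roundTrip() throws Exception {
        // Trimmed-down writer's schema (two fields stand in for the full schema2.avsc).
        Schema schema2 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"DEMOGRAPHIC\",\"fields\":["
          + "{\"name\":\"gndr\",\"type\":\"string\"},"
          + "{\"name\":\"schemaId\",\"type\":\"int\"}]}");
        // Hypothetical reader's schema: just the schemaId field.
        Schema schema1 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"DEMOGRAPHIC\",\"fields\":["
          + "{\"name\":\"schemaId\",\"type\":\"int\"}]}");

        GenericRecord record = new GenericData.Record(schema2);
        record.put("gndr", "f");
        record.put("schemaId", 20001);

        ByteArrayOutputStream os = new ByteArrayOutputStream();
        BinaryEncoder e = EncoderFactory.get().binaryEncoder(os, null);
        new GenericDatumWriter<GenericRecord>(schema2).write(record, e);
        e.flush();  // without this, os may be empty or truncated

        // Writer's schema first, reader's schema second: Avro skips "gndr" for us.
        GenericDatumReader<GenericRecord> r1 =
            new GenericDatumReader<GenericRecord>(schema2, schema1);
        GenericRecord result = r1.read(null,
            DecoderFactory.get().binaryDecoder(os.toByteArray(), null));
        return (Integer) result.get("schemaId");
    }
}
```

This is exactly the `GenericDatumReader(Schema writer, Schema reader)` constructor Eric points to: Avro resolves the two schemas and silently skips writer fields the reader does not declare.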