Avro >> mail # user >> Using Arrays in Apache Avro


Re: Using Arrays in Apache Avro

On Sep 24, 2013, at 9:46 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:

> Thanks a lot Mika. Yeah, it works now, but my second question is: does the Avro schema that I have made look good compared to the JSON value that we were using previously?
> I thought we could use an array for that, so I designed it like that using Apache Avro.
>

This is an application design question, and not related to Avro. If you have a list of prices, an array is a good place to store them.

> And also, why does the Avro array use the java.util.List datatype? Just curious about that as well.

Someone who has actually designed Avro can answer this better, but I assume that List was chosen because it is much more convenient to use than Java arrays. You don't need to know the size beforehand, etc.

-Mika
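[Editor's note: the ClassCastException quoted below comes down to this distinction: a primitive double[] is an Object but not a java.util.Collection, while any List is. A minimal, Avro-free sketch; class and variable names are illustrative only:]

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ArrayVsList {
    public static void main(String[] args) {
        // A primitive array is an Object, but not a Collection. This is
        // why Avro's writer rejects it with "[D incompatible with
        // java.util.Collection" -- "[D" is the JVM name for double[].
        Object primitiveArray = new double[] { 9.97, 5.56, 21.48 };
        System.out.println(primitiveArray instanceof Collection); // false

        // A List works, because java.util.List extends java.util.Collection.
        List<Double> nums = new ArrayList<Double>();
        nums.add(9.97);
        nums.add(5.56);
        nums.add(21.48);
        System.out.println(nums instanceof Collection); // true
    }
}
```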

>
> Thanks for the help.
>
> Raihan Jamal
>
>
> On Tue, Sep 24, 2013 at 11:40 AM, Mika Ristimaki <[EMAIL PROTECTED]> wrote:
> Hi,
>
> The Avro array type uses the java.util.List datatype. So you must do something like
>
> List<Double> nums = new ArrayList<Double>();
> nums.add(new Double(9.97));
> .
> .
>
> On Sep 24, 2013, at 9:02 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
>
>> Earlier, I was using JSON in our project, so one of our attributes' data looks like the example below. Below is the attribute `e3` data in JSON format.
>>
>> {"lv":[{"v":{"prc":9.97}},{"v":{"prc":5.56}},{"v":{"prc":21.48}}]}
>>
>> Now, I am planning to use Apache Avro as our data serialization format. So I decided to design the Avro schema for the above attribute data, and I came up with the design below.
>>  
>>   {
>>      "namespace": "com.avro.test.AvroExperiment",
>>      "type": "record",
>>      "name": "AVG_PRICE",
>>      "doc": "AVG_PRICE data",
>>      "fields": [
>>          {"name": "prc", "type": {"type": "array", "items": "double"}}
>>      ]
>>     }
>>
>> Now, I am not sure whether the above schema is correct for the values I have in JSON. Can anyone help me with that? Assuming the above schema is correct, when I try to serialize the data using it, I always get the error below-
>>  
>> double[] nums = new double[] { 9.97, 5.56, 21.48 };
>>
>> Schema schema = new Parser().parse((AvroExperiment.class.getResourceAsStream("/aspmc.avsc")));
>> GenericRecord record = new GenericData.Record(schema);
>> record.put("prc", nums);
>>
>> GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
>> ByteArrayOutputStream os = new ByteArrayOutputStream();
>>
>> Encoder e = EncoderFactory.get().binaryEncoder(os, null);
>>
>> // this line gives me exception..
>> writer.write(record, e);
>>
>> Below is the exception I always get-
>>
>>     Exception in thread "main" java.lang.ClassCastException: [D incompatible with java.util.Collection
>>
>> Any idea what I am doing wrong here?
>
>

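[Editor's note: for the archive, the exception in the thread goes away once the record's array field holds a java.util.List instead of a primitive double[]. Below is a minimal sketch of the corrected serialization, assuming Avro 1.7.x on the classpath; the schema is inlined here instead of being loaded from the original /aspmc.avsc resource, and the class name is illustrative only:]

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

public class AvroArrayFix {
    // Same schema as in the thread, inlined for a self-contained example.
    static final String SCHEMA_JSON =
          "{\"namespace\": \"com.avro.test.AvroExperiment\","
        + " \"type\": \"record\", \"name\": \"AVG_PRICE\","
        + " \"doc\": \"AVG_PRICE data\","
        + " \"fields\": [{\"name\": \"prc\","
        + "   \"type\": {\"type\": \"array\", \"items\": \"double\"}}]}";

    static byte[] serialize() throws Exception {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        // The fix: a java.util.List, not a primitive double[].
        List<Double> nums = Arrays.asList(9.97, 5.56, 21.48);

        GenericRecord record = new GenericData.Record(schema);
        record.put("prc", nums);

        GenericDatumWriter<GenericRecord> writer =
            new GenericDatumWriter<GenericRecord>(schema);
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        Encoder enc = EncoderFactory.get().binaryEncoder(os, null);
        writer.write(record, enc);
        enc.flush(); // the binary encoder buffers; flush before reading os
        return os.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize().length + " bytes serialized");
    }
}
```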