-
Serializing / Deserializing Java Objects
Bradford Stephens, 2010-06-15, 04:20

Greetings,

I've poked around for examples of this, but I can't find any. I imagine it's a fairly common use case.

I'm serializing some simple objects into bytes for placement onto RabbitMQ. My Java class is pretty simple (but it'll grow in complexity in time):

    {
      String[] Columns;
    }

So, I made a .json schema containing this:

    {
      "namespace": "com.dts",
      "name": "QueueItem",
      "type": "record",
      "fields": [
        {"name": "Columns", "type": ["null", {"type": "array", "items": "string"}]}
      ]
    }

And generated a Java class ...

Now, I'm writing a test to serialize and deserialize some items. I can figure out the serialization, but not deserialization back to objects.

    Schema s = Schema.parse(new File("queuetype.json"));

    ByteArrayOutputStream bao = new ByteArrayOutputStream();
    GenericDatumWriter w = new GenericDatumWriter(s);
    Encoder e = new BinaryEncoder(bao);
    e.init(bao);

    QueueItem r = new QueueItem();
    r.put(0, items);
    w.write(r, e);
    e.flush();

    ByteArrayInputStream is = new ByteArrayInputStream(bao.toByteArray());
    DecoderFactory df = new DecoderFactory();
    Decoder d = df.createBinaryDecoder(is, null);

    QueueItem itemout = (QueueItem) .....

And that's what I can't figure out -- how do I use a decoder method to create QueueItems?

Cheers,
B

Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
-
Re: Serializing / Deserializing Java Objects
Philip Zeyliger, 2010-06-15, 15:57

Hi Bradford,

I believe you use a SpecificDatumReader. Something like:

    final static SpecificDatumReader<QueueItem> QUEUE_ITEM_READER =
        new SpecificDatumReader<QueueItem>(QueueItem.class);
    QueueItem q = QUEUE_ITEM_READER.read(null, decoder);

There doesn't seem to be a test that exercises this code path in an explanatory way, but java/src/java/org/apache/avro/ipc/Requestor.java uses something quite similar.

-- Philip
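For readers who want to see what actually travels over the queue, here is a hand-rolled sketch of the bytes for this one-field record, following my reading of the binary encoding in the Avro specification (zigzag variable-length longs, a union branch index, array blocks ended by a zero count, length-prefixed UTF-8 strings). It deliberately uses no Avro classes; the class name AvroLayoutSketch and its methods are made up for illustration, and in real code the datum writer and SpecificDatumReader do this work.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AvroLayoutSketch {

    // Zigzag plus variable-length encoding, as the spec describes for int and long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Encodes a record whose only field is the union ["null", array of string].
    public static byte[] encode(List<String> columns) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (columns == null) {
            writeLong(out, 0);                  // union branch 0: null, no payload
            return out.toByteArray();
        }
        writeLong(out, 1);                      // union branch 1: the array
        if (!columns.isEmpty()) {
            writeLong(out, columns.size());     // a single block holding every item
            for (String s : columns) {
                byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
                writeLong(out, utf8.length);    // string = byte length, then bytes
                out.write(utf8, 0, utf8.length);
            }
        }
        writeLong(out, 0);                      // zero-count block ends the array
        return out.toByteArray();
    }

    // Minimal read position over a byte array.
    static final class Cursor {
        final byte[] buf;
        int pos;
        Cursor(byte[] buf) { this.buf = buf; }
    }

    static long readLong(Cursor c) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = c.buf[c.pos++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);            // undo zigzag
    }

    // Decodes what encode() produced; returns null for the null branch.
    public static List<String> decode(byte[] bytes) {
        Cursor c = new Cursor(bytes);
        if (readLong(c) == 0) {
            return null;
        }
        List<String> columns = new ArrayList<>();
        long count = readLong(c);
        while (count != 0) {
            if (count < 0) {                    // negative count: a block byte size follows
                count = -count;
                readLong(c);
            }
            for (long i = 0; i < count; i++) {
                int len = (int) readLong(c);
                columns.add(new String(c.buf, c.pos, len, StandardCharsets.UTF_8));
                c.pos += len;
            }
            count = readLong(c);
        }
        return columns;
    }

    public static void main(String[] args) {
        byte[] bytes = encode(Arrays.asList("a", "b"));
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // 02 04 02 61 02 62 00
        System.out.println(decode(bytes));          // [a, b]
    }
}
```

Note that the payload carries no field names or type tags, which is why the reader has to be constructed with the schema (or the generated class that embeds it) before it can make sense of the bytes.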
-
Re: Serializing / Deserializing Java Objects
Bradford Stephens, 2010-06-16, 01:25

That makes sense -- I'm getting errors during serialization, though.

I convert my List<String> to Utf8[].

I then do a QueueItem.put() and get:

    Exception in thread "main" java.lang.ClassCastException: [Lorg.apache.avro.util.Utf8; cannot be cast to org.apache.avro.generic.GenericArray

How do I handle this Java->Avro interop? It seems to me that it should be a lot simpler...

If I try to create a GenericArray<Utf8> and add items to that, it complains that my schema doesn't look right...so that doesn't feel like the right path.

My generated class looks like this:

    @SuppressWarnings("all")
    public class QueueItem extends org.apache.avro.specific.SpecificRecordBase
        implements org.apache.avro.specific.SpecificRecord {
      public static final org.apache.avro.Schema SCHEMA$ =
          org.apache.avro.Schema.parse("{\"type\":\"record\",\"name\":\"QueueItem\",\"namespace\":\"com.dts\",\"fields\":[{\"name\":\"Columns\",\"type\":[\"null\",{\"type\":\"array\",\"items\":\"string\"}]}]}");

      public org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8> Columns;

      public org.apache.avro.Schema getSchema() { return SCHEMA$; }

      public java.lang.Object get(int field$) {
        switch (field$) {
        case 0: return Columns;
        default: throw new org.apache.avro.AvroRuntimeException("Bad index");
        }
      }

      @SuppressWarnings(value="unchecked")
      public void put(int field$, java.lang.Object value$) {
        switch (field$) {
        case 0: Columns = (org.apache.avro.generic.GenericArray<org.apache.avro.util.Utf8>)value$; break;
        default: throw new org.apache.avro.AvroRuntimeException("Bad index");
        }
      }
    }
-
Re: Serializing / Deserializing Java Objects
Scott Carey, 2010-06-16, 01:56

Use GenericArray. The schema given to the generic array is not the schema of the member elements, but the actual array schema (yes, it is confusing).

    new GenericData.Array<Utf8>(size, Schema.createArray(Schema.create(Type.STRING)));

It would be useful to be able to simply use Utf8[] or List<Utf8> for the Specific API, but at this time it leverages GenericData.
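The ClassCastException reported earlier in the thread comes from the unchecked cast inside the generated put(): a Java array is never an instance of a collection interface, so the cast fails no matter what the element type is. The sketch below reproduces that with plain JDK types only. FakeQueueItem is a hypothetical stand-in for the generated class, and java.util.List plays the role of GenericArray<Utf8>; none of this is Avro code.

```java
import java.util.ArrayList;
import java.util.List;

public class PutCastSketch {

    // Stand-in for the generated record: same shape as the generated put(),
    // with List<CharSequence> in place of GenericArray<Utf8>.
    static class FakeQueueItem {
        public List<CharSequence> Columns;

        @SuppressWarnings("unchecked")
        public void put(int field, Object value) {
            switch (field) {
                case 0: Columns = (List<CharSequence>) value; break;
                default: throw new IllegalArgumentException("Bad index");
            }
        }
    }

    // Passing an array reproduces the failure: arrays do not implement List.
    public static boolean arrayIsRejected() {
        FakeQueueItem item = new FakeQueueItem();
        try {
            item.put(0, new CharSequence[] {"a", "b"});
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    // Passing an object of the declared container type goes through.
    public static int listIsAccepted() {
        FakeQueueItem item = new FakeQueueItem();
        List<CharSequence> cols = new ArrayList<>();
        cols.add("a");
        cols.add("b");
        item.put(0, cols);
        return item.Columns.size();
    }

    public static void main(String[] args) {
        System.out.println(arrayIsRejected()); // true
        System.out.println(listIsAccepted());  // 2
    }
}
```

With the real generated class, the equivalent of the second case is building a GenericData.Array<Utf8> with the array schema, as suggested above.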
-
Re: Serializing / Deserializing Java Objects
Bradford Stephens, 2010-06-16, 02:26

That's.... erm, kinda bizarre.

But hey, it works! Thanks!
-
Re: Serializing / Deserializing Java Objects
Bradford Stephens, 2010-06-16, 02:32

Another thing to help me understand the Avro philosophy...

When doing

    public void put(int field$, java.lang.Object value$)

why is field an integer?

For instance, I have a String[] Column in my object. In protobuf, it would generate Java methods like .putColumn(String[] item). Is there a reason Avro can't do that? Or did I run the code generator in avro-tools wrong?

If that doesn't work, could we generate an enum of field names to pass in, instead? I don't like having to know "Magic Numbers" :)

Cheers,
B
-
Re: Serializing / Deserializing Java Objects
Scott Carey, 2010-06-16, 02:44

This iteration of the Specific API simply has public fields that are intended to be set directly.

The current best practice is to use wrapper classes or static helpers to interact with the generated objects, so that most of your code is abstracted from the implementation details.

put(field, value) is there for other internal Avro code more so than for users -- specifically, it allows a ResolvingDecoder to automatically figure out where the data goes if the reader's and writer's schemas differ.

Definitely do NOT depend on the 'magic number' in your code. We should document that better. There is some discussion about the future of the Specific API so that it can generate getters/setters and provide user-controlled features -- potentially something like whether to use String[] or List<String> or Utf8[], etc. to represent data in memory. More suggestions on how to improve the API are welcome.

-Scott
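To illustrate why a positional put() is convenient for library internals, here is a simplified stand-in for schema resolution using plain JDK types. It is not how ResolvingDecoder is implemented. The Record class, the resolve method, and the second field "id" are all hypothetical (the record in this thread has only Columns). The idea it shows: values arrive in the writer's field order, and each one is stored at the position the reader's schema gives to the field of the same name.

```java
import java.util.Arrays;
import java.util.List;

public class PositionalPutSketch {

    // Stand-in for a generated record whose fields are addressed by position.
    static class Record {
        Object id;       // position 0 in the reader's schema
        Object columns;  // position 1 in the reader's schema

        void put(int pos, Object value) {
            switch (pos) {
                case 0: id = value; break;
                case 1: columns = value; break;
                default: throw new IllegalArgumentException("Bad index");
            }
        }
    }

    // The reader's field order; a real schema would supply this.
    static final List<String> READER_ORDER = Arrays.asList("id", "columns");

    // Walks the values in the writer's order and routes each to the reader's
    // position for that field name. Fields the reader does not know are skipped.
    public static Object[] resolve(List<String> writerOrder, List<?> values) {
        Record record = new Record();
        for (int i = 0; i < writerOrder.size(); i++) {
            int pos = READER_ORDER.indexOf(writerOrder.get(i));
            if (pos >= 0) {
                record.put(pos, values.get(i));
            }
        }
        return new Object[] {record.id, record.columns};
    }

    public static void main(String[] args) {
        // The writer emitted the fields in the opposite order, plus one extra.
        Object[] out = resolve(
            Arrays.asList("columns", "extra", "id"),
            Arrays.asList("c", "ignored", 7));
        System.out.println(Arrays.toString(out)); // [7, c]
    }
}
```

The positions here are computed from field names at run time, which is the reason application code should not hard-code them.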
-
Re: Serializing / Deserializing Java Objects
Bradford Stephens, 2010-06-16, 02:54

Ah, interesting.

Then, is there a way to avoid manually making the .put(int, object) call that relies on the magic number?

Or rather, what is the best practice for getting my Java object data into a generated Avro class so that it can be written?

-B
-
Re: Serializing / Deserializing Java ObjectsScott Carey 2010-06-16, 03:10
QueueItem myItem = new QueueItem();
GenericArray<Utf8> cols = new GenericData.Array<Utf8>( ... ) ...

Since the Columns field is public, instead of:

myItem.put(index, cols);

do:

myItem.Columns = cols;
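To illustrate why direct field assignment is preferable to put(index, value), here is a minimal sketch. It does not use Avro: Item is a hypothetical hand-written stand-in for a generated record, and its Columns field is a plain List<String> rather than a GenericArray<Utf8>, so the example compiles on its own. The index-based put casts at run time, much like the ClassCastException reported earlier in the thread, while the field assignment is checked by the compiler.

```java
import java.util.Arrays;
import java.util.List;

public class PutVersusField {
    // Hypothetical stand-in for a generated record; NOT Avro's real class.
    public static class Item {
        public List<String> Columns;

        // Index-based setter: the cast is only checked at run time.
        @SuppressWarnings("unchecked")
        public void put(int field, Object value) {
            switch (field) {
                case 0: Columns = (List<String>) value; break;
                default: throw new IndexOutOfBoundsException("Bad index: " + field);
            }
        }
    }

    // Passing an array where a List is expected compiles, but fails when run.
    public static boolean putRejectsArray() {
        Item item = new Item();
        try {
            item.put(0, new String[] {"a", "b"});
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Direct field assignment: no index, and the type is checked at compile time.
    public static Item viaField(List<String> cols) {
        Item item = new Item();
        item.Columns = cols;
        return item;
    }

    public static void main(String[] args) {
        System.out.println(putRejectsArray());
        System.out.println(viaField(Arrays.asList("a", "b")).Columns);
    }
}
```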
-
Re: Serializing / Deserializing Java Objects
Scott Carey 2010-06-16, 03:15
The best practice would be to build a wrapper class that holds QueueItem, or to have a static method like:
static QueueItem createQueueItem(String[] columns) {
  QueueItem item = new QueueItem();
  GenericArray<Utf8> cols = genCols(columns); // helper method to convert to GenericArray
  item.Columns = cols;
  return item;
}
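The wrapper-class alternative could look roughly like the sketch below. It is written without Avro so that it compiles on its own: the nested QueueItem is a hypothetical stand-in for the generated class (the real one extends SpecificRecordBase and its Columns field is a GenericArray<Utf8>), and the names QueueItemWrapper, getColumns, and unwrap are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class QueueItemWrapper {
    // Hypothetical stand-in for the generated class; NOT the real Avro type.
    public static class QueueItem {
        public List<String> Columns;
    }

    private final QueueItem item = new QueueItem();

    public QueueItemWrapper(String[] columns) {
        // The only place that touches the public field directly.
        // The schema allows null, so a null input is passed through.
        item.Columns = (columns == null)
                ? null
                : new ArrayList<>(Arrays.asList(columns));
    }

    // Callers see a read-only view and never depend on the field's type.
    public List<String> getColumns() {
        return item.Columns == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(item.Columns);
    }

    // Hands the underlying record to whatever code writes it out.
    public QueueItem unwrap() {
        return item;
    }

    public static void main(String[] args) {
        QueueItemWrapper w = new QueueItemWrapper(new String[] {"id", "name"});
        System.out.println(w.getColumns());
    }
}
```

With this shape, a later change in how the generated class stores the field only affects the wrapper, not the rest of the code.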
-
Re: Serializing / Deserializing Java Objects
Bradford Stephens 2010-06-16, 03:50
Ah, I didn't notice it was public. Thanks!