ReflectDatumReader and Writer classes
Peter Cameron 2012-07-05, 16:44
I'm a bit confused about the "status" of the ReflectDatumReader and ReflectDatumWriter classes. We use them to transparently serialise and deserialise JavaBean-like objects. We automatically get an instance of the expected class back and do not need to write any conditional code based on what the schema is. That is perfect in our Java world, and just what reflection is for.
Now, if I look at the package doc for org.apache.avro.reflect, I see this:
"This API is not recommended except as a stepping stone for systems that currently uses Java interfaces to define RPC protocols. For new RPC systems, the specific <http://avro.apache.org/docs/1.7.0/api/java/org/apache/avro/specific/package-summary.html> API is preferred. For systems that process dynamic data, the generic <http://avro.apache.org/docs/1.7.0/api/java/org/apache/avro/generic/package-summary.html> API is probably best."
What confuses me is the assertion that the generic API is "probably best" for processing dynamic data. I can see no automatic, transparent way of using the generic readers and writers to serialise a JavaBean and then deserialise the stream into an instance of that JavaBean. Writing conditional code and playing around with field names is not an answer.
Am I missing something, or should your assertion be weakened?
Peter
Re: ReflectDatumReader and Writer classes
Mark Hayes 2012-07-05, 17:09
On Thu, Jul 5, 2012 at 9:44 AM, Peter Cameron <[EMAIL PROTECTED]> wrote:
> "This API is not recommended except as a stepping stone for systems that
> currently uses Java interfaces to define RPC protocols. For new RPC
> systems, the specific <http://avro.apache.org/docs/1.7.0/api/java/org/apache/avro/specific/package-summary.html>
> API is preferred. For systems that process dynamic data, the
> generic <http://avro.apache.org/docs/1.7.0/api/java/org/apache/avro/generic/package-summary.html>
> API is probably best."
>
> What I'm confused by is the assertion that the generic API is "probably
> best" for processing dynamic data.
I am still fairly new to Avro but I think what the warning in the docs is trying to say is that the Specific API is better for static data, because reflection is slower. If you're representing data using a Java bean, then your data is static (known at build time).
--mark
Re: ReflectDatumReader and Writer classes
Peter Cameron 2012-07-05, 17:18
Let me explain further. Our data is not static. We do not know the type of Java object at runtime, as we only have the schema. We use the Avro reflect package to transparently serialise and deserialise an Object instance given its schema. Ours is a black box that can serialise and deserialise any Object given a schema. We are given the Object to serialise by the caller, which is not under our control -- the only constraint is that both sides have the schema.
The Specific readers/writers need code generation, and the generic readers and writers expect the objects to be "indexed records" and so barf. For any old POJO (with schema), the black box method can only be satisfied by Avro's reflect package, unless I'm mistaken?
Peter
--
Peter Cameron
2iC Limited
T: +44 (0) 208 123 7479
E: [EMAIL PROTECTED]
W: www.2iCworld.com
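To illustrate the distinction Peter draws between a record-style ("indexed record") view of the data and a plain POJO, here is a minimal sketch that uses only java.lang.reflect. It is not Avro's implementation and makes no Avro calls. The class names ReflectMappingSketch and Sensor and the methods toRecord and fromRecord are hypothetical, invented for this example. It only shows why a reflection-driven mapper can round-trip an arbitrary bean without field-by-field code, which is the behaviour the thread attributes to the reflect package.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReflectMappingSketch {

    // Hypothetical JavaBean-like class standing in for a caller-supplied POJO.
    public static class Sensor {
        private String name;
        private int reading;

        public Sensor() {}  // no-arg constructor so the mapper can instantiate it

        public Sensor(String name, int reading) {
            this.name = name;
            this.reading = reading;
        }

        public String getName() { return name; }
        public int getReading() { return reading; }
    }

    // Copy every instance field into a name -> value map.
    // The map is a stand-in for a generic, record-style representation.
    public static Map<String, Object> toRecord(Object bean) throws IllegalAccessException {
        Map<String, Object> record = new LinkedHashMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;  // skip constants and compiler-generated fields
            }
            f.setAccessible(true);
            record.put(f.getName(), f.get(bean));
        }
        return record;
    }

    // Rebuild an instance of the requested class from the record,
    // without any per-class conditional code.
    public static <T> T fromRecord(Map<String, Object> record, Class<T> type)
            throws ReflectiveOperationException {
        Constructor<T> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        T bean = ctor.newInstance();
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            if (!record.containsKey(f.getName())) {
                continue;  // leave fields absent from the record at their defaults
            }
            f.setAccessible(true);
            f.set(bean, record.get(f.getName()));
        }
        return bean;
    }

    public static void main(String[] args) throws Exception {
        Sensor in = new Sensor("probe-1", 42);
        Map<String, Object> record = toRecord(in);
        Sensor out = fromRecord(record, Sensor.class);
        System.out.println(out.getName() + " " + out.getReading());
    }
}
```

With a record-style API alone, the caller would have to write the equivalent of toRecord and fromRecord by hand for each class. A reflective mapper derives that code from the class itself, at the cost of reflection overhead, which is the trade-off Mark mentions.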
Re: ReflectDatumReader and Writer classes
Doug Cutting 2012-07-05, 17:42
Regardless of how we define "dynamic" that statement in the documentation is confusing. Folks do find Avro reflection useful in many cases and we should improve that statement.
Perhaps we should instead just say something like: "Transparently supports simple classes. See below for details and limitations."
Would that be better?
Thanks,
Doug
Re: ReflectDatumReader and Writer classes
Peter Cameron 2012-07-05, 18:01
That sounds fine to me, as long as "simple" is clearly defined by the "details and limitations".
To us, such an object means a JavaBean with properties that are either primitive (where primitive maps to the Avro concept of primitive), or other JavaBeans. Since Avro is used for transmission over the wire, it's rather akin to a DTO. (We actually refer to these objects in code as "complex" because they are values that are not Avro primitives.)
cheers,
Peter
Re: ReflectDatumReader and Writer classes
Mark Hayes 2012-07-05, 18:26
Peter -- yes, you're right. I assumed that because you were using Java beans, you had a fixed set of them at build time. That was an incorrect assumption.
--mark
Re: ReflectDatumReader and Writer classes
Doug Cutting 2012-07-05, 18:18
Sounds like the word "simple" is confusing! So instead how about we just say, "See below for details and limitations"?
Doug