Avro, mail # user - Has anyone developed a utility to tell what is missing from a record?


Jonathan Coveney 2013-04-04, 13:58
Jeremy Kahn 2013-04-04, 14:07
Jonathan Coveney 2013-04-04, 16:02
Re: Has anyone developed a utility to tell what is missing from a record?
Philip Zeyliger 2013-04-04, 16:13
Hi Jonathan,

The Python implementation is definitely less mature than the Java one. As
you run into things, please do file bugs (and, better yet, fixes!).

At one point someone on this list was working on an alternative Python
implementation that generated Python objects to represent the Avro records.
I think that's a wise idea (and is what Thrift does). I'm not sure where
that's gone.

-- Philip
On Thu, Apr 4, 2013 at 9:02 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> I'm also running into issues where the Python and Java implementations
> behave differently (it seems like Java is less permissive than Python). Are
> these cases bugs? It can be frustrating for something to work in one but not
> the other.
>
> Having the info from the parallel recursion would allow us to have much
> better error messages. That would be great...
>
>
> 2013/4/4 Jeremy Kahn <[EMAIL PROTECTED]>
>
>> I think this would be tremendously useful.
>>
>> I am working - in my copious spare time - on improving schema validation
>> in the Python library, and I think I can see how to improve things there by
>> extending the data/schema parallel recursion to keep track of position in
>> each.
>>
>> Jeremy
>> On Apr 4, 2013 6:58 AM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
>>
>>> I'm working on migrating an internally developed serialization format to
>>> Avro. In the process, there have been many cases where I made a mistake
>>> migrating the schema (I've automated it), and then Avro complains that a
>>> record I'm trying to serialize doesn't match the schema. Generally, the
>>> error it gives doesn't help find the actual issue, and for a big enough
>>> record, finding the issue can be tedious.
>>>
>>> I've thought about making a tool which, given the schema and the record,
>>> would tell you what the issue is, but I'm wondering if this already exists?
>>> I suppose the error message could also include this information...
>>>
>>> Thanks
>>> Jon
>>>
>>
>
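
For readers following the thread, here is a minimal sketch of the data/schema parallel recursion Jeremy describes, written as a standalone tool of the kind Jonathan asks about. It is not the avro package's code: find_mismatches and its helpers are hypothetical names, the schema is assumed to be already parsed from JSON into dicts and lists, a field absent from the record is always reported (defaults are ignored), and references to named types are not resolved.

```python
# Hypothetical helper (not part of the avro package): walk a schema, given as
# parsed JSON, and a datum in parallel, recording where they disagree.

def _is_int(d, bits):
    # bool is a subclass of int in Python, so exclude it explicitly.
    limit = 2 ** (bits - 1)
    return isinstance(d, int) and not isinstance(d, bool) and -limit <= d < limit

def _is_number(d):
    return isinstance(d, (int, float)) and not isinstance(d, bool)

PRIMITIVES = {
    "null": lambda d: d is None,
    "boolean": lambda d: isinstance(d, bool),
    "int": lambda d: _is_int(d, 32),
    "long": lambda d: _is_int(d, 64),
    "float": _is_number,
    "double": _is_number,
    "string": lambda d: isinstance(d, str),
    "bytes": lambda d: isinstance(d, bytes),
}

def _got(datum):
    return type(datum).__name__

def find_mismatches(schema, datum, path="$"):
    """Return (path, message) pairs; an empty list means the datum fits."""
    if isinstance(schema, list):  # union: the datum must fit some branch
        if any(not find_mismatches(branch, datum, path) for branch in schema):
            return []
        return [(path, "matches no branch of the union")]
    if isinstance(schema, str):  # shorthand such as "string"
        schema = {"type": schema}
    kind = schema["type"]
    if isinstance(kind, (dict, list)):  # nested schema under "type"
        return find_mismatches(kind, datum, path)
    if kind in PRIMITIVES:
        if PRIMITIVES[kind](datum):
            return []
        return [(path, "expected %s, got %s" % (kind, _got(datum)))]
    if kind == "record":
        if not isinstance(datum, dict):
            return [(path, "expected record, got %s" % _got(datum))]
        problems = []
        for field in schema["fields"]:
            child = "%s.%s" % (path, field["name"])
            if field["name"] not in datum:
                problems.append((child, "missing field"))
            else:
                problems += find_mismatches(field["type"], datum[field["name"]], child)
        return problems
    if kind == "array":
        if not isinstance(datum, list):
            return [(path, "expected array, got %s" % _got(datum))]
        problems = []
        for i, item in enumerate(datum):
            problems += find_mismatches(schema["items"], item, "%s[%d]" % (path, i))
        return problems
    if kind == "map":
        if not isinstance(datum, dict):
            return [(path, "expected map, got %s" % _got(datum))]
        problems = []
        for key, value in datum.items():
            problems += find_mismatches(schema["values"], value, "%s[%r]" % (path, key))
        return problems
    if kind == "enum":
        if datum in schema["symbols"]:
            return []
        return [(path, "%r is not one of %s" % (datum, schema["symbols"]))]
    # Named-type references, "fixed", and anything else are out of scope here.
    return [(path, "unsupported schema type %r" % (kind,))]

schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "emails", "type": {"type": "array", "items": "string"}},
        {"name": "nick", "type": ["null", "string"]},
    ],
}
record = {"name": "Jon", "emails": ["jon@example.com", 7], "nick": None}

for where, what in find_mismatches(schema, record):
    print(where, what)
# $.age missing field
# $.emails[1] expected string, got int
```

Because the path is threaded through every recursive call, each problem is reported at the position where it occurs instead of as a single "datum does not match schema" failure for the whole record.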
Jonathan Coveney 2013-04-04, 16:16
Jeremy Kahn 2013-04-04, 16:27
Scott Carey 2013-04-06, 20:42