Avro >> mail # user >> Has anyone developed a utility to tell what is missing from a record?

Re: Has anyone developed a utility to tell what is missing from a record?
OK, cool. I've been using the Python implementation pretty heavily and
didn't realize that it was less mature. Will definitely work on maturing it
where possible :)
2013/4/4 Philip Zeyliger <[EMAIL PROTECTED]>

> Hi Jonathan,
> The Python implementation is definitely less mature than the Java one. As
> you run into things, please do file bugs (and, better yet, fixes!).
> At one point someone on this list was working on an alternative Python
> implementation that generated Python objects to represent the Avro records.
> I think that's a wise idea (and is what Thrift does). Not sure where
> that's gone.
> -- Philip
> On Thu, Apr 4, 2013 at 9:02 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>> I'm also running into issues where the Python and Java implementations
>> are different (it seems like Java is less permissive than Python). Are
>> these cases bugs? It can be frustrating for something to work in one but
>> not the other.
>> Having the info from the parallel recursion would allow us to have much
>> better error messages. That would be great...
>> 2013/4/4 Jeremy Kahn <[EMAIL PROTECTED]>
>>> I think this would be tremendously useful.
>>> I am working - in my copious spare time - on improving schema validation
>>> in the Python library, and I think I can see how to improve things there by
>>> extending the data/schema parallel recursion to keep track of position in
>>> each.
>>> Jeremy
>>> On Apr 4, 2013 6:58 AM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
>>>> I'm working on migrating an internally developed serialization format
>>>> to Avro. In the process, there have been many cases where I made a mistake
>>>> migrating the schema (I've automated it), and then Avro cries that a record
>>>> I'm trying to serialize doesn't match the schema. Generally, the error it
>>>> gives doesn't help find the actual issue, and for a big enough record
>>>> finding the issue can be tedious.
>>>> I've thought about making a tool which, given the schema and the record,
>>>> would tell you what the issue is, but I'm wondering if this already exists?
>>>> I suppose the error message could also include this information...
>>>> Thanks
>>>> Jon
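
For reference, the tool Jon asks about and the data/schema parallel recursion Jeremy describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the avro package: `find_mismatches` is a hypothetical name, the schema is taken as already-parsed JSON (plain dicts, lists, and strings), named-type references and logical types are not resolved, and a missing record field is treated like None.

```python
# Hypothetical sketch, not part of the avro package: walk a parsed-JSON schema
# and a datum in parallel, tracking the path so each error says where it is.

def _is_int(d):
    return isinstance(d, int) and not isinstance(d, bool)

def _is_number(d):
    return isinstance(d, (int, float)) and not isinstance(d, bool)

PRIMITIVE_CHECKS = {
    "null": lambda d: d is None,
    "boolean": lambda d: isinstance(d, bool),
    "int": lambda d: _is_int(d) and -2**31 <= d < 2**31,
    "long": lambda d: _is_int(d) and -2**63 <= d < 2**63,
    "float": _is_number,
    "double": _is_number,
    "string": lambda d: isinstance(d, str),
    "bytes": lambda d: isinstance(d, bytes),
}

def find_mismatches(schema, datum, path="$"):
    """Return a list of 'path: problem' strings (empty if nothing is wrong)."""
    if isinstance(schema, list):  # union: fine if any branch accepts the datum
        for branch in schema:
            if not find_mismatches(branch, datum, path):
                return []
        return ["%s: %r matches no branch of the union" % (path, datum)]
    kind = schema["type"] if isinstance(schema, dict) else schema
    if not isinstance(kind, str):  # e.g. {"type": {"type": "array", ...}}
        return find_mismatches(kind, datum, path)
    if kind in PRIMITIVE_CHECKS:
        if PRIMITIVE_CHECKS[kind](datum):
            return []
        return ["%s: expected %s, got %r" % (path, kind, datum)]
    if not isinstance(schema, dict):
        # A bare name such as "Address" refers to a type defined elsewhere.
        return ["%s: named type %r is not resolved by this sketch" % (path, kind)]
    if kind == "record":
        if not isinstance(datum, dict):
            return ["%s: expected record %s, got %r" % (path, schema.get("name"), datum)]
        errors = []
        for field in schema["fields"]:
            child = "%s.%s" % (path, field["name"])
            if field["name"] in datum:
                errors += find_mismatches(field["type"], datum[field["name"]], child)
            elif find_mismatches(field["type"], None, child):
                # Assumption: a missing field is acceptable only if null is allowed.
                errors.append("%s: required field is missing" % child)
        return errors
    if kind == "array":
        if not isinstance(datum, list):
            return ["%s: expected array, got %r" % (path, datum)]
        errors = []
        for i, item in enumerate(datum):
            errors += find_mismatches(schema["items"], item, "%s[%d]" % (path, i))
        return errors
    if kind == "map":
        if not isinstance(datum, dict):
            return ["%s: expected map, got %r" % (path, datum)]
        errors = []
        for key, value in datum.items():
            if not isinstance(key, str):
                errors.append("%s: map key %r is not a string" % (path, key))
            errors += find_mismatches(schema["values"], value, "%s[%r]" % (path, key))
        return errors
    if kind == "enum":
        if datum in schema["symbols"]:
            return []
        return ["%s: %r is not one of %s" % (path, datum, schema["symbols"])]
    if kind == "fixed":
        if isinstance(datum, bytes) and len(datum) == schema["size"]:
            return []
        return ["%s: expected %d bytes, got %r" % (path, schema["size"], datum)]
    return ["%s: unsupported schema type %r" % (path, kind)]

# Example schema and record, both made up for illustration.
schema = {"type": "record", "name": "User", "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "email", "type": ["null", "string"]},
    {"name": "address", "type": {"type": "record", "name": "Address",
                                 "fields": [{"name": "zip", "type": "string"}]}},
    {"name": "tags", "type": {"type": "array", "items": "string"}},
]}
record = {"name": "Jon", "age": "42", "address": {"zip": 98101}, "tags": ["a", 3]}

errors = find_mismatches(schema, record)
for line in errors:
    print(line)
# $.age: expected int, got '42'
# $.address.zip: expected string, got 98101
# $.tags[1]: expected string, got 3
```

Returning every mismatch with its path, rather than a single true/false answer, is what makes the problem easy to find in a large record; the same position tracking could also feed a better error message in the library itself.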