Re: references to other schemas
Backticks are allowed inside strings, though, so whatever
preprocessor is used would need some understanding of JSON.
That narrows the choice of preprocessors.
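
To illustrate the point about strings, here is a minimal Python sketch
(not anything that exists in Avro) of a scanner that reports backquoted
directives only when they sit outside JSON string literals. The name
find_directives and the sample input are made up for this example.

```python
def find_directives(text):
    """Return the contents of backquoted spans that lie outside JSON strings."""
    directives = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == '"':
            # Skip a JSON string literal, honoring backslash escapes.
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == '\\' else 1
            i += 1  # step past the closing quote
        elif ch == '`':
            end = text.find('`', i + 1)
            if end == -1:
                raise ValueError("unterminated backquote at offset %d" % i)
            directives.append(text[i + 1:end])
            i = end + 1
        else:
            i += 1
    return directives


# The backquotes inside the "doc" string must not be treated as a directive.
sample = '{"doc": "use `include x` in schemas", "types": [`include org.foo.Bar`]}'
print(find_directives(sample))
```

A plain text-substitution tool that knows nothing about JSON quoting
would also pick up the span inside the "doc" string, which is the
problem described above.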

I'm fairly neutral on the idea of composite schemas, overall. The
biggest problem I have is that JSON has no standard way of referring
to URLs (in the HTML5 sense), and URLs seem to be the best way to do
this.

On schema read, the references could be resolved once and the expanded
schema kept in memory, so that a complete schema is available for RPC
and datafile writes. Basically, we would say references are honored on
read, but never emitted on write.
--
Jeff

On May 3, 2010 10:03 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

Scott Carey wrote:
>
> There has been talk that AvroGen would handle features like this (as well as ...

Note that JSON schemas and protocols need to be standalone, containing
the full lexical closure of schemas referenced, when they are included
in data files and exchanged in RPC handshakes without reference to
external data.  Thus I am reluctant to add a JSON syntax for file
inclusion.  Rather, I think a pre-processor is appropriate.  The
pre-processor would not be run on schemas included in files or
exchanged in RPC handshakes, but would be run for schemas read from
files.

I have experimented with using the m4 pre-processor for this purpose,
and found it a bit awkward.  Perhaps someone can develop macros for m4
that make it palatable, or perhaps we can develop a custom
pre-processor for JSON.

We might exploit otherwise-illegal JSON syntax, like backquotes, for
pre-processor directives.  An include might look something like:

{"protocol": "org.foo.BarProtocol",
 "types": [
  `include org.foo.Bar`,
   ...
 ]
}
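
One way such a pre-processor could work is sketched below in Python.
This is only an illustration under stated assumptions: the directive
syntax follows the example above, the sources dictionary stands in for
files on disk, and the names TOKEN and expand are hypothetical. The
regular expression matches whole JSON string literals first, so
backquotes inside strings are left alone.

```python
import json
import re

# Matches either a complete JSON string literal or a backquoted include.
# Group 1 is set only for the include form.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|`include\s+([^`\s]+)\s*`')


def expand(text, sources, _seen=()):
    """Replace each `include NAME` with the expanded text of sources[NAME]."""
    def replace(match):
        name = match.group(1)
        if name is None:
            return match.group(0)  # a string literal: keep it unchanged
        if name in _seen:
            raise ValueError("circular include: " + name)
        return expand(sources[name], sources, _seen + (name,))
    return TOKEN.sub(replace, text)


# Hypothetical in-memory "files"; a real tool would read them from disk.
sources = {
    "org.foo.Bar": '{"type": "record", "name": "Bar", "namespace": "org.foo",'
                   ' "fields": [{"name": "id", "type": "long"}]}',
}

protocol_src = '{"protocol": "org.foo.BarProtocol", "types": [`include org.foo.Bar`]}'

standalone = expand(protocol_src, sources)
protocol = json.loads(standalone)  # the expanded text is plain JSON again
print(protocol["types"][0]["name"])
```

The expanded text contains no directives, so it is what would be
written to data files or sent in an RPC handshake.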

Also note that a protocol file (.avpr) need not actually define any
messages but can be used to define a set of types that reference one
another.  This is a stopgap, but a useful one.
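
As a rough illustration of that stopgap, the Python snippet below
builds a protocol that declares two types and no messages. The names
BarTypes, Bar and Baz are invented for the example; the printed JSON is
what would go into the .avpr file.

```python
import json

# A protocol used only as a container for related types.
# It has no "messages" entry at all.
types_only = {
    "protocol": "BarTypes",
    "namespace": "org.foo",
    "types": [
        {"type": "record", "name": "Bar",
         "fields": [{"name": "id", "type": "long"}]},
        # Baz refers to Bar by name; Bar is defined earlier in the list.
        {"type": "record", "name": "Baz",
         "fields": [{"name": "bar", "type": "Bar"}]},
    ],
}

print(json.dumps(types_only, indent=2))
```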

Doug