Avro, mail # user - references to other schemas


references to other schemas
Jay Kreps 2010-05-02, 19:18
I want to have a shared type schema which would be used by 50 or so
messages (say a type Header defined in a single place that all
messages would use), and I can't seem to find a way to do this (though
I may just have missed it).

This could be done with an "import" statement in the .avsc file, as
protocol buffers does, but I do not think that really makes sense in a
world of non-statically compiled schemas. Probably a better way is
just to make a type name such as "Xyz" resolve to the schema of that
type. It would then be enough to open up the methods shown further
down, and to make the SpecificCompiler take many files, resolve all
the inter-references, and generate a bunch of classes instead of a
single file. The resulting schema would have no reference to Xyz, but
rather would directly include the schema for Xyz in its place.
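
To make the inlining idea concrete, here is a minimal sketch in plain
Python operating on parsed .avsc JSON. It is not Avro's API: the
helper names collect_named_types and inline_references are
hypothetical, and the sketch assumes simple names without namespaces.
The first use of a shared type is replaced by its full definition, and
later uses stay as name references, since a named type is defined only
once within a schema.

```python
import copy
import json

NAMED = ("record", "enum", "fixed")
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def collect_named_types(schema, names):
    """Record every named type found in a parsed schema (hypothetical helper)."""
    if isinstance(schema, dict):
        if schema.get("type") in NAMED and "name" in schema:
            names[schema["name"]] = schema
        for field in schema.get("fields", []):
            collect_named_types(field["type"], names)
        for key in ("items", "values"):
            if key in schema:
                collect_named_types(schema[key], names)
    elif isinstance(schema, list):  # a union is a JSON array of branches
        for branch in schema:
            collect_named_types(branch, names)


def inline_references(schema, names, defined):
    """Return a copy of schema with external name references inlined."""
    if isinstance(schema, str):
        if schema in PRIMITIVES or schema in defined:
            return schema  # primitive, or already defined earlier in the output
        if schema not in names:
            raise KeyError("unknown type: " + schema)
        return inline_references(copy.deepcopy(names[schema]), names, defined)
    if isinstance(schema, list):
        return [inline_references(branch, names, defined) for branch in schema]
    result = dict(schema)
    if result.get("type") in NAMED and "name" in result:
        defined.add(result["name"])  # mark before fields, so self-references stay names
    if "fields" in result:
        result["fields"] = [
            dict(f, type=inline_references(f["type"], names, defined))
            for f in result["fields"]
        ]
    for key in ("items", "values"):
        if key in result:
            result[key] = inline_references(result[key], names, defined)
    return result


# Header is defined once; Event refers to it by name only.
header = json.loads(
    '{"type": "record", "name": "Header",'
    ' "fields": [{"name": "id", "type": "long"}]}'
)
event = json.loads(
    '{"type": "record", "name": "Event", "fields": ['
    '{"name": "header", "type": "Header"},'
    '{"name": "previous", "type": ["null", "Header"]}]}'
)

names = {}
collect_named_types(header, names)
resolved = inline_references(event, names, set())
print(json.dumps(resolved, indent=2))
```

The input schemas are left unmodified, so the same Header definition
can be inlined into each of the messages that use it.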

This looks like it can *almost* be done using some internal private methods:

/* This package-protected method parses with respect to the given
names. Header could be supplied here, if I understand correctly. */
Schema.parse(JsonNode schema, Names names)

/* Compile multiple schemas into multiple files. */
s = SpecificCompiler()
s.enqueue(header)
s.enqueue(schemaUsingHeader)
outputFiles = s.compile()

Is this kind of thing handled in some other way that I have just
missed? If not, is there any objection to a patch that opens up these
methods and adds options to SpecificCompiler to jointly compile a
bunch of files all at once? Perhaps this is already in flight?

-Jay