How to handle schema dependencies
Steven Willis 2012-12-11, 19:28

Hi,

My company currently has one big repo that holds all our java code and avro schemas. I'm currently splitting it up into one common repo and separate repos for each product. This has been easy to do for our java code using maven and dependencies, however I can't find a way to do this with our avro schemas.

In the common repo I've got schemas that everything relies on. These common schemas include some very domain specific stuff along with some pretty general use schemas that we've defined like Date:

    {"name": "Date",
     "namespace": "com.compete.avro",
     "type": "record",
     "fields": [
       { "name":"year",  "type":"int" },
       { "name":"month", "type":"int" },
       { "name":"day",   "type":"int" }
     ]
    }

I'd like to be able to use 'Date' (defined in the common repo) in schemas inside the product-xyz repo. But when I try this, I get:

    [ERROR] FATAL ERROR
    [INFO] ------------------------------------------------------------------------
    [INFO] "Date" is not a defined name. The type of the "date" field must be a defined name or a {"type": ...} expression.
    [INFO] ------------------------------------------------------------------------
    [INFO] Trace
    org.apache.avro.SchemaParseException: "Date" is not a defined name. The type of the "date" field must be a defined name or a {"type": ...} expression.
        at org.apache.avro.Schema.parse(Schema.java:1094)
        at org.apache.avro.Schema.parse(Schema.java:1163)
        at org.apache.avro.Schema$Parser.parse(Schema.java:931)
        at org.apache.avro.Schema$Parser.parse(Schema.java:908)
        at org.apache.avro.compiler.specific.SpecificCompiler.compileSchema(SpecificCompiler.java:182)
        at org.apache.avro.compiler.specific.SpecificCompiler.compileSchema(SpecificCompiler.java:174)
        at org.apache.avro.mojo.SchemaMojo.doCompile(SchemaMojo.java:53)
        at org.apache.avro.mojo.AbstractAvroMojo.compileFiles(AbstractAvroMojo.java:129)
        at org.apache.avro.mojo.AbstractAvroMojo.execute(AbstractAvroMojo.java:99)
        at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:535)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
        at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
        at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
        at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
        at org.codehaus.classworlds.Launcher.main(Launcher.java:375)

The common jar which contains the compiled Date class is available in our maven repo... is there some way to use that? I'm currently using the avro-maven-plugin to do the java code generation, is there an option to this plugin to specify schemas or jars to include?

It seems like the only work around is to put all avro schemas that we might use in any product in the common repo, or duplicate all the common schemas inside each product specific repo.

-Steven Willis

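For readers following the error above: per the Avro specification, a type reference with no dot in it is qualified with the namespace of the enclosing definition, and the resulting full name must already be known to the parser. The sketch below is a toy model of that rule in plain Java, not Avro's actual parser. `NameTable`, `define`, `qualify` and `resolve` are hypothetical names, and `com.compete.product` is an assumed namespace for the product schemas.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of Avro-style name resolution; NOT the real Schema.Parser. */
final class NameTable {
    private final Set<String> defined = new HashSet<>();

    /** Registers a fully qualified type name, e.g. "com.compete.avro.Date". */
    void define(String fullName) {
        defined.add(fullName);
    }

    /** Dotted references are already full names; others inherit the enclosing namespace. */
    static String qualify(String reference, String enclosingNamespace) {
        if (reference.contains(".") || enclosingNamespace.isEmpty()) {
            return reference;
        }
        return enclosingNamespace + "." + reference;
    }

    /** Returns the full name if it is defined, otherwise fails much like the parser does. */
    String resolve(String reference, String enclosingNamespace) {
        String fullName = qualify(reference, enclosingNamespace);
        if (!defined.contains(fullName)) {
            throw new IllegalStateException("\"" + reference + "\" is not a defined name.");
        }
        return fullName;
    }

    public static void main(String[] args) {
        NameTable names = new NameTable();
        names.define("com.compete.avro.Date");
        // A fully qualified reference resolves from any namespace once Date is defined.
        System.out.println(names.resolve("com.compete.avro.Date", "com.compete.product"));
    }
}
```

In this model, a bare `Date` inside a `com.compete.product` record fails even after `com.compete.avro.Date` has been defined, so a product schema would need the full name in addition to having the common schema available to the parser.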
RE: How to handle schema dependencies
Dave McAlpin 2012-12-11, 20:31

I had the same problem. My solution was to package external schema files into a source jar and have Maven download and extract those source jars at code generation time. After generation, I delete the external schema along with their generated code and depend on an external jar file at runtime.

I use IDL instead of Avro schema, so this approach might not work for you, but here's what I did.

In the external project (the one I want to import), I changed the pom.xml to package schemas into a source jar. Note that the fragment below assumes that the Avro schema is stored in src/main/schema, but there's nothing special about that location. Also note that I exclude generated Java files from the source jar.

    <build>
      <resources>
        <resource>
          <directory>${project.basedir}/src/main/schema</directory>
        </resource>
      </resources>
      <plugins>
        .
        .
        .
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-source-plugin</artifactId>
          <version>2.2.1</version>
          <executions>
            <execution>
              <id>attach-avdl</id>
              <phase>verify</phase>
              <goals>
                <goal>jar-no-fork</goal>
              </goals>
              <configuration>
                <includePom>true</includePom>
                <excludes>
                  <exclude>**/*.java</exclude>
                </excludes>
                <includes>
                  <include>*.avdl</include>
                </includes>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

In the project that uses the external schemas, I changed the pom.xml to pull in those schemas as external dependencies and delete them after code generation is complete. I also delete the generated java files that result from those external schema because I want to use the generated class files from an external jar rather than the locally generated versions.

***PLEASE NOTE*** that the code below deletes files as part of clean up. To use this, you'll need to update the PATH/TO placeholders. If you try this on a real project, MAKE SURE it's backed up before you start testing this.

    <build>
      <plugins>
        .
        .
        .
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>2.5</version>
          <executions>
            <execution>
              <id>clean-generated-java</id>
              <phase>clean</phase>
              <goals>
                <goal>clean</goal>
              </goals>
              <configuration>
                <filesets>
                  <fileset>
                    <directory>src/main/java/PATH/TO/YOUR/GENERATED/DIRECTORY</directory>
                    <includes>
                      <include>*.java</include>
                    </includes>
                  </fileset>
                </filesets>
              </configuration>
            </execution>
            <execution>
              <id>postgen-clean</id>
              <phase>compile</phase>
              <goals>
                <goal>clean</goal>
              </goals>
              <configuration>
                <excludeDefaultDirectories>true</excludeDefaultDirectories>
                <filesets>
                  <fileset>
                    <directory>src/main/schema</directory>
                    <includes>
                      <include>external/*</include>
                      <include>external</include>
                    </includes>
                  </fileset>
                  <fileset>
                    <directory>src/main/java</directory>
                    <includes>
                      <include>**/*</include>
                    </includes>
                    <excludes>
                      <exclude>PATH/TO/YOUR/GENERATED/SOURCE/DIRECTORY/RELATIVE/TO/SRC/MAIN/JAVA/*</exclude>
                    </excludes>
                  </fileset>
                </filesets>
              </configuration>
            </execution>
          </executions>
        </plugin>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-dependency-plugin</artifactId>
          <version>2.5.1</version>
          <executions>
            <execution>
              <id>import-avdl</id>
              <phase>initialize</phase>
              <goals>
                <goal>unpack-dependencies</goal>
              </goals>
              <configuration>
                <includes>*.avdl</includes>
                <!-- Limit group ids like this to avoid pulling down source for third party projects
                <includeGroupIds>com.example,net.example</includeGroupIds>
                -->
                <classifier>sources</classifier>
                <failOnMissingClassifierArtifact>true</failOnMissingClassifierArtifact>
                <outputDirectory>src/main/schema/external</outputDirectory>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

In the IDL file that uses these external schemas, I make my imports point to an "external" sub-directory, like this:

    import idl "external/Profile.avdl";

Because I'm pulling all dependent schema files into a single common directory named "external", schema file names need to be unique across all projects. That's not a problem for me, but if it is for you, you could come up with a more sophisticated way to unpack and re...

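To make the unpack step concrete outside Maven, here is a minimal plain-Java sketch of what pulling `*.avdl` files out of a sources jar into an "external" directory amounts to. It is only an illustration of the idea, not what the maven-dependency-plugin does internally. `AvdlUnpacker` and `unpack` are hypothetical names, and restricting the match to top-level jar entries is an assumption made to mirror the `*.avdl` include pattern.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: copies top-level *.avdl entries out of a jar. */
final class AvdlUnpacker {

    /** Extracts top-level *.avdl entries from jar into targetDir and returns the files written. */
    static List<Path> unpack(Path jar, Path targetDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(targetDir);
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                // Skip directories, nested entries, and anything that is not IDL.
                if (entry.isDirectory() || name.contains("/") || !name.endsWith(".avdl")) {
                    continue;
                }
                Path out = targetDir.resolve(name);
                try (InputStream in = jarFile.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                written.add(out);
            }
        }
        return written;
    }
}
```

A call such as `AvdlUnpacker.unpack(sourcesJar, Paths.get("src/main/schema/external"))` would leave the IDL files where `import idl "external/..."` expects them. Because everything lands in one flat directory, the file-name uniqueness caveat above applies to this sketch as well.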
RE: How to handle schema dependencies
Steven Willis 2012-12-11, 21:42

Thanks Dave,

I was thinking about doing something like that (adding the schemas as resources in the jars). It just seems like a lot of work for something that could be automatic. It would be nice if during schema parsing we could specify a classpath to be used for dynamic lookup of external schemas. That way Schema.parse could look up schemas that have already been compiled, like this:

    package com.compete.util;

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericContainer;

    public class Schemas {
        public static void main(String[] args) {
            String name = "Date";
            String space = "com.compete.avro";
            try {
                // If we were inside Schema.parse we would probably do:
                // Class.forName(new Name(name, space).toString())
                Class<?> cls = Class.forName(space + "." + name);
                GenericContainer record = (GenericContainer) cls.newInstance();
                Schema schema = record.getSchema();
                // If we were inside Schema.parse(JsonNode schema, Names names) we
                // could now just call: names.put(new Name(name, space), schema);
                System.out.println(schema.toString());
            } catch (ReflectiveOperationException e) {
                // Covers ClassNotFoundException, InstantiationException
                // and IllegalAccessException.
                System.err.println(e);
                System.exit(1);
            }
        }
    }

-Steven Willis

Re: How to handle schema dependencies
Doug Cutting 2012-12-11, 22:54

On Tue, Dec 11, 2012 at 1:42 PM, Steven Willis <[EMAIL PROTECTED]> wrote:
> It would be nice if during schema parsing we could specify a classpath
> to be used for dynamic lookup of external schemas

Does AVRO-1188 (included in Avro 1.7.3) help here?

https://issues.apache.org/jira/browse/AVRO-1188

This permits one to specify directories of schemas to import to Avro Maven executions.

Doug

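The point of an imports directory is ordering: schemas that define shared names have to be parsed before the schemas that reference them. The sketch below illustrates that ordering idea in plain Java. It is an assumption-laden illustration and not the plugin's code: `ImportOrder`, `schemasIn` and `compileOrder` are hypothetical names, and the exact traversal and de-duplication rules of the real plugin may differ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists schema files with imported ones first. */
final class ImportOrder {

    /** Returns all *.avsc files under dir as normalized absolute paths, sorted. */
    static List<Path> schemasIn(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".avsc"))
                    .map(p -> p.toAbsolutePath().normalize())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Import directories come first, then the remaining source schemas, each file once. */
    static List<Path> compileOrder(List<Path> importDirs, Path sourceDir) throws IOException {
        Set<Path> ordered = new LinkedHashSet<>();
        for (Path importDir : importDirs) {
            ordered.addAll(schemasIn(importDir));
        }
        // Files already listed as imports keep their earlier position.
        ordered.addAll(schemasIn(sourceDir));
        return new ArrayList<>(ordered);
    }
}
```

With a layout like `src/main/avro/imports/Date.avsc` next to `src/main/avro/Product.avsc`, this ordering puts `Date.avsc` ahead of `Product.avsc`, so the `Date` name is defined by the time the product schema refers to it.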
RE: How to handle schema dependencies
Steven Willis 2012-12-11, 23:47

Hi Doug,

That certainly helps a lot. Though one would still need extra code to pull out the required schema files from a jar into that imports directory prior to code generation. And one would need to track these avro dependencies manually. It would be great if the avro-maven-plugin could have configuration items specified like "com.example:common" and those artifacts would be searched for compiled avro types using something like the code I posted earlier in addition to the normal parser name resolution.

-Steven Willis

Re: How to handle schema dependencies
Doug Cutting 2012-12-12, 00:25

We might add a protected Schema.Parser#getName() method so that folks could subclass Schema.Parser to look for names in the environment, then use such a subclass in the Maven Mojo implementations. Please file an issue in Jira if this is of interest to you.

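The subclassing idea can be sketched with toy classes. Nothing below is Avro API: the proposed `Schema.Parser#getName()` hook is only a suggestion in this thread, so `ToyParser`, `lookupExternal`, `ClasspathParser` and `FakeDate` are all hypothetical stand-ins. The one detail borrowed from real generated code is that Avro's generated Java classes expose a public static `SCHEMA$` field, which the subclass reads by reflection; here it is a plain `String` so the example needs no Avro dependency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy parser with an overridable lookup hook; stands in for Schema.Parser. */
class ToyParser {
    private final Map<String, String> names = new HashMap<>();

    /** Registers a locally parsed schema under its full name. */
    void define(String fullName, String schemaJson) {
        names.put(fullName, schemaJson);
    }

    /** Hook called when a name is not in the local table; the default finds nothing. */
    protected Optional<String> lookupExternal(String fullName) {
        return Optional.empty();
    }

    /** Resolves locally first, then through the hook, caching what the hook finds. */
    final String resolve(String fullName) {
        String local = names.get(fullName);
        if (local != null) {
            return local;
        }
        String external = lookupExternal(fullName).orElseThrow(
                () -> new IllegalStateException("\"" + fullName + "\" is not a defined name."));
        names.put(fullName, external);
        return external;
    }
}

/** Subclass that searches the classpath for a class with a static SCHEMA$ field. */
class ClasspathParser extends ToyParser {
    @Override
    protected Optional<String> lookupExternal(String fullName) {
        try {
            Class<?> cls = Class.forName(fullName);
            Object schema = cls.getField("SCHEMA$").get(null);
            return Optional.of(String.valueOf(schema));
        } catch (ReflectiveOperationException e) {
            return Optional.empty(); // not on the classpath, or not a generated class
        }
    }
}

/** Stand-in for a generated class; real ones hold an org.apache.avro.Schema here. */
class FakeDate {
    public static final String SCHEMA$ =
            "{\"type\":\"record\",\"name\":\"FakeDate\",\"fields\":[]}";

    public static void main(String[] args) {
        System.out.println(new ClasspathParser().resolve(FakeDate.class.getName()));
    }
}
```

Reading the static field avoids instantiating the record, which is a small departure from the `newInstance()` approach posted earlier in the thread. A Mojo could then pick the subclass when the project's compile classpath should be consulted.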
RE: How to handle schema dependencies
Dave McAlpin 2012-12-11, 23:45

That enhancement allows my approach to be used with avsc files as well as IDL files.

Here's our use case. Project A defines the IDL for a record of type A. Something like this:

    **File a.avdl

    @namespace("com.example.a")
    protocol AService {
      record A {
        union {null, string} aData = null;
      }
    }

Project A provides a jar file with the generated Java classes for A.

Project B creates an IDL that uses type A. Something like this:

    **File b.avdl

    @namespace("com.example.b")
    protocol BService {
      import idl "a.avdl";

      record B {
        union {null, com.example.a.A} a = null;
      }
    }

Ideally, Project B doesn't need access to Project A's source code; it will depend only on the jar file from Project A.

There are two problems. First, the Avro code generator in Project B needs access to a.avdl in order to satisfy the import in b.avdl, but a.avdl is only available in Project A. Second, the Avro code generator will generate Java classes for A in Project B, which duplicate the Java classes already available in the jar file from Project A.

My solution is to have Maven package up a.avdl in Project A's source jar. I then add some Maven-ness to Project B to do the following at code generation time.

1) pull down the sources jar from Project A
2) unpack any avdl files from Project A's sources jar into an "external" directory under Project B's schema directory
3) run the Avro generator on Project B's schema
4) delete the "external" directory and its contents
5) delete any Java files generated as a result of the avdl files in Project B's "external" directory

This allows Project B to do something like 'import idl "external/a.avdl"' and have that external dependency satisfied by Maven at code generation time without having explicit access to Project A's source.

Dave

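Step 4 above, removing the "external" directory after generation, can be sketched in plain Java as follows. This is only an illustration of the cleanup idea, not what the maven-clean-plugin does internally, and `ExternalCleanup` and `deleteTree` are hypothetical names. As with the pom fragment earlier in the thread, it deletes files, so try it on a scratch copy first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: removes an unpacked "external" schema directory. */
final class ExternalCleanup {

    /** Deletes dir and everything below it; returns the number of paths removed. */
    static int deleteTree(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return 0;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
        return paths.size();
    }
}
```

Calling `ExternalCleanup.deleteTree(Paths.get("src/main/schema/external"))` after code generation leaves Project B's own schema files untouched, since only the given directory and its contents are removed.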