Avro, mail # user - Anonymous record schemas in data files


Eric Sammer 2013-03-04, 07:50
Doug Cutting 2013-03-04, 17:31
Eric Sammer 2013-03-04, 17:57
Re: Anonymous record schemas in data files
Doug Cutting 2013-03-04, 18:16
Schema.createRecord(List<Field>) should not be used except when
creating protocol parameter list schemas.  We should deprecate it in
the next point release and make it package-private in the following
release.  There are a few calls in other packages that use this, but
these could be replaced with calls to a new
Protocol#createMessageParameters method.

Doug
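
Editor's note: Eric's suggestion, quoted below, of builders that ensure semantic validity upon construction could look roughly like the following. This is a minimal sketch under stated assumptions, not Avro's API: the class NamedRecordSketch, its Builder, and the methods name, field, build and toJson are all hypothetical names invented for illustration. It handles only simple (non-namespaced) names and primitive type names, and it applies the name rule from the Avro specification (first character in [A-Za-z_], remaining characters in [A-Za-z0-9_]).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/**
 * Hypothetical illustration only: this is NOT part of Avro's API.
 * It shows a builder that refuses to produce a record schema without a valid name.
 */
final class NamedRecordSketch {
    // Name rule assumed from the Avro spec: [A-Za-z_] followed by [A-Za-z0-9_]*.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private final String name;
    private final List<String[]> fields; // each entry is {fieldName, typeName}

    private NamedRecordSketch(String name, List<String[]> fields) {
        this.name = name;
        this.fields = fields;
    }

    static Builder builder() {
        return new Builder();
    }

    /** Renders the record as schema JSON; this sketch supports primitive type names only. */
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"").append(name).append("\",\"fields\":[");
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"name\":\"").append(fields.get(i)[0])
              .append("\",\"type\":\"").append(fields.get(i)[1]).append("\"}");
        }
        return sb.append("]}").toString();
    }

    static final class Builder {
        private String name;
        private final List<String[]> fields = new ArrayList<>();

        Builder name(String name) {
            this.name = name;
            return this;
        }

        Builder field(String fieldName, String typeName) {
            if (fieldName == null || !NAME.matcher(fieldName).matches()) {
                throw new IllegalArgumentException("Invalid field name: " + fieldName);
            }
            fields.add(new String[] {fieldName, typeName});
            return this;
        }

        /** Fails here, at construction time, rather than later when a file is read back. */
        NamedRecordSketch build() {
            if (name == null || !NAME.matcher(name).matches()) {
                throw new IllegalStateException("Record schema requires a valid name, got: " + name);
            }
            return new NamedRecordSketch(name, new ArrayList<>(fields));
        }
    }

    public static void main(String[] args) {
        // A named record builds and renders normally.
        System.out.println(builder().name("Word").field("word", "string").build().toJson());
        try {
            builder().field("word", "string").build(); // no name set: rejected
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The design point is that the only way to obtain an instance is through build(), so an anonymous record can never reach a writer in the first place.
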
On Mon, Mar 4, 2013 at 9:57 AM, Eric Sammer <[EMAIL PROTECTED]> wrote:
> Freaky. The following works just fine.
>
> scala> val anonSchema = Schema.createRecord(Lists.newArrayList(new
> Field("foo", Schema.create(Type.STRING), null, null)))
> anonSchema: org.apache.avro.Schema =
> {"type":"record","fields":[{"name":"foo","type":"string"}]}
>
> scala> val writer = new DataFileWriter[Record](new
> GenericDatumWriter[Record](anonSchema))
> writer:
> org.apache.avro.file.DataFileWriter[org.apache.avro.generic.GenericData.Record]
> = org.apache.avro.file.DataFileWriter@417f6125
>
> scala> writer.create(anonSchema, new File("test-anon.avro"))
> res0:
> org.apache.avro.file.DataFileWriter[org.apache.avro.generic.GenericData.Record]
> = org.apache.avro.file.DataFileWriter@417f6125
>
> scala> writer.append(new GenericRecordBuilder(anonSchema).set("foo",
> "bar").build())
>
> scala> writer.flush()
>
> scala> writer.close()
>
> Of course, test-anon.avro can't be read back in any meaningful way, which is
> the problem. I'll file a JIRA. The question is, if Schema allows such a
> case, the semantic validation needs to exist in many places. I've been
> whining about the awkwardness of the Schema APIs (to Doug, at the office)
> for some time now. Maybe it's time we provided a set of builders that ensure
> semantic validity upon construction. I wouldn't mind putting in the work.
>
>
>
> On Mon, Mar 4, 2013 at 9:31 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>
>> As Francis noted, anonymous records are not permitted.  That said, the
>> runtime uses anonymous record schemas internally to implement message
>> parameter lists (which are written and read like records, but don't
>> have names).
>>
>> How did you manage to create a file containing an anonymous record?
>> Perhaps the API lets you create anonymous record schemas?  If so, we
>> should probably fix that, so they're only created by the Protocol
>> parser via a package-private API.
>>
>> Doug
>>
>> On Sun, Mar 3, 2013 at 11:50 PM, Eric Sammer <[EMAIL PROTECTED]> wrote:
>> > All:
>> >
>> > I'm looking for some clarity on the use of anonymous records in Avro data
>> > files. Is this considered legal? 1.7.3 allows one to write a data file with
>> > DataFileWriter with an anonymous record schema that can't be read back which
>> > is not the nicest behavior. Here's a contrived example of a data file:
>> >
>> > esammer:~/ esammer$ ~/bin/avro-tool getmeta 1362381940987-1
>> > Exception in thread "main" org.apache.avro.SchemaParseException: No name in schema: {"type":"record","fields":[{"name":"word","type":"string"}]}
>> >         at org.apache.avro.Schema.getRequiredText(Schema.java:1198)
>> >         at org.apache.avro.Schema.parse(Schema.java:1066)
>> >         at org.apache.avro.Schema$Parser.parse(Schema.java:927)
>> >         at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>> >         at org.apache.avro.Schema.parse(Schema.java:974)
>> >         at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:124)
>> >         at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
>> >         at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:89)
>> >         at org.apache.avro.tool.DataFileGetMetaTool.run(DataFileGetMetaTool.java:63)
>> >         at org.apache.avro.tool.Main.run(Main.java:78)
>> >         at org.apache.avro.tool.Main.main(Main.java:67)
>> >
>> > Before I filed the bug I wanted to clarify that anonymous records are
>> > against the spec (or that they aren't, and the bug is the schema parser).
>> >
>> > Thanks.
>> >
Francis Galiegue 2013-03-04, 09:34