Avro >> mail # user >> Anonymous record schemas in data files


Eric Sammer 2013-03-04, 07:50
Doug Cutting 2013-03-04, 17:31
Eric Sammer 2013-03-04, 17:57
Re: Anonymous record schemas in data files
Schema.createRecord(List<Field>) should not be used except when
creating protocol parameter list schemas.  We should deprecate it in
the next point release and make it package-private in the following
release.  There are a few calls in other packages that use this, but
these could be replaced with calls to a new
Protocol#createMessageParameters method.

Doug
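
To make the builder idea from the quoted message below concrete, here is a minimal sketch of "validate on construction" in plain Java. It is not Avro's API: the class NamedRecordSketch and its methods record, field and toJson are hypothetical names invented for illustration, and field types are assumed to be plain primitive type names such as "string". The only point it demonstrates is that a record schema cannot be obtained without a valid name, so an anonymous record can never reach a file writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch, NOT part of Avro: a tiny builder that refuses to
 * produce a record schema unless it has a valid name.
 */
public final class NamedRecordSketch {
    // Avro names start with [A-Za-z_] and continue with [A-Za-z0-9_].
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private final String name;
    private final List<String[]> fields = new ArrayList<>();

    private NamedRecordSketch(String name) {
        this.name = name;
    }

    /** Entry point: the record name is checked before any field can be added. */
    public static NamedRecordSketch record(String name) {
        requireValidName(name, "record");
        return new NamedRecordSketch(name);
    }

    /** Adds a field; the type is assumed to be a primitive type name such as "string". */
    public NamedRecordSketch field(String fieldName, String primitiveType) {
        requireValidName(fieldName, "field");
        requireValidName(primitiveType, "type");
        fields.add(new String[] {fieldName, primitiveType});
        return this;
    }

    /** Renders the schema as JSON; names were validated, so no escaping is needed. */
    public String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"").append(name).append("\",\"fields\":[");
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            String[] f = fields.get(i);
            sb.append("{\"name\":\"").append(f[0])
              .append("\",\"type\":\"").append(f[1]).append("\"}");
        }
        return sb.append("]}").toString();
    }

    private static void requireValidName(String value, String what) {
        if (value == null || !NAME.matcher(value).matches()) {
            throw new IllegalArgumentException("invalid or missing " + what + " name: " + value);
        }
    }

    public static void main(String[] args) {
        // A named record builds normally.
        System.out.println(record("Foo").field("foo", "string").toJson());
        // An anonymous record is rejected at construction time, before any I/O.
        try {
            record(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a design like this the check lives in one place, rather than being repeated in the writer, the parser and every other consumer of a schema.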
On Mon, Mar 4, 2013 at 9:57 AM, Eric Sammer <[EMAIL PROTECTED]> wrote:
> Freaky. The following works just fine.
>
> scala> val anonSchema = Schema.createRecord(Lists.newArrayList(new
> Field("foo", Schema.create(Type.STRING), null, null)))
> anonSchema: org.apache.avro.Schema = {"type":"record","fields":[{"name":"foo","type":"string"}]}
>
> scala> val writer = new DataFileWriter[Record](new
> GenericDatumWriter[Record](anonSchema))
> writer:
> org.apache.avro.file.DataFileWriter[org.apache.avro.generic.GenericData.Record]
> = org.apache.avro.file.DataFileWriter@417f6125
>
> scala> writer.create(anonSchema, new File("test-anon.avro"))
> res0:
> org.apache.avro.file.DataFileWriter[org.apache.avro.generic.GenericData.Record]
> = org.apache.avro.file.DataFileWriter@417f6125
> scala> writer.append(new GenericRecordBuilder(anonSchema).set("foo",
> "bar").build())
>
> scala> writer.flush()
>
> scala> writer.close()
>
> Of course, test-anon.avro can't be read back in any meaningful way, which is
> the problem. I'll file a JIRA. The question is, if Schema allows such a
> case, the semantic validation needs to exist in many places. I've been
> whining about the awkwardness of the Schema APIs (to Doug, at the office)
> for some time now. Maybe it's time we provided a set of builders that ensure
> semantic validity upon construction. I wouldn't mind putting in the work.
>
>
>
> On Mon, Mar 4, 2013 at 9:31 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>
>> As Francis noted, anonymous records are not permitted.  That said, the
>> runtime uses anonymous record schemas internally to implement message
>> parameter lists (which are written and read like records, but don't
>> have names).
>>
>> How did you manage to create a file containing an anonymous record?
>> Perhaps the API lets you create anonymous record schemas?  If so, we
>> should probably fix that, so they're only created by the Protocol
>> parser via a package-private API.
>>
>> Doug
>>
>> On Sun, Mar 3, 2013 at 11:50 PM, Eric Sammer <[EMAIL PROTECTED]> wrote:
>> > All:
>> >
>> > I'm looking for some clarity on the use of anonymous records in Avro data
>> > files. Is this considered legal? 1.7.3 allows one to write a data file with
>> > DataFileWriter with an anonymous record schema that can't be read back which
>> > is not the nicest behavior. Here's a contrived example of a data file:
>> >
>> > esammer:~/ esammer$ ~/bin/avro-tool getmeta 1362381940987-1
>> > Exception in thread "main" org.apache.avro.SchemaParseException: No name in schema: {"type":"record","fields":[{"name":"word","type":"string"}]}
>> >         at org.apache.avro.Schema.getRequiredText(Schema.java:1198)
>> >         at org.apache.avro.Schema.parse(Schema.java:1066)
>> >         at org.apache.avro.Schema$Parser.parse(Schema.java:927)
>> >         at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>> >         at org.apache.avro.Schema.parse(Schema.java:974)
>> >         at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:124)
>> >         at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
>> >         at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:89)
>> >         at org.apache.avro.tool.DataFileGetMetaTool.run(DataFileGetMetaTool.java:63)
>> >         at org.apache.avro.tool.Main.run(Main.java:78)
>> >         at org.apache.avro.tool.Main.main(Main.java:67)
>> >
>> > Before I filed the bug I wanted to clarify that anonymous records are
>> > against the spec (or that they aren't, and the bug is the schema parser).
>> >
>> > Thanks.
>> >
Francis Galiegue 2013-03-04, 09:34