-
Confused about default values
Markus Weimer 2010-08-02, 23:28
Hi,
I added the following line to a schema, recreated the static java classes for it and compiled my code:
{"name": "bias", "type":"double", "default":"0.0"}
When I now try to read a file written before the change, I get an error:
Exception in thread "main" java.io.EOFException
    at org.apache.avro.io.BinaryDecoder.readDouble(BinaryDecoder.java:154)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:82)
    at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:273)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:74)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:154)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:72)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:61)
I assumed that it would just return 0.0 for the fields not present in the file. Is this a bug on my end?
Thanks,
Markus
-
Re: Confused about default values
Doug Cutting 2010-08-02, 23:40
That sounds like something that should work. Can you submit a bug report, ideally with a complete test case? Thanks!
Doug
-
Re: Confused about default values
Jeff Hammerbacher 2010-08-02, 23:43
Hey,
I think the issue is that you put "0.0" in quotes. Try just 0.0.
Later, Jeff
-
Re: Confused about default values
Jeff Hammerbacher 2010-08-02, 23:44
Also, if this turns out to be the issue, please file a JIRA to ensure we provide a clear error message to the user when we parse the schema.
-
Re: Confused about default values
Scott Carey 2010-08-03, 01:01
How was this GenericDatumReader constructed? Is it used to read from an Avro file or from something else?
Note that you may have to set the "expected" schema separately from the actual schema. Avro needs to know what the schema was when the data was written; in an Avro data file, that schema is persisted with the data and set automatically when the file is read.
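To illustrate why a reader that only knows the new schema fails with an EOFException on old data, here is a minimal sketch. It does not use Avro at all: it relies on plain java.io streams as a stand-in for a schema-less binary encoding, and the class name MissingFieldDemo and the field name "weight" are hypothetical. The point it shows is that the bytes carry no record of which fields were written, so a reader expecting an extra double simply runs off the end of the stream and never gets a chance to apply a default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

class MissingFieldDemo {
    // "Old" layout: one double (hypothetical field "weight"), no schema stored.
    static byte[] writeOld(double weight) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeDouble(weight);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "New" layout: weight followed by bias. The reader has no way to know
    // that the writer never wrote bias, so it tries to read 8 more bytes.
    static String readNew(byte[] data) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            double weight = in.readDouble();
            double bias = in.readDouble(); // fails on old data
            return "weight=" + weight + " bias=" + bias;
        } catch (EOFException e) {
            return "EOFException";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readNew(writeOld(1.5))); // prints "EOFException"
    }
}
```

A default value can only be filled in when the reader is told the writer's schema and can see that the field is absent from it, which is why the "actual" and "expected" schemas need to be supplied separately.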
-
Re: Confused about default values
Markus Weimer 2010-08-04, 17:31
Hi,
Thanks for the suggestions! I tried changing "0.0" to 0.0, with no success. Please note that I am on Avro 1.2, as there is an incompatibility between Hadoop 0.20 and newer versions of Avro.
It seems that the question of how I (de-)serialized the object could lead to an answer. I read the Avro instance directly from an InputStream. The data in the stream was serialized using the following code:
public static void store(final SpecificRecord m, final OutputStream out) throws IOException {
    final SpecificDatumWriter datumWriter = new SpecificDatumWriter(m.getSchema());
    final BinaryEncoder enc = new BinaryEncoder(out);
    datumWriter.write(m, enc);
    enc.flush();
}
I read from the stream using:
public static SpecificRecord load(final InputStream in) throws IOException {
    final SpecificDatumReader reader = new SpecificDatumReader(THECLASS._SCHEMA);
    final BinaryDecoder decoder = new BinaryDecoder(in);
    return (SpecificRecord) reader.read(null, decoder);
}
Presumably, this does not serialize the schema with the data, correct? That would explain the problem. I know that avro files do serialize the schema at the beginning. Is there a similar tool for writing to streams?
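One way to carry the writer's schema along with data on a raw stream can be sketched as follows. This is an assumption-laden illustration, not Avro's data file format and not an Avro API: the class SchemaPrefixedStream, its Frame holder, and the length-prefixed layout are all hypothetical, and the payload is treated as opaque bytes (for example, the output of the store method above). The idea is only that the schema JSON travels in front of the record, so the reading side can recover it and hand it to the datum reader as the actual schema, alongside the expected schema from the generated class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SchemaPrefixedStream {
    // Holds what the reading side recovers from one frame.
    static final class Frame {
        final String writerSchemaJson;
        final byte[] payload;

        Frame(String writerSchemaJson, byte[] payload) {
            this.writerSchemaJson = writerSchemaJson;
            this.payload = payload;
        }
    }

    // Hypothetical layout:
    // [int schema length][schema JSON, UTF-8][int payload length][payload]
    static byte[] frame(String writerSchemaJson, byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            byte[] schema = writerSchemaJson.getBytes(StandardCharsets.UTF_8);
            out.writeInt(schema.length);
            out.write(schema);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the schema JSON and the encoded record back out of one frame.
    static Frame unframe(byte[] framed) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
            byte[] schema = new byte[in.readInt()];
            in.readFully(schema);
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return new Frame(new String(schema, StandardCharsets.UTF_8), payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String oldSchema = "{\"type\":\"record\",\"name\":\"Model\","
                + "\"fields\":[{\"name\":\"weight\",\"type\":\"double\"}]}";
        Frame f = unframe(frame(oldSchema, new byte[] {1, 2, 3}));
        System.out.println(f.writerSchemaJson.equals(oldSchema));
        System.out.println(f.payload.length);
    }
}
```

Repeating the full schema before every record is wasteful when many records share one schema; writing it once at the start of the stream, as Avro data files do, avoids that overhead.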
Thanks,
Markus