Re: avro BinaryDecoder bug ?
Looks like a bug to me.

Can you file a JIRA ticket?

Thanks!

On 8/29/11 1:24 PM, "Yang" <[EMAIL PROTECTED]> wrote:

>if I read on an empty file with BinaryDecoder, I get EOF, good,
>
>but with the current code, if I read it again with the same decoder, I
>get an IndexOutOfBoundsException, not EOF.
>
>it seems that always giving EOF would be the more desirable behavior.
>
>you can see from this test code:
>
>import static org.junit.Assert.assertEquals;
>
>import java.io.IOException;
>
>import org.apache.avro.specific.SpecificRecord;
>import org.junit.Test;
>
>import myavro.Apple;
>
>import java.io.File;
>import java.io.FileInputStream;
>import java.io.FileNotFoundException;
>import java.io.FileOutputStream;
>import java.io.InputStream;
>import java.io.OutputStream;
>
>import org.apache.avro.io.Decoder;
>import org.apache.avro.io.DecoderFactory;
>import org.apache.avro.io.Encoder;
>import org.apache.avro.io.EncoderFactory;
>import org.apache.avro.specific.SpecificDatumReader;
>import org.apache.avro.specific.SpecificDatumWriter;
>
>class MyWriter {
>
>    SpecificDatumWriter<SpecificRecord> wr;
>    Encoder enc;
>    OutputStream ostream;
>
>    public MyWriter() throws FileNotFoundException {
>        wr = new SpecificDatumWriter<SpecificRecord>(new Apple().getSchema());
>        ostream = new FileOutputStream(new File("/tmp/testavro"));
>        enc = EncoderFactory.get().binaryEncoder(ostream, null);
>    }
>
>    public synchronized void dump(SpecificRecord event) throws IOException {
>        wr.write(event, enc);
>        enc.flush();
>    }
>
>}
>
>class MyReader {
>
>    SpecificDatumReader<SpecificRecord> rd;
>    Decoder dec;
>    InputStream istream;
>
>    public MyReader() throws FileNotFoundException {
>        rd = new SpecificDatumReader<SpecificRecord>(new Apple().getSchema());
>        istream = new FileInputStream(new File("/tmp/testavro"));
>        dec = DecoderFactory.get().binaryDecoder(istream, null);
>    }
>
>    public synchronized SpecificRecord read() throws IOException {
>        Object r = rd.read(null, dec);
>        return (SpecificRecord) r;
>    }
>
>}
>
>public class AvroWriteAndReadSameTime {
>    @Test
>    public void testWritingAndReadingAtSameTime() throws Exception {
>
>        MyWriter dumper = new MyWriter();
>        final Apple apple = new Apple();
>        apple.taste = "sweet";
>        dumper.dump(apple);
>
>        final MyReader rd = new MyReader();
>        rd.read();
>
>
>        try {
>            rd.read();
>        } catch (Exception e) {
>            e.printStackTrace();
>        }
>
>        // the second read at EOF somehow generates an NPE, we hope to get EOF...
>        try {
>            rd.read();
>        } catch (Exception e) {
>            e.printStackTrace();
>        }
>
>    }
>}
>
>the issue is in BinaryDecoder.readInt(): right now, even when it hits
>EOF, it still advances the pos pointer, so the next read starts past
>the end of the valid data.
>all the other APIs (readLong, readFloat, ...) do not do this. changing
>readInt() to the following makes it work:
>
>
>  @Override
>  public int readInt() throws IOException {
>    ensureBounds(5); // won't throw index out of bounds
>    int len = 1;
>    int b = buf[pos] & 0xff;
>    int n = b & 0x7f;
>    if (b > 0x7f) {
>      b = buf[pos + len++] & 0xff;
>      n ^= (b & 0x7f) << 7;
>      if (b > 0x7f) {
>        b = buf[pos + len++] & 0xff;
>        n ^= (b & 0x7f) << 14;
>        if (b > 0x7f) {
>          b = buf[pos + len++] & 0xff;
>          n ^= (b & 0x7f) << 21;
>          if (b > 0x7f) {
>            b = buf[pos + len++] & 0xff;
>            n ^= (b & 0x7f) << 28;
>            if (b > 0x7f) {
>              throw new IOException("Invalid int encoding");
>            }
>          }
>        }
>      }
>    }
>    if (pos+len > limit) {
>      throw new EOFException();
>    }
>    pos += len; // <== CHANGE: used to be above the EOF throw
>
>    return (n >>> 1) ^ -(n & 1); // back to two's-complement
>  }
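
For anyone who wants to see the ordering issue in isolation, here is a minimal sketch that runs without Avro. TinyVarintDecoder and its position() method are hypothetical names made up for this illustration; this is not Avro's BinaryDecoder, only the same idea in a loop form: check the limit first, and move pos only after the whole value has been decoded, so a failed read leaves the decoder where it was.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical stand-in for a buffered decoder; NOT Avro's BinaryDecoder. */
class TinyVarintDecoder {
    private final byte[] buf;
    private final int limit; // number of valid bytes in buf
    private int pos = 0;     // index of the next unread byte

    TinyVarintDecoder(byte[] data) {
        this.buf = data;
        this.limit = data.length;
    }

    /** Exposed only so the effect of a failed read can be observed. */
    int position() {
        return pos;
    }

    /** Reads one zigzag-encoded varint (at most 5 bytes). */
    int readInt() throws IOException {
        int n = 0;
        int len = 0;
        int b;
        do {
            if (len == 5) {
                throw new IOException("Invalid int encoding");
            }
            if (pos + len >= limit) {
                // pos has not been touched yet, so the next call
                // fails the same way instead of reading past limit
                throw new EOFException();
            }
            b = buf[pos + len] & 0xff;
            n |= (b & 0x7f) << (7 * len);
            len++;
        } while (b > 0x7f);
        pos += len; // advance only after a complete, in-bounds value
        return (n >>> 1) ^ -(n & 1); // back to two's-complement
    }
}
```

With this ordering, calling readInt() repeatedly on an empty or exhausted buffer should throw EOFException every time, and position() stays unchanged across the failed calls.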