Avro >> mail # user >> avro BinaryDecoder bug ?


Re: avro BinaryDecoder bug ?
Looks like a bug to me.

Can you file a JIRA ticket?

Thanks!

On 8/29/11 1:24 PM, "Yang" <[EMAIL PROTECTED]> wrote:

>If I read an empty file with BinaryDecoder, I get EOF, which is good.
>
>But with the current code, if I read again with the same decoder, I
>get an IndexOutOfBoundsException, not EOF.
>
>It seems that always giving EOF would be the more desirable behavior.
>
>You can see this from the following test code:
>
>import static org.junit.Assert.assertEquals;
>
>import java.io.IOException;
>
>import org.apache.avro.specific.SpecificRecord;
>import org.junit.Test;
>
>import myavro.Apple;
>
>import java.io.File;
>import java.io.FileInputStream;
>import java.io.FileNotFoundException;
>import java.io.FileOutputStream;
>import java.io.InputStream;
>import java.io.OutputStream;
>
>import org.apache.avro.io.Decoder;
>import org.apache.avro.io.DecoderFactory;
>import org.apache.avro.io.Encoder;
>import org.apache.avro.io.EncoderFactory;
>import org.apache.avro.specific.SpecificDatumReader;
>import org.apache.avro.specific.SpecificDatumWriter;
>
>class MyWriter {
>
>    SpecificDatumWriter<SpecificRecord> wr;
>    Encoder enc;
>    OutputStream ostream;
>
>    public MyWriter() throws FileNotFoundException {
>        wr = new SpecificDatumWriter<SpecificRecord>(new Apple().getSchema());
>        ostream = new FileOutputStream(new File("/tmp/testavro"));
>        enc = EncoderFactory.get().binaryEncoder(ostream, null);
>    }
>
>    public synchronized void dump(SpecificRecord event) throws IOException {
>        wr.write(event, enc);
>        enc.flush();
>    }
>
>}
>
>class MyReader {
>
>    SpecificDatumReader<SpecificRecord> rd;
>    Decoder dec;
>    InputStream istream;
>
>    public MyReader() throws FileNotFoundException {
>        rd = new SpecificDatumReader<SpecificRecord>(new Apple().getSchema());
>        istream = new FileInputStream(new File("/tmp/testavro"));
>        dec = DecoderFactory.get().binaryDecoder(istream, null);
>    }
>
>    public synchronized SpecificRecord read() throws IOException {
>        Object r = rd.read(null, dec);
>        return (SpecificRecord) r;
>    }
>
>}
>
>public class AvroWriteAndReadSameTime {
>    @Test
>    public void testWritingAndReadingAtSameTime() throws Exception {
>
>        MyWriter dumper = new MyWriter();
>        final Apple apple = new Apple();
>        apple.taste = "sweet";
>        dumper.dump(apple);
>
>        final MyReader rd = new MyReader();
>        rd.read();
>
>
>        try {
>            rd.read();
>        } catch (Exception e) {
>            e.printStackTrace();
>        }
>
>        // the second one somehow generates a NPE, we hope to get EOF...
>        try {
>            rd.read();
>        } catch (Exception e) {
>            e.printStackTrace();
>        }
>
>    }
>}
>
>
>
>
>
>The issue is in BinaryDecoder.readInt(): right now, even when it hits
>EOF, it still advances the pos pointer.
>None of the other APIs (readLong, readFloat, ...) do this. Changing
>readInt() to the following makes it work:
>
>
>  @Override
>  public int readInt() throws IOException {
>    ensureBounds(5); // won't throw index out of bounds
>    int len = 1;
>    int b = buf[pos] & 0xff;
>    int n = b & 0x7f;
>    if (b > 0x7f) {
>      b = buf[pos + len++] & 0xff;
>      n ^= (b & 0x7f) << 7;
>      if (b > 0x7f) {
>        b = buf[pos + len++] & 0xff;
>        n ^= (b & 0x7f) << 14;
>        if (b > 0x7f) {
>          b = buf[pos + len++] & 0xff;
>          n ^= (b & 0x7f) << 21;
>          if (b > 0x7f) {
>            b = buf[pos + len++] & 0xff;
>            n ^= (b & 0x7f) << 28;
>            if (b > 0x7f) {
>              throw new IOException("Invalid int encoding");
>            }
>          }
>        }
>      }
>    }
>    if (pos+len > limit) {
>      throw new EOFException();
>    }
>    pos += len; // <== CHANGE: used to be above the EOF throw
>
>    return (n >>> 1) ^ -(n & 1); // back to two's-complement
>  }
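
For anyone who wants to try the idea without an Avro checkout, here is a
minimal standalone sketch of the same "check the limit before advancing pos"
ordering. SketchDecoder and its fields are hypothetical names, not Avro's
BinaryDecoder; the five bytes of padding only stand in for what
ensureBounds(5) does in the quoted code, and the nested ifs are folded into a
loop for brevity.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical stand-in for a buffered decoder; NOT Avro's BinaryDecoder.
class SketchDecoder {
    private final byte[] buf;
    private final int limit; // number of valid bytes in buf
    private int pos = 0;

    SketchDecoder(byte[] data) {
        // Assumption: 5 zero bytes of padding keep speculative reads past
        // 'limit' inside the array, similar in spirit to ensureBounds(5).
        buf = Arrays.copyOf(data, data.length + 5);
        limit = data.length;
    }

    // Decodes one zig-zag varint int. pos only moves after the EOF check.
    int readInt() throws IOException {
        int len = 1;
        int b = buf[pos] & 0xff;
        int n = b & 0x7f;
        int shift = 7;
        while (b > 0x7f) {
            if (len == 5) {
                throw new IOException("Invalid int encoding");
            }
            b = buf[pos + len++] & 0xff;
            n ^= (b & 0x7f) << shift;
            shift += 7;
        }
        if (pos + len > limit) {
            throw new EOFException(); // pos is left untouched here
        }
        pos += len; // advance only once the read is known to be valid
        return (n >>> 1) ^ -(n & 1); // back to two's-complement
    }

    public static void main(String[] args) throws IOException {
        // 0x02 is zig-zag for 1; 0x80 0x01 is zig-zag for 64.
        SketchDecoder d = new SketchDecoder(new byte[] {0x02, (byte) 0x80, 0x01});
        System.out.println(d.readInt());
        System.out.println(d.readInt());
        for (int i = 0; i < 2; i++) {
            try {
                d.readInt();
            } catch (EOFException e) {
                System.out.println("EOF on attempt " + (i + 1));
            }
        }
    }
}
```

Because pos never moves on a failed read, every call after the data runs out
takes the same path and throws EOFException again, which is the behavior
asked for above.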