Hadoop >> mail # dev >> bug in SequenceFile.sync()?


Christopher Ng 2013-06-24, 09:20
Colin McCabe 2013-06-24, 16:39
Christopher Ng 2013-06-24, 17:20
Re: bug in SequenceFile.sync()?
Hi Christopher,

Indeed, I think that noBufferedKeys and valuesDecompressed should be
reset.
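
To make the failure mode concrete, here is a minimal toy sketch. It is not the real SequenceFile.Reader: the class SyncResetSketch, its methods, and the sample data are all hypothetical, "blocks" are plain lists of keys, and positions are block indexes rather than byte offsets. Only the noBufferedKeys counter is modeled (valuesDecompressed is left out), on the assumption that this.seek() differs from in.seek() by also clearing the buffered-key state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT org.apache.hadoop.io.SequenceFile.Reader.
// A "file" is a list of blocks, and each block is a list of keys.
class SyncResetSketch {
    private final List<List<String>> blocks;          // the "file"
    private int position = 0;                         // index of the next block to read
    private List<String> buffer = new ArrayList<>();  // keys of the current block
    private int noBufferedKeys = 0;                   // keys still unread in buffer

    SyncResetSketch(List<List<String>> blocks) {
        this.blocks = blocks;
    }

    // Stands in for in.seek(headerEnd): moves the stream, leaves the key buffer alone.
    void rawSeekToStart() {
        position = 0;
    }

    // Stands in for this.seek(headerEnd): moves the stream and forces a block read.
    void seekToStart() {
        position = 0;
        noBufferedKeys = 0;
    }

    // Returns the next key, or null at end of file.
    String next() {
        if (noBufferedKeys == 0) {
            if (position >= blocks.size()) {
                return null;
            }
            buffer = blocks.get(position++);
            noBufferedKeys = buffer.size();
        }
        return buffer.get(buffer.size() - noBufferedKeys--);
    }

    // Hypothetical sample data: two blocks of two keys each.
    static List<List<String>> sampleFile() {
        return Arrays.asList(Arrays.asList("a1", "a2"), Arrays.asList("b1", "b2"));
    }

    // Reads three keys (stopping part-way through the second block),
    // rewinds to the start, and returns the next key read.
    static String firstKeyAfterRewind(boolean resetBuffer) {
        SyncResetSketch reader = new SyncResetSketch(sampleFile());
        reader.next();  // a1
        reader.next();  // a2
        reader.next();  // b1, leaving b2 buffered
        if (resetBuffer) {
            reader.seekToStart();
        } else {
            reader.rawSeekToStart();
        }
        return reader.next();
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + firstKeyAfterRewind(false)); // b2 (stale key)
        System.out.println("with reset:    " + firstKeyAfterRewind(true));  // a1
    }
}
```

In this toy, rewinding without clearing the counter hands back the leftover key b2 from the old block, while clearing it forces a fresh block read and returns a1.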

Regards
JB

On 06/24/2013 11:20 AM, Christopher Ng wrote:
> cross-posting this from cdh-users group where it received little interest:
>
> is there a bug in SequenceFile.sync()?  This is from cdh4.3.0:
>
>      /** Seek to the next sync mark past a given position.*/
>      public synchronized void sync(long position) throws IOException {
>        if (position+SYNC_SIZE >= end) {
>          seek(end);
>          return;
>        }
>
>        if (position < headerEnd) {
>          // seek directly to first record
>          in.seek(headerEnd);   // <=== should this not call seek() (i.e. this.seek()) instead?
>          // note the sync marker "seen" in the header
>          syncSeen = true;
>          return;
>        }
>
> the problem is that when you sync to the start of a compressed file,
> noBufferedKeys and valuesDecompressed aren't reset, so a block read isn't
> triggered.  When you subsequently call next(), you're potentially getting
> keys from the buffer, which still contains keys from the previous position
> in the file.
>

--
Jean-Baptiste Onofré
[EMAIL PROTECTED]
http://blog.nanthrax.net
Talend - http://www.talend.com