HBase user mailing list

Calling sync for every record in sequencefile.writer
Hello,

For a given part file (e.g. part-m-0000), I would like to record the
position of each key written to this file.

To get this position, I wrote something like this:
// out.sync();
currentposition = out.getLength();
record_current_position(key, currentposition);
out.append(key, value);

where out is a SequenceFile.Writer.

Now, if I leave the first line commented out, then for small files getLength()
does not change from key to key.
If I call sync() for every key, it changes to accurately reflect the
position.
Is there some other function I can use to get the current position (like a
file's 'tell' function)?

But wouldn't calling sync() for every record be costly?

How much? (I don't expect an answer to that last question.)
If it makes a difference, I have block compression turned on.
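
To make the symptom concrete, here is a toy model in plain Java. It is not Hadoop code: the class BlockBufferedWriter and its 1024-byte block size are hypothetical, and it only assumes that a block-compressed writer holds records in memory until a block fills or sync() is called, which would explain a getLength() that stays flat between keys.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Toy model (hypothetical, NOT the Hadoop implementation) of a writer that buffers records per block. */
public class BlockBufferedWriter {
    // Stands in for the bytes that have actually reached the part file.
    private final ByteArrayOutputStream file = new ByteArrayOutputStream();
    // Records appended but not yet written out as a block.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final int blockSize;

    public BlockBufferedWriter(int blockSize) {
        this.blockSize = blockSize;
    }

    /** Buffers one record; writes the block only once the buffer reaches blockSize. */
    public void append(String key, String value) {
        byte[] rec = (key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8);
        pending.write(rec, 0, rec.length);
        if (pending.size() >= blockSize) {
            sync();
        }
    }

    /** Forces any buffered records out to the "file". */
    public void sync() {
        if (pending.size() == 0) {
            return;
        }
        byte[] block = pending.toByteArray();
        file.write(block, 0, block.length);
        pending.reset();
    }

    /** Reports only bytes already written; buffered records are invisible here. */
    public long getLength() {
        return file.size();
    }

    public static void main(String[] args) {
        BlockBufferedWriter noSync = new BlockBufferedWriter(1024);
        BlockBufferedWriter withSync = new BlockBufferedWriter(1024);
        for (int i = 0; i < 3; i++) {
            // Position recorded before each append, as in the snippet in this post.
            System.out.println("no sync:   key" + i + " at " + noSync.getLength());
            noSync.append("key" + i, "value" + i);

            withSync.sync();
            System.out.println("with sync: key" + i + " at " + withSync.getLength());
            withSync.append("key" + i, "value" + i);
        }
    }
}
```

In this model the cost of calling sync() per record is that every block holds a single record, so any per-block overhead is paid once per key rather than once per block.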

I noticed that MapFile.Writer does something similar (it calls getLength()) and
would reduce to the above operation, i.e. calling getLength() for every key-value
pair, if I set the index interval to 1. So would this affect MapFile.Writer too?

Cheers
Sapsi