
Re: SequenceFile sync marker uniqueness
SequenceFiles use a 16-byte MD5 digest as the sync marker (computed from a
UID and the writer's approximate init time, so it is pretty random). For the
rest of my answer, I'd rather not repeat what Martin has already explained
very well on the Avro lists for the Avro DataFile format, which uses a
similar technique: http://search-hadoop.com/m/VYVra2krg5t1 (point #2).
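
To make the "MD5 of a UID plus init time" idea concrete, here is a minimal sketch of how such a marker could be derived. The class and method names (SyncMarkerSketch, newSyncMarker) are made up for illustration, and System.currentTimeMillis() stands in for whatever clock the writer really uses, so treat this as an approximation rather than the actual SequenceFile.Writer code.

```java
import java.nio.charset.StandardCharsets;
import java.rmi.server.UID;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; not the Hadoop source.
final class SyncMarkerSketch {
    // An MD5 digest is always 16 bytes long.
    static final int SYNC_HASH_SIZE = 16;

    // Derives a marker from a JVM-unique UID and the current time.
    static byte[] newSyncMarker() {
        try {
            MessageDigest digester = MessageDigest.getInstance("MD5");
            long time = System.currentTimeMillis(); // assumed clock source
            String seed = new UID() + "@" + time;
            digester.update(seed.getBytes(StandardCharsets.UTF_8));
            return digester.digest();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] first = newSyncMarker();
        byte[] second = newSyncMarker();
        System.out.println("marker length: " + first.length);
        System.out.println("markers equal: " + Arrays.equals(first, second));
    }
}
```

Each call hashes a fresh UID, so two writers should end up with different markers. Nothing here checks the marker against the record data; the scheme relies on a 128-bit random-looking value being very unlikely to show up by accident.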
On Thu, May 23, 2013 at 11:34 PM, John Lilley <[EMAIL PROTECTED]> wrote:

> How does SequenceFile guarantee that the sync marker does not appear in
> the data?
> John

Harsh J