SequenceFiles use a 16-byte MD5 digest as the sync marker (computed from a UID and the writer's approximate init time, so it is effectively random). That is not a strict guarantee; a collision with the data is just extremely unlikely.

For the rest of my answer, I'd rather not repeat what Martin has already explained very well on the Avro lists, for the Avro DataFile format, which uses a similar technique: http://search-hadoop.com/m/VYVra2krg5t1 (point #2).

On Thu, May 23, 2013 at 11:34 PM, John Lilley <[EMAIL PROTECTED]> wrote:
> How does SequenceFile guarantee that the sync marker does not appear in
> the data?
>
> John
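
To make the idea concrete, here is a rough Python sketch of the technique. It is not the actual Hadoop code. The names `new_sync_marker` and `next_sync` are made up for illustration, `uuid.uuid4()` is only a stand-in for the writer's UID, and the record layout is a simplification I am assuming for the example.

```python
import hashlib
import time
import uuid


def new_sync_marker() -> bytes:
    """Build a 16-byte marker by hashing a unique ID plus the current time."""
    # Assumption: uuid4 stands in for the UID the real writer would use.
    seed = f"{uuid.uuid4()}@{int(time.time() * 1000)}"
    return hashlib.md5(seed.encode("utf-8")).digest()  # MD5 digest = 16 bytes


def next_sync(stream: bytes, start: int, marker: bytes) -> int:
    """Return the offset of the first marker at or after start, or -1."""
    return stream.find(marker, start)


marker = new_sync_marker()
records = [b"record-one", b"record-two", b"record-three"]

# Hypothetical layout: the marker is written between consecutive records.
stream = marker.join(records)

# A reader dropped at an arbitrary offset scans forward to the next marker
# and resumes reading at the record boundary that follows it.
pos = next_sync(stream, 3, marker)
print(len(marker), pos)
```

Nothing here stops the same 16 bytes from turning up inside a record. The scheme relies on that being improbable: for data that is independent of the marker, any given 16-byte window matches with probability 2^-128.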
-- Harsh J