Hadoop, mail # user - Binary Files With No Record Begin and End


Binary Files With No Record Begin and End
MJ Sam 2012-07-05, 18:54
Hi,

The input to my MapReduce job is a binary file with no record begin or
end markers. The only structure is that each record has a fixed size of
180 bytes. How do I make Hadoop find record boundaries correctly when a
record overlaps two splits? I was thinking of making the split size a
multiple of 180 bytes, but was wondering if there is anything else that
I can do. Please note that my files are not SequenceFiles, just a
custom binary format.
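
To make the question concrete, here is a minimal sketch in plain Java
(no Hadoop classes) of the offset arithmetic involved. It illustrates
two options: rounding the split size down to a multiple of 180, or
leaving split sizes alone and having each reader skip forward to the
first record that starts inside its split and then read past the split
end to finish the last record. The class and method names are
hypothetical, and the rule "a record belongs to the split in which it
starts" is an assumption for illustration, not something confirmed for
this job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: pure offset arithmetic for
// fixed-length records.
class FixedRecordSplits {
    // Record size taken from the post: every record is exactly 180 bytes.
    static final int RECORD_SIZE = 180;

    // Offset of the first record that starts at or after splitStart.
    static long firstRecordStart(long splitStart) {
        long remainder = splitStart % RECORD_SIZE;
        return remainder == 0 ? splitStart : splitStart + (RECORD_SIZE - remainder);
    }

    // Start offsets of the records owned by the split
    // [splitStart, splitStart + splitLength). Assumption: a record is owned
    // by the split in which it starts, so the last record may extend past
    // the split end and the reader has to keep reading to finish it.
    static List<Long> recordsForSplit(long splitStart, long splitLength, long fileLength) {
        List<Long> starts = new ArrayList<>();
        long splitEnd = Math.min(splitStart + splitLength, fileLength);
        for (long pos = firstRecordStart(splitStart);
             pos < splitEnd && pos + RECORD_SIZE <= fileLength;
             pos += RECORD_SIZE) {
            starts.add(pos);
        }
        return starts;
    }

    // Largest multiple of RECORD_SIZE that does not exceed desiredSize
    // (never less than one record), for the "aligned split size" option.
    static long alignedSplitSize(long desiredSize) {
        return Math.max(RECORD_SIZE, (desiredSize / RECORD_SIZE) * RECORD_SIZE);
    }

    public static void main(String[] args) {
        long fileLength = 1800;  // ten records
        System.out.println(recordsForSplit(0, 1000, fileLength));
        System.out.println(recordsForSplit(1000, 800, fileLength));
        System.out.println(alignedSplitSize(64L * 1024 * 1024));
    }
}
```

With either option each record is assigned to exactly one split. In a
real job this logic would live in a custom InputFormat and RecordReader;
the sketch only shows the offsets those classes would need to compute.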