HDFS splits based on content semantics
Grandl Robert 2012-08-01, 13:44
This question has probably been answered many times, but I could not find a clear answer after searching on Google.
Does HDFS split the input solely based on a fixed block size, or does it take the semantics of the content into consideration?
For example, if I have a binary file, or I want blocks not to cut through lines of text, will I be able to instruct HDFS where to end each block?
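
To make the concern concrete, here is a minimal sketch in plain Python. It is not HDFS code: the helper `split_fixed` and the toy 16-byte block size are assumptions made up for illustration. It only shows what I mean by a purely size-based split cutting through a line of text.

```python
def split_fixed(data: bytes, block_size: int) -> list:
    """Cut data into consecutive chunks of at most block_size bytes.

    The content is ignored entirely; only byte offsets decide the cuts.
    (Hypothetical helper for illustration, not part of any Hadoop API.)
    """
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


text = b"first line\nsecond line\nthird line\n"
blocks = split_fixed(text, 16)  # 16 bytes is an assumed toy block size

for index, block in enumerate(blocks):
    # A chunk that does not end in a newline has cut a line in two.
    cuts_line = not block.endswith(b"\n")
    print(index, block, "cuts a line" if cuts_line else "ends on a line boundary")
```

With this input, the first two chunks end in the middle of a line, which is exactly the situation I would like to avoid or at least understand.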