HDFS splits based on content semantics

This question has probably been answered many times, but I could not find a clear answer after searching on Google.
Does HDFS split the input solely by a fixed block size, or does it take the semantics of the content into consideration?
For example, if I have a binary file, or I want a block not to cut through lines of text, can I tell HDFS where to end each block?
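
To make the concern concrete, here is a minimal sketch of what I mean by a purely size-based split. It is plain Python and does not use any HDFS API; `split_fixed`, the sample data, and the 10-byte block size are all made up for illustration.

```python
def split_fixed(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of block_size bytes.

    Hypothetical helper for illustration only; the last chunk may be shorter.
    """
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Three newline-terminated records, 23 bytes in total.
text = b"alpha,1\nbeta,2\ngamma,3\n"

# A tiny block size so the effect is visible; a real block would be far larger.
blocks = split_fixed(text, 10)

for n, block in enumerate(blocks):
    # Report whether each chunk happens to end on a record boundary.
    print(n, block, "ends on newline:", block.endswith(b"\n"))
```

In this sketch the first two chunks end in the middle of a record ("beta,2" and "gamma,3" are each cut in two), which is exactly the situation I would like to avoid or at least understand.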