MapReduce >> mail # user >> Using a custom FileSplitter?

Using a custom FileSplitter?
Assume I have one of the following two situations (in fact I have both):
1) I have a directory with several hundred files. Some fraction of these
need to be passed to the mapper (say, the ones ending in ".foo") and the
rest can be ignored. Assume I am unable or unwilling to create a directory
containing only the files that I need. How do I set up a custom file
splitter in Java code to filter my files?
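
For situation 1, a minimal sketch of the filtering rule, written against plain
strings so that it runs without Hadoop on the classpath. The class name
SuffixFilter and its methods are hypothetical. In a Hadoop job the same test
would normally sit inside accept(Path) of an org.apache.hadoop.fs.PathFilter
registered with FileInputFormat.setInputPathFilter, or be replaced by a glob
such as "dir/*.foo" as the input path. As far as I recall the filter can also
be asked about the input directory itself, so the sketch lets directories
through; treat that as an assumption to verify against your Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides which entries of an input directory to keep.
class SuffixFilter {
    private final String suffix;

    SuffixFilter(String suffix) {
        this.suffix = suffix;
    }

    // Directories are accepted so that the input directory is not rejected
    // before its files are listed (assumption about how the filter is applied).
    boolean accept(String name, boolean isDirectory) {
        if (isDirectory) {
            return true;
        }
        return name.endsWith(suffix);
    }

    // Convenience: keep only the plain file names that pass the filter.
    List<String> select(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (accept(name, false)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

With a suffix of ".foo", select keeps "a.foo" and "c.foo" out of
"a.foo", "b.txt", "c.foo".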

2) I have a collection of files that are not splittable, so I will use one
file per mapper. Special code is required to read each file and convert it
into lines of text, and I have Java code to do that. Same question: how do
I install a custom file splitter to decode the files in a custom manner?
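
For situation 2, a rough sketch of the reader side only, again in plain Java
so it runs on its own. WholeFileLineReader is a hypothetical stand-in shaped
like Hadoop's RecordReader (initialize, nextKeyValue, getCurrentKey,
getCurrentValue), and the decoder function stands for the existing custom
decoding code. In a real job the usual route is a FileInputFormat subclass
whose isSplitable returns false and whose createRecordReader returns a reader
along these lines; the wiring is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a RecordReader that handles one whole file.
class WholeFileLineReader {
    private final Function<byte[], List<String>> decoder; // custom decoding code
    private Iterator<String> lines;
    private long lineNumber = -1; // key: index of the current line
    private String current;       // value: the current decoded line

    WholeFileLineReader(Function<byte[], List<String>> decoder) {
        this.decoder = decoder;
    }

    // Reads the entire stream, because the file cannot be split.
    void initialize(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            lines = decoder.apply(buffer.toByteArray()).iterator();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Advances to the next decoded line; returns false when none are left.
    boolean nextKeyValue() {
        if (lines == null || !lines.hasNext()) {
            return false;
        }
        current = lines.next();
        lineNumber++;
        return true;
    }

    long getCurrentKey() {
        return lineNumber;
    }

    String getCurrentValue() {
        return current;
    }
}
```

Buffering the whole file in memory is only reasonable when each file is small
enough to fit; a streaming decoder would be needed otherwise.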

Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA