MapReduce, mail # user - Non utf-8 chars in input


Non utf-8 chars in input
Ajay Srivastava 2012-09-11, 05:54
Hi,

I am using the default InputFormat class to read input from text files, but the input file contains some non-UTF-8 characters.
I guess that TextInputFormat is the default InputFormat class and that it replaces these non-UTF-8 characters with "\uFFFD". If I do not want this behavior and need the actual characters in my mapper, which InputFormat class should I use?
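
To illustrate the replacement I am describing, here is a minimal sketch in plain Java with no Hadoop dependencies. The class name DecodeDemo, the helper names, and the sample bytes are made up for illustration, and ISO-8859-1 is only an assumed example of what the file's real encoding might be. It shows that decoding a byte that is not valid UTF-8 yields "\uFFFD", while decoding the same bytes with a single-byte charset keeps the character.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decode the first `length` bytes as UTF-8.
    // Malformed byte sequences are replaced with U+FFFD.
    static String decodeUtf8(byte[] bytes, int length) {
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    // Decode the first `length` bytes as ISO-8859-1 (assumed encoding),
    // which maps every byte value to exactly one character.
    static String decodeLatin1(byte[] bytes, int length) {
        return new String(bytes, 0, length, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical input line: "caf" followed by 0xE9, which is
        // 'e' with an acute accent in ISO-8859-1 but not valid UTF-8 here.
        byte[] line = {'c', 'a', 'f', (byte) 0xE9};

        String asUtf8 = decodeUtf8(line, line.length);
        String asLatin1 = decodeLatin1(line, line.length);

        // Report whether the replacement character shows up in each result.
        System.out.println("UTF-8 result contains U+FFFD: "
                + (asUtf8.indexOf('\uFFFD') >= 0));
        System.out.println("ISO-8859-1 result contains U+FFFD: "
                + (asLatin1.indexOf('\uFFFD') >= 0));
    }
}
```

If the same reasoning carries over to a mapper, the raw bytes would presumably be available from the Text value through getBytes() and getLength(), but I have not confirmed that this avoids the replacement.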

Regards,
Ajay Srivastava