Pig, mail # user - PigStorage's handling of InputFormat and OutputFormat


PigStorage's handling of InputFormat and OutputFormat
Raghu Angadi 2011-07-21, 18:12
The expectation from PigStorage.getInputFormat() is that it returns an
InputFormat<Writable, Text>, and PigStorage handles converting Text to
Tuple.
This is very useful: it makes it easy for users to plug in some other
input format.

But the same is not true for PigStorage.getOutputFormat(). Here it
expects an OutputFormat<Writable, Tuple>, so the output format itself has
to convert Tuple to Text.

I'm not sure whether this is intentional. I can submit a patch to move Tuple
handling into PigStorage. Then PigTextOutputFormat would be as thin as
PigTextInputFormat.
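
To make the proposal concrete, here is a rough sketch of the shape I have in mind. It deliberately does not use the real Pig or Hadoop classes: TextLineWriter, InMemoryTextWriter and DelimitedStorage are hypothetical stand-ins (for a text-only record writer, a thin text output format, and PigStorage respectively), and a tuple is modeled as a plain List. Writing null fields as empty strings is also just an assumption of the sketch. The point is only that the storage side owns the conversion in both directions, so the output side sees nothing but text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Simplified stand-ins only; these are NOT the real Pig/Hadoop classes.
public class ThinOutputFormatSketch {

    /** Hypothetical stand-in for a record writer that only understands text. */
    interface TextLineWriter {
        void write(String line);
    }

    /** Hypothetical thin "output format": it has no knowledge of tuples. */
    static final class InMemoryTextWriter implements TextLineWriter {
        final List<String> lines = new ArrayList<>();

        @Override
        public void write(String line) {
            lines.add(line);
        }
    }

    /** Hypothetical stand-in for the storage function. */
    static final class DelimitedStorage {
        private final String delimiter;

        DelimitedStorage(String delimiter) {
            this.delimiter = delimiter;
        }

        // Load side (as today): the storage turns text into a tuple.
        List<String> parse(String line) {
            // Limit -1 keeps trailing empty fields.
            return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
        }

        // Store side (as proposed): the storage turns the tuple into text
        // before handing it to the writer.
        void putNext(List<?> tuple, TextLineWriter writer) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < tuple.size(); i++) {
                if (i > 0) {
                    sb.append(delimiter);
                }
                Object field = tuple.get(i);
                if (field != null) { // assumption: null becomes an empty field
                    sb.append(field);
                }
            }
            writer.write(sb.toString());
        }
    }

    public static void main(String[] args) {
        InMemoryTextWriter writer = new InMemoryTextWriter();
        DelimitedStorage storage = new DelimitedStorage("\t");
        storage.putNext(Arrays.asList("alice", 42, null), writer);
        System.out.println(writer.lines);
    }
}
```

With this split, any writer that accepts lines of text could be swapped in on the store side, the same way any text-producing input format can be used on the load side today.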