Pig >> mail # user >> PigStorage's handling of InputFormat and OutputFormat


PigStorage's handling of InputFormat and OutputFormat
PigStorage.getInputFormat() is expected to return an
InputFormat<Writable, Text>, and PigStorage itself handles converting Text to
Tuple.
This is very useful: it makes it easy for users to plug in some other input format.

But the same is not true for PigStorage.getOutputFormat(). Here it
expects an OutputFormat<Writable, Tuple>, so the output format itself needs to
convert Tuple to Text.

I am not sure whether this is intentional. I can submit a patch to move Tuple
handling into PigStorage, so that PigTextOutputFormat would be as thin as
PigTextInputFormat.
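
To make the proposal concrete, here is a rough sketch of the shape I have in mind. It deliberately uses stand-in types rather than the real Pig or Hadoop classes: TextLineWriter, DelimitedStorage, and demo() are hypothetical names, a List stands in for Tuple, and rendering a null field as an empty string is an assumption on my part. The point is only that the storage side does the tuple-to-text conversion, so the writer behind it only ever sees text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types only: these are NOT the real Pig or Hadoop classes.
public class ThinOutputFormatSketch {

    /** Hypothetical stand-in for a record writer that only understands text lines. */
    interface TextLineWriter {
        void write(String line);
    }

    /** Hypothetical stand-in for a storage function that owns the tuple-to-text step. */
    static final class DelimitedStorage {
        private final String delimiter;
        private final TextLineWriter writer;

        DelimitedStorage(String delimiter, TextLineWriter writer) {
            this.delimiter = delimiter;
            this.writer = writer;
        }

        /** Converts one tuple (here just a List) to a delimited line, then hands text to the writer. */
        void putNext(List<?> tuple) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < tuple.size(); i++) {
                if (i > 0) {
                    line.append(delimiter);
                }
                Object field = tuple.get(i);
                // Assumption for this sketch: a null field becomes an empty string.
                if (field != null) {
                    line.append(field);
                }
            }
            writer.write(line.toString());
        }
    }

    /** Writes two sample tuples through the storage and returns the lines the writer received. */
    static List<String> demo() {
        List<String> sink = new ArrayList<>();
        DelimitedStorage storage = new DelimitedStorage("\t", sink::add);
        storage.putNext(Arrays.asList("alice", 30, null));
        storage.putNext(Arrays.asList("bob", 25, "nyc"));
        return sink;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

With this split, swapping in a different text-based writer would not require that writer to know anything about tuples, which mirrors how the input side works today.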