MapReduce >> mail # user >> How to use CombineFileInputFormat in Hadoop?


How to use CombineFileInputFormat in Hadoop?
Gentles,

I want to use the CombineFileInputFormat of Hadoop 0.20.0 / 0.20.2 such
that it processes one file per record without compromising data locality
(which it normally takes care of).

It is mentioned in Tom White's Hadoop: The Definitive Guide, but he does
not show how to do it. Instead, he moves on to SequenceFiles.

I am pretty confused about the meaning of the "processed" variable in a
record reader. Any code example would be of tremendous help.

Thanks in advance.
Cheers!
Manoj.
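
For readers with the same question: in whole-file record readers, a "processed" flag is typically just a boolean that records whether the reader has already handed out its single record. The sketch below is a plain-Java stand-in for that pattern, not the Hadoop RecordReader API. The class name WholeFileReaderSketch and its method names are hypothetical, and the file contents are passed in as a byte array so that the example runs without Hadoop on the classpath.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a one-record-per-file reader.
// It only mimics the shape of a record reader; it does not use Hadoop classes.
class WholeFileReaderSketch {
    private final byte[] fileContents; // assumed to be the bytes of one whole file
    private byte[] currentValue;       // the single record, once it has been read
    private boolean processed = false; // true after the one record was emitted

    WholeFileReaderSketch(byte[] fileContents) {
        this.fileContents = fileContents;
    }

    // Returns true exactly once (the whole file as one record), then false.
    boolean nextKeyValue() {
        if (processed) {
            return false; // the file was already emitted, so there are no more records
        }
        currentValue = fileContents;
        processed = true;
        return true;
    }

    byte[] getCurrentValue() {
        return currentValue;
    }

    // Progress is all-or-nothing because there is only one record.
    float getProgress() {
        return processed ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        byte[] data = "contents of one small file".getBytes(StandardCharsets.UTF_8);
        WholeFileReaderSketch reader = new WholeFileReaderSketch(data);
        int records = 0;
        while (reader.nextKeyValue()) { // the loop body runs once per file
            records++;
        }
        System.out.println("records emitted: " + records);
    }
}
```

With a combined split covering several files, one such reader would be created per file, so a map task would see one record for each file in the split. How that wiring is done depends on the Hadoop version and API in use.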