Hadoop >> mail # user >> How to make zip files as Hadoop input


How to make zip files as Hadoop input
Hi,

I have a bunch of zip files that I want to serve as input to a MapReduce
job. My initial design was to list them in a text file and then give this
list file as input. The list file would be read, and each line would be
handed off to a node to process, which would pick up the corresponding zip
file and work on it.

But I feel that a better design is possible, and that the extra list file is
redundant. Can I pass the directory that holds the zip files directly as the
job input? If so, how do I make sure each node gets a file to process?

Thank you,
Mark
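
For illustration, the list-file design described in the message could be sketched roughly as follows. This is only a minimal sketch, not Mark's actual code: it assumes a Hadoop Streaming style mapper that receives one zip path per input line, and it assumes each path can be opened directly from the node running the task (for example a shared mount). The function name map_zip_list and the tab-separated output format are hypothetical.

```python
import sys
import zipfile


def map_zip_list(lines, out=sys.stdout):
    """Treat each non-empty input line as the path of one zip file.

    Assumption: the path is readable from the local task; a real job
    would likely need to fetch the archive from HDFS first.
    """
    for line in lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines in the list file
        with zipfile.ZipFile(path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # directory entries carry no data
                # Emit one record per archive member: name, TAB, uncompressed size.
                out.write(f"{path}/{info.filename}\t{info.file_size}\n")
```

As a streaming mapper script, this would be driven by calling map_zip_list(sys.stdin), so that each task processes only the zip files named in the lines it was given.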