
How to make zip files as Hadoop input

I have a bunch of zip files that I want to serve as input to a MapReduce
job. My initial design was to list them in a text file and then give this
list file as input. The list file would be read, and each line would be
handed off to a node to process, which would pick up the corresponding zip
file and work on it.
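
To make the list-file design concrete, here is a minimal sketch in plain Java
(no Hadoop classes involved). The class name ZipListFile and its methods are
hypothetical, made up for illustration: writeListFile builds the list with one
zip path per line, and countEntries stands in for whatever work a task would
do on the zip file named by the line it receives. It assumes the zip files sit
on a local filesystem; for files in HDFS the same idea would have to go through
the Hadoop FileSystem API instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;
import java.util.zip.ZipInputStream;

// Hypothetical helper illustrating the list-file approach; not part of Hadoop.
final class ZipListFile {

    // Collect the absolute paths of all *.zip files directly under dir, sorted.
    static List<String> listZipFiles(Path dir) throws IOException {
        List<String> paths = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile)
                   .filter(p -> p.getFileName().toString().endsWith(".zip"))
                   .forEach(p -> paths.add(p.toAbsolutePath().toString()));
        }
        Collections.sort(paths);
        return paths;
    }

    // Write the list file: one zip path per line.
    static void writeListFile(Path dir, Path listFile) throws IOException {
        Files.write(listFile, listZipFiles(dir), StandardCharsets.UTF_8);
    }

    // Placeholder for the per-line work: open the zip and count its entries.
    static int countEntries(Path zipFile) throws IOException {
        int count = 0;
        try (InputStream in = Files.newInputStream(zipFile);
             ZipInputStream zip = new ZipInputStream(in)) {
            while (zip.getNextEntry() != null) {
                count++;
            }
        }
        return count;
    }
}
```

Each line of the resulting file is then one unit of work: whoever receives the
line opens that one zip file and processes it.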

But I feel that a better design is possible, and that my way is redundant.
Can I just give the input directory as input? How do I make sure each node
gets a file to process?

Thank you,