MapReduce, mail # user - Streaming issue ( URGENT )


Streaming issue ( URGENT )
Siddharth Tiwari 2012-08-20, 16:33

Hi team,
I have a Python script that I normally run locally like this:
python mapper.py file1 file2 2
How can I achieve the same thing with the streaming API, using the script as the mapper? It joins the three files on a column that is passed as a numeric parameter.
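
One common way to approach this kind of join in streaming is to have the mapper read records from stdin instead of opening the files itself, and emit each record prefixed by its join key and the name of the file it came from. The following is only a minimal sketch under several assumptions not stated in the question: the records are tab-separated, the numeric parameter is a zero-based column index, and the function name map_lines is hypothetical. Hadoop streaming exposes job properties as environment variables with dots replaced by underscores, so the input file name should be available as map_input_file (or mapreduce_map_input_file on newer releases).

```python
import os
import sys


def map_lines(lines, key_col, source, sep="\t"):
    """Yield 'key<TAB>source<TAB>record' for each input record.

    key_col is assumed to be a zero-based column index (an assumption;
    the original script's convention is not stated).
    """
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split(sep)
        if key_col >= len(fields):
            continue  # skip rows that do not have the join column
        yield sep.join([fields[key_col], source, line])


if __name__ == "__main__" and len(sys.argv) > 1:
    # The join column arrives as the first argument, e.g. "python mapper.py 2".
    key_col = int(sys.argv[1])
    # Streaming exports the current input file name in the environment;
    # the variable name depends on the Hadoop version.
    path = (os.environ.get("mapreduce_map_input_file")
            or os.environ.get("map_input_file", "unknown"))
    for out in map_lines(sys.stdin, key_col, os.path.basename(path)):
        print(out)
```

With a script like this, the streaming job would typically be launched with options such as -input (once per file or directory), -output, -mapper "python mapper.py 2", and -file mapper.py to ship the script to the cluster; the location of the streaming jar depends on the installation.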

Also, how can I use the paste command in a mapper to concatenate three files? For example, in a normal shell:
paste file1 file2 file3 > file4
How can I achieve this over streaming?

If possible, please explain how I can achieve this using multiple mappers and one reducer. It would be great if I could get some examples; I have tried searching a lot :(
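
For the multiple-mappers, one-reducer layout, the usual pattern is a reduce-side join: the framework sorts mapper output by key, so a single reducer sees all records for a key on consecutive lines. The following is only a rough sketch, assuming each mapper emits lines of the form key<TAB>source<TAB>record (a format chosen for illustration, not taken from the question); the function name reduce_lines is hypothetical. It concatenates the records for each key in source-name order, which is a simplification of a full join.

```python
import sys
from itertools import groupby


def reduce_lines(lines, sep="\t"):
    """Group sorted 'key<TAB>source<TAB>record' lines by key and emit
    one joined line per key."""
    split = (line.rstrip("\n").split(sep, 2) for line in lines)
    parsed = (parts for parts in split if len(parts) == 3)  # drop malformed lines
    for key, group in groupby(parsed, key=lambda parts: parts[0]):
        by_source = {}
        for _, source, record in group:
            by_source.setdefault(source, []).append(record)
        # Concatenate records in source-name order so columns line up
        # the same way for every key.
        joined = [rec for src in sorted(by_source) for rec in by_source[src]]
        yield sep.join([key] + joined)


if __name__ == "__main__" and not sys.stdin.isatty():
    for out in reduce_lines(sys.stdin):
        print(out)
```

Running the job with the streaming option -numReduceTasks 1 should send every key to the same reducer, at the cost of parallelism in the reduce phase.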

Thanks in advance. Please help.

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."

"Maybe other people will try to limit me but I don't limit myself"