Re: Executing a Python program inside Map Function
Java provides the Process class to help you launch a process and
read from/write to it:
http://docs.oracle.com/javase/6/docs/api/java/lang/Process.html. You
can use it to spawn your program from your code, write input to the
process's stdin, and read its output from its stdout (or stderr). The
hadoop-streaming part of Apache Hadoop works in a very similar way,
but it gives you little control over the Java map class it launches,
which is what you seem to require.
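As a rough illustration of the Process approach, here is a minimal sketch (not code from Hadoop itself). The class name ScriptRunner and its run() method are made up for this example, and it assumes Java 8 or later for the lambda. It starts a command, feeds it a string on stdin, and collects the lines it prints on stdout:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for this example only.
final class ScriptRunner {

    // Runs the command, writes `input` to its stdin, returns its stdout lines.
    static List<String> run(List<String> command, String input)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        // Write on a separate thread so that a child producing a lot of
        // output cannot block us while we are still writing its input.
        Thread writer = new Thread(() -> {
            try (BufferedWriter stdin = new BufferedWriter(new OutputStreamWriter(
                    process.getOutputStream(), StandardCharsets.UTF_8))) {
                stdin.write(input);
            } catch (IOException e) {
                // The child may have exited before reading all of its input.
            }
        });
        writer.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader stdout = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = stdout.readLine()) != null) {
                lines.add(line);
            }
        }

        writer.join();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode);
        }
        return lines;
    }
}
```

A call would look something like ScriptRunner.run(Arrays.asList("python", "/path/to/script.py"), inputText), where the script path is a placeholder and python must be on the PATH of every node that runs the task.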

The tasks (both map and reduce types) provide entry and exit API
points (configure()/setup() on entry and close()/cleanup() on exit,
depending on whether you use the old or the new API). These let you
spawn the process before map() starts receiving records and end it
after the last record, so you can manage the spawned process more
cleanly.
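One way to arrange this, sketched under assumptions: keep the child process in a small helper object that the task creates in setup(), calls once per record in map(), and closes in cleanup(). The class name LineProcess and its exchange() method are hypothetical, and the sketch assumes the script prints exactly one line per input line and flushes it immediately (for Python, running with "python -u" is one way to avoid buffering). Without that, exchange() would block.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: one long-lived child process per task.
// Intended use from a Mapper (new API):
//   setup(context)   -> script = new LineProcess(command);
//   map(k, v, ctx)   -> String reply = script.exchange(v.toString());
//   cleanup(context) -> script.close();
final class LineProcess implements Closeable {
    private final Process process;
    private final BufferedWriter stdin;
    private final BufferedReader stdout;

    LineProcess(List<String> command) throws IOException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the child's stderr go to the task's own stderr log.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        process = builder.start();
        stdin = new BufferedWriter(new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8));
        stdout = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8));
    }

    // Sends one record and waits for the single line the child answers with.
    String exchange(String record) throws IOException {
        stdin.write(record);
        stdin.newLine();
        stdin.flush(); // make sure the child sees the record now
        String reply = stdout.readLine();
        if (reply == null) {
            throw new IOException("Child process closed its output");
        }
        return reply;
    }

    @Override
    public void close() throws IOException {
        stdin.close();  // EOF on stdin tells the child to finish
        stdout.close();
        try {
            process.waitFor();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for child", e);
        }
    }
}
```

Starting the process once per task rather than once per record avoids paying the interpreter start-up cost on every map() call, which is the main reason to use the entry and exit hooks here.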

On Sat, Jan 26, 2013 at 12:40 PM, Sundeep Kambhampati wrote:
> Is it possible to run a Python script inside a Map function which is in
> Java?
> I want to run a Python script which is on my local disk and I want to use
> the output of that script for further processing in the Map function to
> produce <key/value> pairs.
> Can someone give me some idea how to do it?
> Regards
> Sundeep

Harsh J