HDFS >> mail # user >> Re: Executing a Python program inside Map Function


Re: Executing a Python program inside Map Function
Java provides the Process class to help you launch processes and
read from/write to them:
http://docs.oracle.com/javase/6/docs/api/java/lang/Process.html. You
can use this to spawn your program from your code, to write input into
the process's stdin, and to read its output via its stdout (or stderr).
The hadoop-streaming part of Apache Hadoop is very similar in its
operation, but it gives you little control over the Java map class
that does the launching, which you seem to require.
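
As a rough illustration of the spawn/write/read cycle described above,
here is a minimal sketch using ProcessBuilder from the standard
library. The class name ProcessRunner and the method run() are made up
for this example, and the approach assumes the input is small: it
writes all input before reading any output, which can block if the
child produces a lot of output before it has consumed its stdin.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hadoop): runs an external command,
// feeds it a string on stdin, and returns its output lines.
public class ProcessRunner {

    public static List<String> run(List<String> command, String input)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a single reader sees everything.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        // getOutputStream() is connected to the child's stdin.
        // Closing the writer signals end-of-input to the child.
        try (BufferedWriter toChild = new BufferedWriter(new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8))) {
            toChild.write(input);
        }

        // getInputStream() is connected to the child's stdout.
        List<String> lines = new ArrayList<>();
        try (BufferedReader fromChild = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = fromChild.readLine()) != null) {
                lines.add(line);
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with status " + exitCode);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ProcessRunner <command> [args...]");
            return;
        }
        // Example input only; a real caller would pass its own data.
        for (String line : run(Arrays.asList(args), "example input\n")) {
            System.out.println(line);
        }
    }
}
```

For the Python case, the command might look something like
Arrays.asList("python", "/path/to/script.py"), where the path is a
placeholder for wherever your script lives.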

Tasks (both map and reduce) provide entry and exit API points
(configure()/setup() on entry, close()/cleanup() on exit), allowing you
to spawn a process before the map reads start and end it after they
finish, which lets you manage your spawned process more cleanly.
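
To show the shape of that lifecycle, here is a sketch in plain Java
with no Hadoop dependency. ExternalProcessTask is a made-up stand-in:
in a real job its setup(), map() and cleanup() bodies would go into the
corresponding methods of your Mapper subclass. It assumes the child
answers each input line with exactly one output line and flushes after
each one (for a Python script that probably means running it with
"python -u" or flushing stdout explicitly).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for a map task that keeps one child process
// alive across all records instead of spawning one per record.
public class ExternalProcessTask {
    private final List<String> command;
    private Process process;
    private BufferedWriter toChild;
    private BufferedReader fromChild;

    public ExternalProcessTask(List<String> command) {
        this.command = command;
    }

    // Entry point: start the child once, before any records arrive.
    public void setup() throws IOException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Pass the child's stderr through so its pipe never fills up.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        process = builder.start();
        toChild = new BufferedWriter(new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8));
        fromChild = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8));
    }

    // Per record: send one line, read one line back.
    // Assumes a strict one-line-in, one-line-out protocol.
    public String map(String record) throws IOException {
        toChild.write(record);
        toChild.write('\n');
        toChild.flush();
        return fromChild.readLine();
    }

    // Exit point: close stdin so the child sees end-of-input,
    // discard any remaining output, then wait for it to exit.
    public int cleanup() throws IOException, InterruptedException {
        toChild.close();
        while (fromChild.readLine() != null) {
            // Nothing to do; just drain until end-of-stream.
        }
        fromChild.close();
        return process.waitFor();
    }
}
```

In a Mapper, the value returned by map() here would be what you turn
into key/value pairs and emit.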

On Sat, Jan 26, 2013 at 12:40 PM, Sundeep Kambhampati
<[EMAIL PROTECTED]> wrote:
> Is it possible to run a Python script inside a Map function which is in
> Java?
>
> I want to run a Python script which is on my local disk and I want to use
> the output of that script for further processing in the Map function to
> produce <key/value> pairs.
> Can someone give me some idea how to do it?
>
>
> Regards
> Sundeep

--
Harsh J