Re: Not saving any output
You can have your Python streaming script simply not write any key/value pairs to stdout; the job output will then be empty.

Independently, your script could do anything external, such as connecting to a remote database and storing data there. You probably want to avoid having too many tasks do this in parallel, since every task opens its own connection to the database.
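
As a rough illustration of that idea, here is a minimal sketch of a streaming mapper that prints nothing to stdout and hands its records to an external store in batches. It assumes pymongo is installed on every task node; the host, database and collection names, the batch size and the parse_line logic are all made up for the example and are not from the original question.

```python
import sys

BATCH_SIZE = 500  # hypothetical value; tune it for your database


def parse_line(line):
    """Turn one input line into a document (placeholder logic)."""
    return {"text": line.rstrip("\n")}


def process(lines, store_batch, batch_size=BATCH_SIZE):
    """Pass parsed records to store_batch in batches; write nothing to stdout."""
    batch = []
    stored = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            store_batch(batch)
            stored += len(batch)
            batch = []
    if batch:  # flush whatever is left over
        store_batch(batch)
        stored += len(batch)
    return stored


def main():
    # Assumption: pymongo is available on the task nodes.
    # "dbhost", "mydb" and "lines" are hypothetical names.
    from pymongo import MongoClient

    client = MongoClient("mongodb://dbhost:27017")
    collection = client["mydb"]["lines"]
    stored = process(sys.stdin, collection.insert_many)
    # stderr goes to the task logs, not to the job output.
    sys.stderr.write("stored %d records\n" % stored)


# When used as the actual mapper script, end the file with a call to main().
```

Batching keeps the number of round trips per task low, but it does not limit how many tasks run at once; that still has to be controlled on the job side.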

The more common approach, though, is a regular job that writes its data to HDFS, followed by Sqoop to load that data into an RDBMS. But it's your choice.
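
For comparison, the "regular job" variant could look roughly like the sketch below: the mapper writes tab-separated key/value lines to stdout, Hadoop stores them in HDFS, and a separate export step loads them into the database afterwards. The record layout (first word as key, word count as value) is purely hypothetical, and the later export would need to be told that fields are tab-separated.

```python
import sys


def records(lines):
    """Yield one tab-separated key/value record per non-empty input line."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # nothing to emit for blank lines
        key = fields[0]      # hypothetical key: first word of the line
        value = len(fields)  # hypothetical value: number of words
        yield "%s\t%d" % (key, value)


def main():
    # Streaming treats everything up to the first tab as the key.
    for record in records(sys.stdin):
        sys.stdout.write(record + "\n")


# When used as the actual mapper script, end the file with a call to main().
```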

Kai

On 28.05.2013 at 20:57, jamal sasha <[EMAIL PROTECTED]> wrote:

> Hi,
>   I want to process some text files and then save the output in a db.
> I am using python (hadoop streaming).
> I am using mongo as backend server.
> Is it possible to run hadoop streaming jobs without specifying any output?
> What is the best way to deal with this.
>

--
Kai Voigt
[EMAIL PROTECTED]