HDFS, mail # user - Re: Not saving any output


Re: Not saving any output
Kai Voigt 2013-05-28, 20:43
Your Python streaming script can simply not write any key/value pairs to stdout, which gives you an empty job output.
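
A minimal sketch of such a script, assuming a map-only streaming job. The function name `run_mapper` and the counter names `Example` and `LinesProcessed` are made up for illustration. It consumes every input line, prints nothing to stdout, and only reports a counter on stderr in the `reporter:counter:<group>,<counter>,<amount>` form that Hadoop Streaming understands.

```python
import sys


def run_mapper(lines, err=sys.stderr):
    """Consume all input lines without emitting any key/value pairs."""
    processed = 0
    for line in lines:
        if line.strip():
            # Real per-record work (parsing, external writes) would go here.
            processed += 1
    # Nothing is written to stdout, so the job output stays empty.
    # Counter updates go to stderr, where Hadoop Streaming picks them up.
    err.write("reporter:counter:Example,LinesProcessed,%d\n" % processed)
    return processed
```

To use it as the mapper, the script would end with a call to `run_mapper(sys.stdin)`. The job itself still needs an output directory; it will just contain empty part files.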

Independently of that, your script can do anything external, such as connecting to a remote database and storing data in it. You probably want to avoid having too many tasks do this in parallel, since every task opens its own connection to the database.
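
A rough sketch of the external-write approach, under stated assumptions: input lines are tab-separated `key<TAB>value`, and `sink` is a hypothetical callable that stores a list of records (in a real job it might wrap something like pymongo's `insert_many`, which is not shown here). Batching reduces the number of round trips each task makes, but it does not by itself limit how many tasks run at once.

```python
def store_in_batches(lines, sink, batch_size=500):
    """Parse tab-separated lines and pass them to `sink` in batches.

    `sink` is a hypothetical callable taking a list of dicts; it stands in
    for whatever database client the job actually uses.
    """
    batch = []
    stored = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab; a line without a tab gets an empty value.
        key, _, value = line.partition("\t")
        batch.append({"key": key, "value": value})
        if len(batch) >= batch_size:
            sink(batch)
            stored += len(batch)
            batch = []
    if batch:
        # Flush whatever is left after the input ends.
        sink(batch)
        stored += len(batch)
    return stored
```

In a streaming script this would be called as `store_in_batches(sys.stdin, sink)`, with nothing printed to stdout.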

The more common approach, though, is a regular job that writes its data to HDFS, followed by Sqoop to export that data into an RDBMS. But it's your choice.

Kai

On 28.05.2013 at 20:57, jamal sasha <[EMAIL PROTECTED]> wrote:

> Hi,
>   I want to process some text files and then save the output in a db.
> I am using Python (Hadoop streaming).
> I am using MongoDB as the backend server.
> Is it possible to run Hadoop streaming jobs without specifying any output?
> What is the best way to deal with this?
>

--
Kai Voigt
[EMAIL PROTECTED]