On Thu, May 2, 2013 at 2:03 PM, Chengi Liu <[EMAIL PROTECTED]> wrote:
> Hi,
> I am using hadoop streaming api (python) for some processing.
> While I want the data to be processed via hadoop but I want to pipe it to db
> instead of hdfs.
> How do I do this?
> Thanks
Or even use the Hive JDBC driver to query your result data from outside the Hadoop cluster.
You can also write your own OutputFormat (with the Java API) that writes data directly to the database, but be careful with large result sets or with a large number of reducers. Each reduce task typically opens its own connection and writes concurrently with the others, so this can become a scalability problem on the database side. A small dataset coming out of a single reducer can be handled that way.
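Since you are on the streaming API, the same idea can also be tried in the Python reducer itself instead of a Java OutputFormat. Below is only a rough sketch under some assumptions: the reducer input is tab-separated "key<TAB>integer" lines sorted by key, the reducer sums the values per key, and sqlite3 stands in for whatever DB-API driver your database uses. The names reduce_to_db, results and BATCH_SIZE are made up for illustration.

```python
#!/usr/bin/env python
"""Sketch of a streaming reducer that writes per-key totals to a database."""
import sqlite3
from itertools import groupby

BATCH_SIZE = 1000  # hypothetical value; tune it for your database


def parse(lines):
    """Yield (key, int value) pairs from tab-separated streaming input."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        key, _, value = line.partition("\t")
        yield key, int(value)


def flush(conn, batch):
    """Write one batch and commit it."""
    # INSERT OR REPLACE is SQLite syntax; other databases have their own
    # upsert form. An idempotent write matters because Hadoop may re-run
    # a failed or speculative task and send the same keys again.
    conn.executemany("INSERT OR REPLACE INTO results VALUES (?, ?)", batch)
    conn.commit()


def reduce_to_db(lines, conn, batch_size=BATCH_SIZE):
    """Sum values per key (input sorted by key) and insert totals in batches.

    Returns the number of rows written.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (k TEXT PRIMARY KEY, total INTEGER)"
    )
    batch, written = [], 0
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        batch.append((key, sum(value for _, value in group)))
        if len(batch) >= batch_size:
            flush(conn, batch)
            written += len(batch)
            batch = []
    if batch:
        flush(conn, batch)
        written += len(batch)
    return written


if __name__ == "__main__":
    # Small local demo with an in-memory database and made-up input.
    demo_conn = sqlite3.connect(":memory:")
    reduce_to_db(["a\t1\n", "a\t2\n", "b\t5\n"], demo_conn)
    print(demo_conn.execute("SELECT k, total FROM results ORDER BY k").fetchall())
    demo_conn.close()
```

In an actual streaming job you would pass sys.stdin and a connection to your real database instead of the demo input. Batching keeps the number of round trips down, but every reducer still holds its own connection, so the warning above about many reducers applies here as well.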