(2) Running R from a MapReduce job? For the latter, without much extra ceremony, you could use either Hadoop Streaming or Pig to call a custom R program, as long as R is installed on every node of the cluster itself.
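To make the Streaming route concrete: Streaming's only contract is that the mapper and reducer are executables reading lines on stdin and writing tab-separated key/value pairs on stdout. The sketch below simulates that pipeline locally with plain shell scripts standing in for R scripts (the file names, the word-count logic, and the sample input are all illustrative, not from the thread); an R script invoked via Rscript would plug into the same pipeline unchanged, provided R is installed on every node.

```shell
# Mapper stand-in: emit "word<TAB>1" for every whitespace-separated token.
# (An Rscript mapper would read stdin and print the same key/value lines.)
cat > mapper.sh <<'EOF'
#!/bin/sh
awk '{ for (i = 1; i <= NF; i++) print $i "\t1" }'
EOF
chmod +x mapper.sh

# Reducer stand-in: sum the counts for each word.
cat > reducer.sh <<'EOF'
#!/bin/sh
awk -F'\t' '{ c[$1] += $2 } END { for (w in c) print w "\t" c[w] }'
EOF
chmod +x reducer.sh

# Simulate the Streaming data flow locally: map | shuffle (sort) | reduce.
printf 'hadoop runs r\nr on hadoop\n' | ./mapper.sh | LC_ALL=C sort | ./reducer.sh | LC_ALL=C sort

# On a real cluster the same executables are submitted via the streaming jar
# (the jar path varies by distribution):
#   hadoop jar hadoop-streaming.jar -input in -output out \
#     -mapper mapper.sh -reducer reducer.sh -file mapper.sh -file reducer.sh
```

Because the cluster only sees opaque executables, the same submission works for R, Python, or any other runtime, which is exactly why no extra ceremony is needed beyond installing R on the nodes.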
On Wed, Mar 26, 2014 at 6:39 AM, Saravanan Nagarajan < [EMAIL PROTECTED]> wrote:
Below is my understanding of the Hadoop+R environment.
1. R contains many data mining algorithms; to reuse them on Hadoop we have tools like RHIPE, RHadoop, etc.
2. These tools run R algorithms as Hadoop MapReduce jobs (e.g. via the rmr package), but I am not sure whether this will work for all algorithms in R.
Please let me know if you have any other points.
Thanks,
Saravanan
linkedin.com/in/saravanan303

On Wed, Mar 26, 2014 at 5:35 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:
Try the open-source h2o.ai, a CRAN-style package that allows fast, scalable R on Hadoop in memory. You can invoke single-threaded R from the h2o package, while the runtime on the cluster is Java (not R!), so you get better memory management.