Hive, mail # user - question about machine learning on Hive

Re: question about machine learning on Hive
Igor Tatarinov 2013-01-17, 21:29
Here is how Twitter does it with Pig:

We use a similar approach, and I think that Pig, being somewhat lower-level
and having better support for nested objects, is a better tool for this
than Hive. It should be possible to do something similar with Hive, but we
haven't tried. The trick is to implement the learner as a serializer, so
that each reducer trains a model on the records it is asked to write out.
The number of reducers then determines how many parallel learners (bags)
you can run.
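
To make the idea concrete, here is a rough, framework-free sketch in Python.
It is not Pig or Hive code and not Twitter's implementation: LearnerSink,
put_next, train_bags and ensemble_predict are hypothetical names, the
"shuffle" is simulated with a random number generator, and logistic
regression trained by SGD is only an assumed choice of learner.

```python
import math
import random


class LearnerSink:
    """Hypothetical stand-in for a serializer / storage function: instead of
    writing each record out, it updates a logistic-regression model."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, features):
        z = sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def put_next(self, features, label):
        # One SGD step on the logistic loss; label is 0 or 1.
        error = label - self.predict(features)
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * error * x


def train_bags(records, n_reducers, n_features, seed=0):
    """Simulate the shuffle: each record goes to a random reducer, and each
    reducer feeds its share of the data to its own learner (one bag each)."""
    rng = random.Random(seed)
    sinks = [LearnerSink(n_features) for _ in range(n_reducers)]
    for features, label in records:
        sinks[rng.randrange(n_reducers)].put_next(features, label)
    return sinks


def ensemble_predict(sinks, features):
    # Combine the bags by averaging their predicted probabilities.
    return sum(s.predict(features) for s in sinks) / len(sinks)
```

Calling train_bags(records, n_reducers=4, n_features=2) yields four
independently trained models, which mirrors the point above: the reducer
count sets the number of bags. In a real job, each sink would presumably
serialize its weights as the reducer's output instead of keeping them in
memory.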


On Thu, Jan 17, 2013 at 1:23 PM, qiaoresearcher <[EMAIL PROTECTED]> wrote:

> How can I run machine learning algorithms (any ML algorithm) directly
> in Hive? Assume the input and output are already stored as Hive tables.
> PS: I know Mahout is available, but I would prefer to run machine
> learning algorithms directly in Hive.
> Many thanks,