

Re: best way to join?
On Sun, Sep 2, 2012 at 12:26 PM, dexter morgan <[EMAIL PROTECTED]> wrote:

> ... Either way, any clustering process requires calculating the distance
> of all points (not between all the points, but of all of them to some
> relative point). Because I'll need a clustering MR job, I'll probably use
> it, even though, as you said, it only has a high probability of being
> correct (not 100%)...
>

This is probably right as stated, but I think there is some confusion here.

Many people assume that each point in the training data must have its
distance computed to every centroid in the clustering.  Even that is not
true.

It is true that you have to compute the distance to at least one thing, but
not necessarily to all of the clusters.
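
One way to see why is triangle-inequality pruning. This is my own choice of illustration, not necessarily the technique meant above, and the function names are hypothetical. If the current best centroid b is at distance d from a point, and another centroid c is at least 2*d away from b, then c cannot be closer to the point than b, so that distance never needs to be computed. A minimal sketch, assuming Euclidean distance and a small set of centroids held in memory:

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_distances(centroids):
    """Precompute the k x k matrix of centroid-to-centroid distances."""
    k = len(centroids)
    return [[dist(centroids[i], centroids[j]) for j in range(k)]
            for i in range(k)]


def assign(point, centroids, cc):
    """Return (index of nearest centroid, number of distances computed).

    cc is the matrix from center_distances(). A candidate j is skipped
    when cc[best][j] >= 2 * best_d, because the triangle inequality gives
    d(point, c_j) >= cc[best][j] - best_d >= best_d.
    """
    best = 0
    best_d = dist(point, centroids[0])
    computed = 1
    for j in range(1, len(centroids)):
        if cc[best][j] >= 2 * best_d:
            continue  # c_j cannot beat the current best; skip the distance
        d = dist(point, centroids[j])
        computed += 1
        if d < best_d:
            best, best_d = j, d
    return best, computed
```

With centroids at the corners of a 10 x 10 square, a point at (1, 1) is assigned to the corner at (0, 0) after a single distance computation, since every other centroid is pruned. The centroid-to-centroid matrix costs k*k distances up front, so this only pays off when there are many more points than clusters.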