Pig >> mail # user >> matrix multiplication


matrix multiplication
Hi,
I am trying to do matrix multiplication using Pig.

Basically I have data in the form:
data1.txt
item1,item2,0.3
item1,item3,0.4
item1,item5,0.6

And then I have another data set in the form:
data2.txt
user1,item1
user1,item2
user1,item5
...
user2,item2
etc

Just to give some context: I am trying to build a top-n recommendation
system, which works as follows.
Matrix formed by data2.txt:
        item1  item2  item3  item4  item5
user1   1      1      0      0      1
Matrix formed by data1.txt:

        item1  item2  item3  item4  item5
item1   1      0.3    0.4    0      0.6
item2          1
item3                 1
item4                        1
item5                               1
So the recommendations for user1 come from a score computation as follows,
where u1j is user1's entry for item j and item_1j is item1's score with
item j:

Score for user1 for item1 (ignoring the item1, item1 score)
  = u12*item_12 + u13*item_13 + u14*item_14 + u15*item_15
  = 1*0.3 + 0*0.4 + 0*0 + 1*0.6
  = 0.9
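
The arithmetic above can be checked with a minimal Python sketch. The names `user1`, `item1_sim` and `score` are only illustrative, and the values are copied from the two matrices in this message; this is a plain check of the formula, not Pig code.

```python
# user1's row from the data2.txt matrix (1 = the user has the item)
user1 = {"item1": 1, "item2": 1, "item3": 0, "item4": 0, "item5": 1}

# item1's row from the data1.txt matrix (item-item scores)
item1_sim = {"item1": 1.0, "item2": 0.3, "item3": 0.4, "item4": 0.0, "item5": 0.6}

def score(user_row, item_row, target):
    """Dot product of the two rows, skipping the target item's own score."""
    return sum(user_row[i] * item_row[i] for i in user_row if i != target)

score_u1_i1 = score(user1, item1_sim, "item1")
print(round(score_u1_i1, 2))
```

Rounding is applied only for display, since 0.3 + 0.6 is not exactly 0.9 in floating point.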

And then I find this score for user1 and item2, then for user2 and
item1, and so on.
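
One way to think about computing every (user, item) score at once is as a join on the shared item followed by a grouped sum, which is roughly the shape a JOIN / GROUP / SUM pipeline would take. The Python sketch below only illustrates that idea and is not Pig code. It assumes the pairs in data1.txt are used exactly as listed (unlisted pairs contribute 0) and uses only the rows actually shown in this message; the function name `all_scores` is hypothetical.

```python
from collections import defaultdict

# Rows of data1.txt as (item_a, item_b, score).
item_pairs = [
    ("item1", "item2", 0.3),
    ("item1", "item3", 0.4),
    ("item1", "item5", 0.6),
]

# Rows of data2.txt as (user, item); only the rows shown in the message.
user_items = [
    ("user1", "item1"),
    ("user1", "item2"),
    ("user1", "item5"),
    ("user2", "item2"),
]

def all_scores(user_items, item_pairs):
    """Join on the shared item, then sum the scores per (user, target item)."""
    # Index the pairs by their second item, which acts as the join key.
    by_item = defaultdict(list)
    for target, other, s in item_pairs:
        if target != other:            # ignore an item's score with itself
            by_item[other].append((target, s))

    totals = defaultdict(float)
    for user, item in user_items:      # each row means the matrix entry is 1
        for target, s in by_item[item]:
            totals[(user, target)] += s
    return dict(totals)

scores = all_scores(user_items, item_pairs)
for (user, item), s in sorted(scores.items()):
    print(user, item, round(s, 2))
```

With only item1's pairs listed, the sketch produces scores for item1 alone; scores for other items would need their rows from data1.txt.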

I understand this is more of an implementation challenge, and I am not sure
whether this is the right place to ask, but any suggestions will be
greatly appreciated.
Thanks
Jamal