Great. Thanks a lot.
How do I sort the result by score and select the top 20 (say)?
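
One possible approach, offered as a sketch rather than a tested answer: it assumes `result` keeps the three fields generated by the script quoted below, in the order user, row, score, so the score is addressed by position as `$2`. The aliases `by_score`, `top20`, `by_user`, `best` and `top_per_user`, and the output paths, are made up for illustration.

```pig
-- Overall top 20: sort every (user, row, score) tuple by score, highest first.
-- $2 is the third field of result, assumed here to be the score.
by_score = order result by $2 desc;
top20 = limit by_score 20;
store top20 into 'top20';

-- Top 20 per user: group by the first field (assumed to be the user),
-- then order and limit inside a nested foreach.
by_user = group result by $0;
top_per_user = foreach by_user {
    ordered = order result by $2 desc;
    best = limit ordered 20;
    generate flatten(best);
};
store top_per_user into 'top20_per_user';
```

The first half gives a single global top 20; for a recommender the per-user variant is probably what is wanted, since it keeps the 20 best-scoring items for each user separately.
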
On Monday, October 22, 2012, Gunther Hagleitner <[EMAIL PROTECTED]> wrote:
> This should work:
> matrix = load 'data1.txt' using PigStorage(',') as (row:chararray,
> column:chararray, value:float);
> vectors = load 'data2.txt' using PigStorage(',') as (user:chararray,
> joined = join vectors by column, matrix by column;
> groups = group joined by (user, row);
> result = foreach groups generate group.user, group.row, (float)
> store result into 'result';
> On Sun, Oct 21, 2012 at 7:40 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
>> I am trying to do matrix multiplication using Pig.
>> Basically I have data in the form:
>> item1, item3, 0.4
>> item1, item5, 0.6
>> And then I have another data set in the form
>> Just to give some context: I am trying to build a top-n recommendation
>> system, which works as follows.
>> Matrix formed by data2.txt:
>>         item1  item2  item3  item4  item5
>> user1   1      1      0      0      1
>> Matrix formed by data1.txt:
>>         item1  item2  item3  item4  item5
>> item1   1      0.3    0.4    0      0.6
>> item2 1
>> item3 1
>> item4 1
>> item5 1
>> So the recommendations for user1 would be based on a score computed
>> as follows.
>> Score for user1 for item1 (ignoring the item1, item1 score) =
>> u12*item12 + u13*item13 + u14*item14 + u15*item15
>> = 1*0.3 + 0*0.4 + 0*0 + 1*0.6 = 0.9
>> And then I find this score for user1 and item2,
>> and then for user2 and item1, and so on.
>> I understand this is more of an implementation challenge, and I am not sure
>> whether this is the right place to ask. But any suggestions will be
>> greatly appreciated.