

Re: providing the same input to more than one Map task
Ted Dunning 2011-04-23, 01:33
I would recommend taking this question to the Mahout mailing list.
The short answer is that matrix multiplication by a column vector is pretty easy. Each mapper reads the vector in the configure method and then does a dot product for each row of the input matrix. Results are reassembled into a vector in the reducer. Mahout has special matrix structures to help with this.

On Fri, Apr 22, 2011 at 2:59 PM, Mehmet Tepedelenlioglu <[EMAIL PROTECTED]> wrote:

> There is a way:
>
> http://hadoop.apache.org/common/docs/r0.18.3/mapred_tutorial.html#DistributedCache
>
> Are you working with a sparse matrix, or a full one?
>
> On Apr 22, 2011, at 2:33 PM, aanghelescu wrote:
>
> > Hi all,
> >
> > I am trying to perform matrix-vector multiplication using Hadoop.
> >
> > So I have matrix M in a file, and vector v in another file. Obviously, files
> > are of different sizes. Is it possible to make it so that each Map task will
> > get the whole vector v and a chunk of matrix M? I know how my map and reduce
> > functions should look like, but I don't know how to format the input.
> >
> > Basically I want my map function to output key-value pairs (i,m[i,j]*v(j)),
> > where i is the row number, and j the column number; v(j) is the jth element
> > in v. And the reduce function will sum up all the values with the same key
> > i, and that will be the ith element of my result vector.
> >
> > Or can you suggest another way to do it?
> >
> > Thanks,
> > Alexandra
> >
> > View this message in context:
> > http://old.nabble.com/providingthesameinputtomorethanoneMaptasktp31459012p31459012.html
> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
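
To make the data flow in this thread concrete, here is a rough sketch in plain Python rather than actual Hadoop or Mahout code. It imitates the approach from the reply (each mapper loads the whole vector once in a configure step, then emits one dot product per matrix row) and also the element-wise variant from the original question (emit (i, m[i,j]*v(j)) and sum by key in the reducer). All names here (RowDotMapper, elementwise_map, reduce_sum, multiply, n_splits) are made up for illustration, and passing the vector in directly stands in for reading it from a side file such as one shipped via DistributedCache.

```python
from collections import defaultdict


class RowDotMapper:
    """Imitates a mapper that loads the full vector once, then sees one row per call."""

    def configure(self, vector):
        # Assumption: in a real job the vector would be read from a side file
        # (for example one shipped via DistributedCache); here it is passed in.
        self.vector = list(vector)

    def map(self, row_index, row):
        # Emit (row index, dot product of this row with the vector).
        yield row_index, sum(m * v for m, v in zip(row, self.vector))


def elementwise_map(row_index, row, vector):
    # Variant from the original question: one (i, m[i][j] * v[j]) pair per entry.
    for j, m in enumerate(row):
        yield row_index, m * vector[j]


def reduce_sum(pairs, n_rows):
    # Sum all values sharing a key and place each total at its row position.
    totals = defaultdict(float)
    for i, value in pairs:
        totals[i] += value
    return [totals[i] for i in range(n_rows)]


def multiply(matrix, vector, n_splits=2):
    # Each split of rows gets its own mapper instance holding the whole vector.
    pairs = []
    for s in range(n_splits):
        mapper = RowDotMapper()
        mapper.configure(vector)
        for i in range(s, len(matrix), n_splits):
            pairs.extend(mapper.map(i, matrix[i]))
    return reduce_sum(pairs, len(matrix))


if __name__ == "__main__":
    M = [[1, 2], [3, 4], [5, 6]]
    v = [10, 1]
    print(multiply(M, v))
```

Both variants produce the same result vector. The row-wise version emits one pair per row instead of one per matrix entry, so far less data reaches the reducer, which then only has to put each value in its position.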
