How to split a specified number of rows per Map
edward choi 2011-06-05, 11:04
I am using HBase as the input source for my MapReduce jobs.
I recently found out that TableInputFormat automatically splits the input
table so that each region of the table is assigned to a single Map task.
But what I want to do is to split the input table so that a user-specified
number of rows is assigned to each Mapper.
For example, if I set a certain parameter to 100, then each Mapper will get
100 rows from the input table.
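To make the idea concrete, here is a rough sketch in plain Java of the grouping I have in mind. It is not HBase code: RowCountSplitSketch, RowRange and splitByRowCount are names I made up, and it assumes the sorted row keys are already available as a list, which a real job would have to obtain by scanning the table or from some index. Each range is [startRow, stopRow), with an empty stopRow meaning "to the end of the table", which is the same convention a Scan uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: groups sorted row keys into ranges of N rows each.
// Nothing here touches HBase; the names are made up for illustration.
class RowCountSplitSketch {

    // One planned split: rows from startRow (inclusive) to stopRow (exclusive).
    // An empty stopRow means "up to the end of the table".
    static final class RowRange {
        final String startRow;
        final String stopRow;

        RowRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    // Assumes sortedRowKeys is already in the table's row-key order.
    static List<RowRange> splitByRowCount(List<String> sortedRowKeys, int rowsPerSplit) {
        if (rowsPerSplit <= 0) {
            throw new IllegalArgumentException("rowsPerSplit must be positive");
        }
        List<RowRange> ranges = new ArrayList<>();
        for (int i = 0; i < sortedRowKeys.size(); i += rowsPerSplit) {
            String start = sortedRowKeys.get(i);
            int next = i + rowsPerSplit;
            // The next range's first key is this range's exclusive stop key;
            // the last range is left open-ended.
            String stop = next < sortedRowKeys.size() ? sortedRowKeys.get(next) : "";
            ranges.add(new RowRange(start, stop));
        }
        return ranges;
    }
}
```

With 250 keys and rowsPerSplit set to 100 this gives three ranges, the last one holding the remaining 50 rows. My assumption is that a custom getSplits() would then turn each range into one split, but I have not tried that.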
Is there a method for this kind of operation?
Or do I have to modify the getSplits() of
Any answer or opinion will be much appreciated!!