Harun Raşit ER 2012-03-25, 15:25
public int getPartition(IntWritable key, Chromosome value, int numOfPartitions) {
    int partition = key.get();
    if (partition < 0 || partition >= numOfPartitions) {
        partition = numOfPartitions - 1;
    }
    System.out.println("partition " + partition);
    return partition;
}
I wrote the custom partitioner above, but I have a problem with its third parameter, numOfPartitions.
It is always "1" in pseudo-distributed mode. I have 4 mappers and 4 reducers, but only one of the reducers receives the real values. The others yield nothing, just empty files.
When I remove the if statement, Hadoop complains about the partition number with "illegal partition for ...".
How can I set the number of partitions in pseudo-distributed mode?
Thanks.
Harsh J 2012-03-25, 17:17
Harun,
Do your map task stdout logs show varying values for "partition"? It seems to me that all your keys are somehow outside of [0, numOfPartitions), and hence go to the last partition, per your logic.
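The effect described in this reply can be reproduced without a cluster. What follows is only a minimal plain-Java sketch, not Hadoop code: `ClampPartitionDemo`, `clampPartition`, and `countPerPartition` are hypothetical names, the method body is a stand-alone copy of the clamping rule from the posted getPartition, and the key values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical stand-alone copy of the clamping rule from the posted
// getPartition, so it can be run without Hadoop on the classpath.
final class ClampPartitionDemo {

    // Same rule as the posted partitioner: any key outside
    // [0, numOfPartitions) is sent to the last partition.
    static int clampPartition(int key, int numOfPartitions) {
        int partition = key;
        if (partition < 0 || partition >= numOfPartitions) {
            partition = numOfPartitions - 1;
        }
        return partition;
    }

    // Counts how many of the given keys land in each partition.
    static int[] countPerPartition(int[] keys, int numOfPartitions) {
        int[] counts = new int[numOfPartitions];
        for (int key : keys) {
            counts[clampPartition(key, numOfPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] inRangeKeys = {0, 1, 2, 3};       // illustrative keys inside [0, 4)
        int[] outOfRangeKeys = {7, 9, -2, 12};  // illustrative keys outside [0, 4)

        // In-range keys spread over all four partitions.
        System.out.println(Arrays.toString(countPerPartition(inRangeKeys, 4)));
        // Out-of-range keys all collapse into the last partition.
        System.out.println(Arrays.toString(countPerPartition(outOfRangeKeys, 4)));
        // With a single partition, every key ends up in partition 0.
        System.out.println(Arrays.toString(countPerPartition(inRangeKeys, 1)));
    }
}
```

Whenever every key maps to the same partition, only one reducer receives records and the remaining reducers write empty output files, which matches the symptom reported in the question.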
-- Harsh J
Harun Raşit ER 2012-03-26, 09:22
Thanks for your help.
I assigned the key values from a static variable. When I ran the job in Eclipse, I saw the right key values, but after debugging in distributed mode, I saw that all my key values are 0.
Stan Rosenberg 2012-03-25, 16:51
Typically, numPartitions is used as a modulus in order to derive a value that is at least zero and strictly less than numPartitions. That is, key.get() % numPartitions would yield such a value.
stan
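The modulus approach can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than anyone's actual partitioner: `ModuloPartitionDemo` and `partitionFor` are hypothetical names, and the sample keys are made up. One caveat motivates the choice of `Math.floorMod` (available since Java 8): Java's `%` operator keeps the sign of the dividend, so `-1 % 4` is `-1`, which would not be a valid partition number if keys can be negative.

```java
// Plain-Java sketch of modulus-based partitioning. partitionFor is a
// hypothetical helper for illustration, not part of the Hadoop API.
final class ModuloPartitionDemo {

    // Math.floorMod returns a value in [0, numPartitions) when
    // numPartitions is positive, including for negative keys.
    static int partitionFor(int key, int numPartitions) {
        return Math.floorMod(key, numPartitions);
    }

    public static void main(String[] args) {
        // Illustrative keys, including a negative one.
        for (int key : new int[] {0, 5, 7, -1}) {
            System.out.println(key + " -> " + partitionFor(key, 4));
        }
    }
}
```

Inside a partitioner like the one posted, the equivalent expression would be `Math.floorMod(key.get(), numOfPartitions)`. Unlike clamping, this spreads out-of-range keys across partitions instead of sending them all to the last one.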
Harun Raşit ER 2012-03-25, 15:12
My custom partitioner is:

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Partitioner;

public class PopulationPartitioner extends Partitioner<IntWritable, Chromosome> implements Configurable {

    private Configuration conf;

    @Override
    public int getPartition(IntWritable key, Chromosome value, int numOfPartitions) {
        int partition = key.get();
        // Keys outside [0, numOfPartitions) are sent to the last partition.
        if (partition < 0 || partition >= numOfPartitions) {
            partition = numOfPartitions - 1;
        }
        System.out.println("partition " + partition);
        return partition;
    }

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration arg0) {
        conf = arg0;
    }
}
And my mapred configuration file is:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
Thanks again.
----------------------------------------------------------------
This shouldn't be the case at all. Can you share your Partitioner code and the job.xml of the job that showed this behavior?
In any case: How do you "set the numberOfReducer to 4"?
2012/3/23 Harun Raşit ER <[EMAIL PROTECTED]>:
> I wrote a custom partitioner. But when I run in standalone or
> pseudo-distributed mode, the number of partitions is always 1. I set the
> numberOfReducer to 4, but the numOfPartitions parameter of the custom
> partitioner is still 1 and all my four mappers' results are going to 1
> reducer. The other reducers yield empty files.
>
> How can I set the number of partitions in standalone or pseudo-distributed
> mode?
>
> Thanks for your help.
-- Harsh J