Re: Can't get a streaming job to use a custom partitioner
1) Could you print the output of:
$ jar tf ./NumericPartitioner.jar

2) Could you try:
$ export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:./NumericPartitioner.jar"

$ ../bin/hadoop jar ../contrib/streaming/hadoop-streaming-1.2.1.jar \
-libjars ./NumericPartitioner.jar \
-input /input -output /output/keys \
-mapper "map_threeJoin.py" -reducer "keycount.py" \
-partitioner newjoin.NumericPartitioner \
-file "map_threeJoin.py" -file "keycount.py"
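
On step 1, the thing to look for in the listing is an entry named exactly newjoin/NumericPartitioner.class, since a class referenced as newjoin.NumericPartitioner has to sit under a newjoin/ directory inside the jar. This assumes the source really declares "package newjoin;", which I have not checked. A rough sketch of that check follows; the function name has_partitioner_entry is made up for illustration:

```shell
# Hypothetical helper: reads a jar listing on stdin and succeeds only if it
# contains the entry that a class named newjoin.NumericPartitioner must have.
# Assumption: the class is declared in package "newjoin".
has_partitioner_entry() {
  # -x: match the whole line, -F: treat the pattern as a fixed string, -q: quiet
  grep -qxF 'newjoin/NumericPartitioner.class'
}

# Usage (assumes the jar is in the current directory, as in the command above):
#   jar tf ./NumericPartitioner.jar | has_partitioner_entry && echo found || echo missing
```

If the listing shows only NumericPartitioner.class at the top level, the jar was probably built without the package directory, which could explain the "class not found" message.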

2013/11/18 Ben K <[EMAIL PROTECTED]>

> I need help. No matter what I do I can't seem to get Hadoop to find my
> custom partitioner.
> Here is the command I am running:
>
> ../bin/hadoop jar ../contrib/streaming/hadoop-streaming-1.2.1.jar \
> -libjars ./NumericPartitioner.jar \
> -input /input -output /output/keys \
> -mapper "map_threeJoin.py" -reducer "keycount.py" \
> -partitioner newjoin.NumericPartitioner \
> -file "map_threeJoin.py" -file "keycount.py"
>
> (The code of NumericPartitioner is very simple, and is here:
> http://pastebin.com/ZEK7N1RN)
> But no matter what I do, it gives:
>
> -partitioner : class not found : newjoin.NumericPartitioner
>
> Does anyone have any idea why it might be going wrong?
>
> Ben K