Re: hadoop streaming using a java program as mapper
Robert Evans 2012-05-02, 13:40
Do you have the error message from running java? You can use myMapper.sh itself to debug what is happening by logging from it. The stderr of myMapper.sh is captured in the task logs, and you can get to it there. You can run shell commands like find and ls to see which files are actually present on the node, and you can probably also see any error messages that java produced while trying to run, such as class-not-found exceptions.
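Something along these lines might work as a debugging version of myMapper.sh. It is only a rough sketch: the function names (log, dump_environment, run_java) are made up for illustration, and "-cp ." assumes myJava.class ends up in the task's working directory, which is exactly what the listing is meant to confirm or rule out.

```shell
#!/bin/bash
# Hypothetical debugging helpers for myMapper.sh.
# Everything here writes to stderr so it lands in the task's stderr log;
# stdout stays reserved for the mapper's key/value output.

log() {
    echo "[myMapper] $*" >&2
}

dump_environment() {
    log "working directory: $(pwd)"
    log "files visible to the task:"
    ls -la >&2
    # -L follows symlinks, in case shipped files are linked into this directory.
    find -L . -maxdepth 2 -name '*.class' >&2
    if command -v java >/dev/null 2>&1; then
        # java -version prints to stderr; prefix each line for the log.
        java -version 2>&1 | while IFS= read -r line; do log "$line"; done
    else
        log "java not found on PATH: $PATH"
    fi
}

run_java() {
    # Assumption: the class file is in the current directory, hence -cp .
    # Any exception java prints goes to stderr and so to the task log.
    java -cp . myJava "$@"
    status=$?
    log "java exited with status $status"
    return "$status"
}

dump_environment
```

In myMapper.sh you would then replace the "java myJava arguments" line with "run_java arguments" and check the task's stderr log after the job fails.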
On 5/2/12 12:25 AM, "Boyu Zhang" <[EMAIL PROTECTED]> wrote:
Yes, I did. myMapper.sh is executed; the problem is inside
myMapper.sh. It calls a java program named myJava, and myJava did not get
executed on the slaves, even though I shipped myJava.class too.
On Wed, May 2, 2012 at 1:20 AM, 黄 山 <[EMAIL PROTECTED]> wrote:
> have you shipped myMapper.sh to each node?
> [EMAIL PROTECTED]
> On 2012-5-2, at 1:17 PM, Boyu Zhang wrote:
> > Hi All,
> > I am in a slightly strange situation. I am using Hadoop streaming to run
> > a bash shell program, myMapper.sh. Inside myMapper.sh, it calls a java
> > program, then an R program, then outputs intermediate key/value pairs. I
> > used the -file option to ship the java and R files, but the java program
> > was not executed by the streaming job. myMapper.sh has something like this:
> > java myJava arguments
> > And in the streaming command, I use something like this:
> > hadoop jar /opt/hadoop/hadoop-0.20.2-streaming.jar -D
> > -input /user/input -output /user/output7 -mapper ./myMapper.sh -file
> > myJava.class -verbose
> > The myJava program is not run when I execute it like this. If I go to
> > the actual slave node to check the files, myMapper.sh has been shipped to
> > the slave node, but myJava.class has not; it is inside the job.jar file.
> > Can someone provide some insights on how to run a java program through
> > hadoop streaming? Thanks!
> > Boyu