|
Hemanth Yamijala
2012-10-16, 07:11
sudha sadhasivam
2012-10-16, 08:00
Rahul Patodi
2012-10-16, 10:23
Yanbo Liang
2012-10-16, 10:59
Parth Savani
2012-10-16, 14:32
Parth Savani
2012-10-16, 14:34
Hemanth Yamijala
2012-10-16, 15:10
|
-
Re: problem using s3 instead of hdfs
Hemanth Yamijala
2012-10-16, 07:11
Hi,

I've not tried this on S3. However, the directory mentioned in the exception is based on the value of this configuration key: mapreduce.jobtracker.staging.root.dir. This defaults to ${hadoop.tmp.dir}/mapred/staging. Can you please set this to an S3 location and try?

Thanks
Hemanth

On Mon, Oct 15, 2012 at 10:43 PM, Parth Savani <[EMAIL PROTECTED]> wrote:
> Hello,
> I am trying to run Hadoop on S3 in distributed mode. However, I am having issues running my job successfully on it. I get the error shown below.
> I followed the instructions provided in this article -> http://wiki.apache.org/hadoop/AmazonS3
> I replaced the fs.default.name value in my hdfs-site.xml with s3n://ID:SECRET@BUCKET
> And I am running my job using the following: hadoop jar /path/to/my/jar/abcd.jar /input /output
> Where /input is the folder name inside the S3 bucket (s3n://ID:SECRET@BUCKET/input) and the /output folder should be created in my bucket (s3n://ID:SECRET@BUCKET/output).
> Below is the error I get. It is looking for job.jar on S3, and that path is on my server from where I am launching my job.
>
> java.io.FileNotFoundException: No such file or directory '/opt/data/hadoop/hadoop-mapred/mapred/staging/psavani/.staging/job_201207021606_1036/job.jar'
>     at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:412)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:207)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1371)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1352)
>     at org.apache.hadoop.mapred.JobLocalizer.localizeJobJarFile(JobLocalizer.java:273)
>     at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:381)
>     at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:371)
>     at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:222)
>     at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1372)
>     at java.security.AccessController.doPri
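To see how the default staging directory relates to the path in the exception, the sketch below expands ${hadoop.tmp.dir}/mapred/staging the way a simple property substitution would. It is only an illustration in plain Java, not Hadoop's own Configuration code. The class name StagingDirSketch, the helper expand, and the value given for hadoop.tmp.dir are assumptions; the value is inferred from the path in the stack trace, not confirmed anywhere in the thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it only imitates ${name} substitution.
class StagingDirSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replace each ${name} with its value from props; unknown names are left untouched.
    static String expand(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // Assumed value, inferred from the path shown in the exception.
        props.put("hadoop.tmp.dir", "/opt/data/hadoop/hadoop-mapred");
        System.out.println(expand("${hadoop.tmp.dir}/mapred/staging", props));
    }
}
```

Under that assumption the expanded value is a scheme-less local-style path, which would explain why the task side ends up asking the S3 filesystem for a file under /opt/data/hadoop.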
-
Re: problem using s3 instead of hdfs
sudha sadhasivam
2012-10-16, 08:00
Is there a time delay in fetching data from S3 into a Hadoop cluster, compared with a regular Hadoop cluster setup? Can elastic block storage be used for this purpose?

G Sudha
-
Re: problem using s3 instead of hdfs
Rahul Patodi
2012-10-16, 10:23
I think these blog posts will answer your question:
http://www.technology-mania.com/2012/05/s3-instead-of-hdfs-with-hadoop_05.html
http://www.technology-mania.com/2011/05/s3-as-input-or-output-for-hadoop-mr.html

--
Regards,
Rahul Patodi
-
Re: problem using s3 instead of hdfs
Yanbo Liang
2012-10-16, 10:59
Because you did not set defaultFS in the configuration, you need to explicitly specify the absolute path (including the scheme) of the file in S3 when you run an MR job.
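The point about including the scheme can be illustrated with plain java.net.URI: a path such as /input carries no scheme, so which filesystem it lands on depends entirely on the default filesystem it is resolved against. This is a simplified model rather than Hadoop's own path handling. The class QualifySketch, the helper qualify, and the bucket name example-bucket are hypothetical.

```java
import java.net.URI;

// Hypothetical sketch of how a scheme-less path picks up the default filesystem.
class QualifySketch {
    // Paths that already carry a scheme are returned unchanged;
    // all others are resolved against the default filesystem URI.
    static URI qualify(String path, URI defaultFs) {
        URI u = URI.create(path);
        return u.getScheme() != null ? u : defaultFs.resolve(u);
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///");
        URI s3 = URI.create("s3n://example-bucket/"); // hypothetical bucket

        // The same argument ends up on different filesystems.
        System.out.println(qualify("/input", local).getScheme());
        System.out.println(qualify("/input", s3));

        // A fully qualified path does not depend on the default at all.
        System.out.println(qualify("s3n://example-bucket/input", local));
    }
}
```

Passing fully qualified s3n:// paths as the job's input and output arguments removes the dependence on whichever default filesystem the client happens to load.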
-
Re: problem using s3 instead of hdfs
Parth Savani
2012-10-16, 14:32
Hello Hemanth,

I set the Hadoop staging directory to an S3 location. However, it complains. Below is the error:

12/10/16 10:22:47 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: s3n://ABCD:ABCD@ABCD/tmp/mapred/staging/psavani1821193643/.staging, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:410)
    at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:322)
    at org.apache.hadoop.fs.FilterFileSystem.makeQualified(FilterFileSystem.java:79)
    at org.apache.hadoop.mapred.LocalJobRunner.getStagingAreaDir(LocalJobRunner.java:541)
    at org.apache.hadoop.mapred.JobClient.getStagingAreaDir(JobClient.java:1204)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:102)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:839)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at com.sensenetworks.macrosensedata.ParseLogsMacrosense.run(ParseLogsMacrosense.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at com.sensenetworks.macrosensedata.ParseLogsMacrosense.main(ParseLogsMacrosense.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
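The "Wrong FS" message comes from a check that a path belongs to the filesystem it is handed to. The following is a rough model of that kind of check, written against java.net.URI only. It is not the Hadoop source, and the real FileSystem.checkPath may compare more than the scheme. The class CheckPathSketch and the bucket name are hypothetical.

```java
import java.net.URI;

// Hypothetical, simplified model of a filesystem ownership check.
class CheckPathSketch {
    // Reject a path whose scheme differs from the filesystem's own scheme.
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // scheme-less paths are treated as belonging to this filesystem
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI localFs = URI.create("file:///");
        checkPath(localFs, URI.create("/tmp/mapred/staging")); // accepted
        try {
            // An s3n staging path handed to a local filesystem is rejected.
            checkPath(localFs, URI.create("s3n://example-bucket/tmp/mapred/staging"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Read this way, the trace says the staging path was an s3n:// URI while the filesystem doing the check was the local one, which fits the LocalJobRunner frame a few lines further down.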
-
Re: problem using s3 instead of hdfs
Parth Savani
2012-10-16, 14:34
One question,
Can I use both file systems at the same time (hdfs and s3)? According to this link <http://www.mail-archive.com/[EMAIL PROTECTED]/msg03481.html>, I cannot.
-
Re: problem using s3 instead of hdfs
Hemanth Yamijala
2012-10-16, 15:10
Parth,
I notice in the stack trace you posted that the LocalJobRunner is being used instead of the JobTracker. Are you sure this is a distributed cluster? Could you please check the value of mapred.job.tracker?

Thanks
Hemanth
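In Hadoop 1.x the mapred.job.tracker property defaults to "local", and with that value the client runs the job in-process with the LocalJobRunner instead of submitting it to a JobTracker. The sketch below models only that decision with java.util.Properties. The class RunnerSketch, the helper runnerFor, and the host name are hypothetical, and this is not how Hadoop itself is structured.

```java
import java.util.Properties;

// Hypothetical model of the client's choice between local and distributed execution.
class RunnerSketch {
    // Returns a description of the runner implied by mapred.job.tracker.
    static String runnerFor(Properties conf) {
        String tracker = conf.getProperty("mapred.job.tracker", "local");
        return "local".equals(tracker)
            ? "LocalJobRunner"
            : "JobTracker at " + tracker;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Property not set: falls back to the local runner.
        System.out.println(runnerFor(conf));

        // Assumed host and port, for illustration only.
        conf.setProperty("mapred.job.tracker", "jobtracker.example.com:8021");
        System.out.println(runnerFor(conf));
    }
}
```

If the client's configuration does not set mapred.job.tracker, or the client is not reading the intended mapred-site.xml, the job would run locally even though a cluster exists.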