HDFS >> mail # user >> problem configuring hadoop with s3 bucket
Re: problem configuring hadoop with s3 bucket
I think there is some confusion about how Hadoop integrates with S3.
1) Setting "dfs.data.dir=s3://******" tells the DataNode to use S3 as its
local storage directory while still running HDFS as the underlying storage
layer. As far as I know, that is not supported at present: dfs.data.dir
must point to directories on the local filesystem.
2) The right way to use S3 with Hadoop is to replace HDFS with S3
entirely, as you also tried. But I think you have missed some
configuration parameters. " fs.default.name=s3://<mybucket> " is the most
important parameter when replacing HDFS with S3, but it is not enough on
its own. The detailed configuration can be found here:
http://wiki.apache.org/hadoop/AmazonS3
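As a rough sketch (the bucket name and credentials below are placeholders, and this assumes the S3 block filesystem of Hadoop 1.x as described on that wiki page), the relevant core-site.xml entries would look something like:

```xml
<!-- core-site.xml: replace HDFS with S3 (sketch; bucket and keys are placeholders) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://mybucket</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_AWS_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

Note that when S3 fully replaces HDFS like this, there is no NameNode or DataNode to start at all; you would run only the MapReduce daemons.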

Yanbo

2012/7/23 Alok Kumar <[EMAIL PROTECTED]>

> Hello Group,
>
> I have a local Hadoop setup running.
>
> Now I want to use Amazon s3://<mybucket> as my data store,
> so I set " dfs.data.dir=s3://<mybucket>/hadoop/ " in my
> hdfs-site.xml. Is that the correct way?
> I'm getting this error:
>
> WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in
> dfs.data.dir: can not create directory: s3://<mybucket>/hadoop
> 2012-07-23 13:15:06,260 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in
> dfs.data.dir are invalid.
>
> and when I changed it to " dfs.data.dir=s3://<mybucket>/ ",
> I got this error:
>  ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> java.lang.IllegalArgumentException: Wrong FS: s3://<mybucket>/, expected:
> file:///
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
>     at
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
>     at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
>     at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at
> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:146)
>     at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:162)
>     at
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1574)
>     at
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>
> Also,
> when I change fs.default.name=s3://<mybucket>, the NameNode does not
> come up; it fails with: ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException:
> (In any case I want to run the NameNode locally, so I reverted it back to
> hdfs://localhost:9000.)
>
> Your help is highly appreciated!
> Thanks
> --
> Alok Kumar
>