Alok Kumar 2012-07-23, 08:26
Yanbo Liang 2012-07-24, 10:41
Alok Kumar 2012-07-24, 12:16

Re: problem configuring hadoop with s3 bucket
Could you provide your execution environment, the exact steps you ran, and the
detailed log output?

2012/7/24 Alok Kumar <[EMAIL PROTECTED]>

> Hi Yanbo,
>
> Thank you for your reply.
>
> I've now made the changes exactly as described at this link:
> http://wiki.apache.org/hadoop/AmazonS3
>
> But my namenode is not coming up; it fails with an exception (I tried it both
> locally and inside EC2):
>
> ERROR org.apache.hadoop.hdfs.server.namenode.NameNode:
> java.net.BindException: Problem binding to <bucket-name>.
> s3.amazonaws.com/207.171.163.14:8020 : Cannot assign requested address
>
> fs.default.name = s3://<mybucket>             // Error: UnknownHostException
> in the namenode log!
> or
> fs.default.name = s3://<mybucket>.s3.amazonaws.com  // Error: BindException
> in the namenode log!
>
> Also, I can't see any directory created inside my bucket (though I'm able to
> run the command $ bin/hadoop dfs -ls s3://<bucket>/ ).
>
> "bin/hadoop namenode -format " is saying succesfully formatted namenode
> dir S3://bucket/hadoop/namenode , when it is not even existing there!
>
>
> Any suggestions?
>
> Thanks again.
>
>
> On Tue, Jul 24, 2012 at 4:11 PM, Yanbo Liang <[EMAIL PROTECTED]> wrote:
>
>> I think there is some confusion about how Hadoop integrates with S3.
>> 1) If you set "dfs.data.dir=s3://******", you are asking the DataNode to use
>> S3 as its local storage while still running HDFS as the underlying storage
>> layer. As far as I know, that is not supported at present.
>> 2) The right way to use S3 with Hadoop is to replace HDFS with S3, just as
>> you have tried, but I think you have missed some configuration parameters.
>> "fs.default.name=s3://<mybucket>" is the most important parameter when you
>> use S3 in place of HDFS, but it is not enough on its own. The detailed
>> configuration can be found here:
>> http://wiki.apache.org/hadoop/AmazonS3
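>>
>> For example, a minimal sketch of what the wiki's settings might look like in
>> core-site.xml (hadoop-site.xml on older releases); the bucket name and AWS
>> credentials below are placeholders:
>>
>> <configuration>
>>   <property>
>>     <!-- make the S3 block filesystem the default filesystem -->
>>     <name>fs.default.name</name>
>>     <value>s3://mybucket</value>
>>   </property>
>>   <property>
>>     <!-- placeholder AWS credentials for the s3:// filesystem -->
>>     <name>fs.s3.awsAccessKeyId</name>
>>     <value>YOUR_AWS_ACCESS_KEY_ID</value>
>>   </property>
>>   <property>
>>     <name>fs.s3.awsSecretAccessKey</name>
>>     <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
>>   </property>
>> </configuration>
>>
>> Note that when S3 replaces HDFS as the default filesystem, the HDFS namenode
>> and datanode daemons are not used, so there is no namenode to format or start.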
>>
>> Yanbo
>>
>>
>> 2012/7/23 Alok Kumar <[EMAIL PROTECTED]>
>>
>>> Hello Group,
>>>
>>> I have a Hadoop setup running locally.
>>>
>>> Now I want to use Amazon s3://<mybucket> as my data store, so I changed
>>> hdfs-site.xml to " dfs.data.dir=s3://<mybucket>/hadoop/ ". Is that the
>>> correct way?
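>>>
>>> In hdfs-site.xml that entry looks roughly like this (just a sketch; the
>>> bucket name is a placeholder):
>>>
>>> <property>
>>>   <name>dfs.data.dir</name>
>>>   <value>s3://mybucket/hadoop/</value>
>>> </property>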
>>> With that setting I'm getting this error:
>>>
>>> WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory
>>> in dfs.data.dir: can not create directory: s3://<mybucket>/hadoop
>>> 2012-07-23 13:15:06,260 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in
>>> dfs.data.dir are invalid.
>>>
>>> and
>>> when I changed it to " dfs.data.dir=s3://<mybucket>/ "
>>> I got this error:
>>>  ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> java.lang.IllegalArgumentException: Wrong FS: s3://<mybucket>/, expected:
>>> file:///
>>>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
>>>     at
>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
>>>     at
>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
>>>     at
>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>>>     at
>>> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:146)
>>>     at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:162)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1574)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>
>>> Also,
>>> when I change fs.default.name=s3://<mybucket>, the namenode does not come
>>> up, failing with: ERROR
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException:
>>> (Anyway, I want to run the namenode locally, so I reverted it back to
>>> hdfs://localhost:9000.)
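>>>
>>> That is, the default filesystem setting is back to (roughly, in core-site.xml):
>>>
>>> <property>
>>>   <!-- back to the local HDFS namenode -->
>>>   <name>fs.default.name</name>
>>>   <value>hdfs://localhost:9000</value>
>>> </property>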
Alok Kumar 2012-07-25, 07:55
Yanbo Liang 2012-07-26, 07:20