HDFS >> mail # user >> problem configuring hadoop with s3 bucket


Re: problem configuring hadoop with s3 bucket
Could you provide your execution environment, the exact steps you ran, and the
detailed log output?

2012/7/24 Alok Kumar <[EMAIL PROTECTED]>

> Hi Yanbo,
>
> Thank you for your reply.
>
> I've now made the changes exactly as described at this link:
> http://wiki.apache.org/hadoop/AmazonS3
>
> But my namenode is not coming up; it fails with this exception (I tried it
> both locally and inside EC2):
>
> ERROR org.apache.hadoop.hdfs.server.namenode.NameNode:
> java.net.BindException: Problem binding to <bucket-name>.
> s3.amazonaws.com/207.171.163.14:8020 : Cannot assign requested address
>
> fs.default.name = s3://<mybucket>    // Error: UnknownHostException in
> Namenode log!
> or
> fs.default.name = s3://<mybucket>.s3.amazonaws.com  // Error: BindException
> in Namenode log!
>
> Also, I can't see any directory created inside my bucket, even though I'm able
> to run the command $ bin/hadoop dfs -ls s3://<bucket>/ (a quick check is
> sketched below this message).
>
> "bin/hadoop namenode -format" says it successfully formatted the namenode
> dir s3://bucket/hadoop/namenode, even though that directory doesn't exist there!
>
>
> Any suggestions?
>
> Thanks again.
>
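A quick way to check whether Hadoop can actually write to the bucket is sketched
below; this is only an illustrative sequence, with <mybucket> as a placeholder,
not commands taken from this thread. Note that s3:// is Hadoop's block-based S3
filesystem, so even when it works the bucket will contain opaque block objects
rather than a readable directory tree, which may be why no "dir" is visible from
the AWS console.

    # illustrative check only; <mybucket> is a placeholder
    bin/hadoop fs -mkdir s3://<mybucket>/hadoop-test
    echo hello > /tmp/hello.txt
    bin/hadoop fs -put /tmp/hello.txt s3://<mybucket>/hadoop-test/
    # the file should now show up via the Hadoop client,
    # even though the S3 console only shows block objects
    bin/hadoop fs -ls s3://<mybucket>/hadoop-test/
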
>
> On Tue, Jul 24, 2012 at 4:11 PM, Yanbo Liang <[EMAIL PROTECTED]> wrote:
>
>> I think you have mixed up the two ways of integrating Hadoop and S3.
>> 1) If you set "dfs.data.dir=s3://******", you are trying to use S3 as the
>> DataNode's local storage while still using HDFS as the underlying storage
>> layer. As far as I know, that is not supported at present.
>> 2) The right way to use S3 with Hadoop is to replace HDFS with S3, just as
>> you have tried, but I think you have missed some configuration parameters.
>> " fs.default.name=s3://<mybucket> " is the most important parameter when
>> you use S3 in place of HDFS, but it is not enough on its own. The detailed
>> configuration can be found here (see also the sketch below):
>> http://wiki.apache.org/hadoop/AmazonS3
>>
>> Yanbo
>>
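For reference, a minimal core-site.xml along the lines of the AmazonS3 wiki page
linked above might look roughly like the sketch below. The bucket name and keys
are placeholders, not values from this thread; fs.s3.awsAccessKeyId and
fs.s3.awsSecretAccessKey are the property names used for the s3:// block
filesystem. As far as I understand, when S3 replaces HDFS this way the HDFS
daemons (namenode/datanode) are not started at all, so there is no namenode
address to bind.

    <!-- core-site.xml sketch based on http://wiki.apache.org/hadoop/AmazonS3 -->
    <!-- bucket name and credentials are placeholders -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>s3://mybucket</value>
      </property>
      <property>
        <name>fs.s3.awsAccessKeyId</name>
        <value>YOUR_AWS_ACCESS_KEY_ID</value>
      </property>
      <property>
        <name>fs.s3.awsSecretAccessKey</name>
        <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
      </property>
    </configuration>
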
>>
>> 2012/7/23 Alok Kumar <[EMAIL PROTECTED]>
>>
>>> Hello Group,
>>>
>>> I have a Hadoop setup running locally.
>>>
>>> Now I want to use Amazon s3://<mybucket> as my data store,
>>> so I set " dfs.data.dir=s3://<mybucket>/hadoop/ " in my
>>> hdfs-site.xml. Is that the correct way?
>>> I'm getting this error:
>>>
>>> WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory
>>> in dfs.data.dir: can not create directory: s3://<mybucket>/hadoop
>>> 2012-07-23 13:15:06,260 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in
>>> dfs.data.dir are invalid.
>>>
>>> and
>>> when I changed it to " dfs.data.dir=s3://<mybucket>/ "
>>> I got this error:
>>>  ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> java.lang.IllegalArgumentException: Wrong FS: s3://<mybucket>/, expected:
>>> file:///
>>>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
>>>     at
>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
>>>     at
>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
>>>     at
>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>>>     at
>>> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:146)
>>>     at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:162)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1574)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>
>>> Also,
>>> when I change fs.default.name=s3://<mybucket> , the Namenode does not
>>> come up; it fails with: ERROR
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException:
>>> (Anyway, I want to run the namenode locally, so I reverted it back to
>>> hdfs://localhost:9000 )
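
If HDFS itself is kept as the storage layer, as in this original setup, then
dfs.data.dir has to stay a local filesystem path rather than an s3:// URI; the
"Wrong FS ... expected: file:///" error above comes from the DataNode treating
that value as a local directory. A typical hdfs-site.xml entry might look roughly
like the sketch below; the path is only an illustrative placeholder.

    <!-- hdfs-site.xml sketch: dfs.data.dir points at local directories -->
    <configuration>
      <property>
        <name>dfs.data.dir</name>
        <!-- placeholder path; any local directory writable by the datanode user -->
        <value>/var/lib/hadoop/dfs/data</value>
      </property>
    </configuration>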