Re: Using S3 instead of HDFS
It worked, thank you, Harsh.

Mark

On Wed, Jan 18, 2012 at 1:16 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Ah, sorry about missing that. The settings would go in core-site.xml
> (hdfs-site.xml will no longer be relevant once you switch to using S3).
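
A minimal core-site.xml sketch of what Harsh describes; the bucket name and
keys below are placeholders, and the fs.s3n.* property names assume the
native s3n:// filesystem (the block-based s3:// scheme uses fs.s3.* instead):

    <?xml version="1.0"?>
    <configuration>
      <!-- Point the default filesystem at an S3 bucket instead of HDFS. -->
      <property>
        <name>fs.default.name</name>
        <value>s3n://your-bucket</value>
      </property>
      <!-- Placeholder credentials for the s3n:// scheme. -->
      <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>YOUR_ACCESS_KEY_ID</value>
      </property>
      <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>YOUR_SECRET_ACCESS_KEY</value>
      </property>
    </configuration>

A quick sanity check is "hadoop fs -ls s3n://your-bucket/"; bad credentials
fail there immediately, before any MapReduce job is involved.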
>
> On 18-Jan-2012, at 12:36 PM, Mark Kerzner wrote:
>
> > That wiki page mentions hadoop-site.xml, but that is the old layout; now
> > there are core-site.xml and hdfs-site.xml, so which one does it go in?
> >
> > Thank you (and good night Central Time:)
> >
> > mark
> >
> > On Wed, Jan 18, 2012 at 12:52 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> When using S3 you do not need to run any component of HDFS at all. It
> >> is meant to be an alternate FS choice. You need to run only MR.
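
In practice that means not invoking the HDFS init scripts at all; a sketch,
assuming the hadoop-0.20 packaging seen in the log paths below (the service
names here are illustrative and vary by distribution):

    # No NameNode, SecondaryNameNode, or DataNode is needed when
    # fs.default.name points at S3 -- start only the MapReduce daemons.
    sudo service hadoop-0.20-jobtracker start    # on the master
    sudo service hadoop-0.20-tasktracker start   # on each worker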
> >>
> >> The wiki page at http://wiki.apache.org/hadoop/AmazonS3 describes how
> >> to specify your auth details to S3, either directly via the
> >> fs.default.name URI or via the additional properties
> >> fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey. Does this not work
> >> for you?
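
The URI form Harsh mentions embeds the credentials directly in
fs.default.name; a sketch with placeholder values (if the secret key
contains a "/" it breaks the URI, so the separate properties are safer):

    <property>
      <name>fs.default.name</name>
      <value>s3://ACCESS_KEY_ID:SECRET_ACCESS_KEY@your-bucket</value>
    </property>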
> >>
> >> On Wed, Jan 18, 2012 at 12:14 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >>> Well, here is my error message
> >>>
> >>> Starting Hadoop namenode daemon: starting namenode, logging to
> >>> /usr/lib/hadoop-0.20/logs/hadoop-hadoop-namenode-ip-10-126-11-26.out
> >>> ERROR. Could not start Hadoop namenode daemon
> >>> Starting Hadoop secondarynamenode daemon: starting secondarynamenode,
> >>> logging to
> >>> /usr/lib/hadoop-0.20/logs/hadoop-hadoop-secondarynamenode-ip-10-126-11-26.out
> >>> Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI
> >>> for NameNode address (check fs.default.name): s3n://myname.testdata is not
> >>> of scheme 'hdfs'.
> >>>       at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:224)
> >>>       at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:209)
> >>>       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:182)
> >>>       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:150)
> >>>       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:624)
> >>> ERROR. Could not start Hadoop secondarynamenode daemon
> >>>
> >>> And, if I don't need to start the NameNode, then where do I give the S3
> >>> credentials?
> >>>
> >>> Thank you,
> >>> Mark
> >>>
> >>>
> >>> On Wed, Jan 18, 2012 at 12:36 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Hey Mark,
> >>>>
> >>>> What is the exact trouble you run into? What do the error messages
> >>>> indicate?
> >>>>
> >>>> This should be definitive enough I think:
> >>>> http://wiki.apache.org/hadoop/AmazonS3
> >>>>
> >>>> On Wed, Jan 18, 2012 at 11:55 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> whatever I do, I can't make it work; that is, I cannot use
> >>>>>
> >>>>> s3://host
> >>>>>
> >>>>> or s3n://host
> >>>>>
> >>>>> as a replacement for HDFS while running an EC2 cluster. I change the
> >>>>> settings in core-site.xml and in hdfs-site.xml, start the Hadoop
> >>>>> services, and they fail with error messages.
> >>>>>
> >>>>> Is there a place where this is clearly described?
> >>>>>
> >>>>> Thank you so much.
> >>>>>
> >>>>> Mark
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>> Customer Ops. Engineer, Cloudera
> >>>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >> Customer Ops. Engineer, Cloudera
> >>
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
>
>