|
Alok Kumar
2012-08-02, 07:14
Harsh J
2012-08-02, 11:52
Alok Kumar
2012-08-02, 13:01
Harsh J
2012-08-02, 17:48
Alok Kumar
2012-08-03, 07:26
Harsh J
2012-08-03, 07:30
|
-
Hadoop with S3 instead of local storage
Alok Kumar 2012-08-02, 07:14
Hi,
I followed the instructions at http://wiki.apache.org/hadoop/AmazonS3 for the setup.

My core-site.xml contains only these three properties:

<property>
  <name>fs.default.name</name>
  <value>s3://BUCKET</value>
</property>
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>

hdfs-site.xml is empty!

The NameNode log says it is trying to connect to local HDFS, not S3.
Am I missing anything?

Regards,
Alok
-
Re: Hadoop with S3 instead of local storage
Harsh J 2012-08-02, 11:52
With S3 you do not need a NameNode. NameNode is part of HDFS.
--
Harsh J
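To make the point above concrete, here is a minimal sketch that reads a core-site.xml fragment like the one in the question and reports whether the default filesystem is HDFS at all. The helper names (`default_fs_scheme`, `needs_hdfs_daemons`) and the `<configuration>` wrapper around the sample are assumptions made for illustration; this is not part of Hadoop, only a way to see that an `s3://` default filesystem leaves nothing for a NameNode or DataNode to serve.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample config mirroring the question; BUCKET is a placeholder, and the
# <configuration> wrapper is assumed (the post shows only the properties).
CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://BUCKET</value>
  </property>
</configuration>"""


def default_fs_scheme(core_site_xml):
    """Return the URI scheme of fs.default.name, or None if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.default.name":
            return urlparse(prop.findtext("value", "").strip()).scheme
    return None


def needs_hdfs_daemons(core_site_xml):
    """NameNode and DataNode belong to HDFS, so only hdfs:// needs them."""
    return default_fs_scheme(core_site_xml) == "hdfs"


if __name__ == "__main__":
    print(default_fs_scheme(CORE_SITE), needs_hdfs_daemons(CORE_SITE))
```

With the configuration from the question the scheme is `s3`, so by this check the HDFS daemons should simply not be started.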
-
Re: Hadoop with S3 instead of local storage
Alok Kumar 2012-08-02, 13:01
Hi,
Thank you for the reply.

The requirement is that I need to set up a Hadoop cluster using S3 as a backup (performance won't be an issue).

My architecture is as follows:
Hive has an external table mapped to HBase. HBase stores its data in HDFS.
Hive uses Hadoop to access the HBase table data.
Can I make this work using S3?

The HBase regionserver is failing with the error
"Caused by: java.lang.ClassNotFoundException: org.jets3t.service.S3ServiceException"

The HBase master log has lots of "Unexpected response code 404, expected 200".

Do I need to start a DataNode with S3?
The DataNode log says:

2012-08-02 17:50:20,021 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = datarpm-desktop/192.168.2.4
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.1
STARTUP_MSG: build https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-08-02 17:50:20,145 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-08-02 17:50:20,156 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-08-02 17:50:20,157 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-08-02 17:50:20,157 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-08-02 17:50:20,277 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-08-02 17:50:20,281 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-08-02 17:50:20,317 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-08-02 17:50:22,006 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to <bucket-name>/67.215.65.132:8020 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy5.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:429)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:331)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:356)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
2012-08-02 17:50:22,007 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at datarpm-desktop/192.168.2.4

Thanks,
Alok
-
Re: Hadoop with S3 instead of local storage
Harsh J 2012-08-02, 17:48
Alok,
HDFS is a FileSystem. S3 is also a FileSystem. Hence, when you choose to use S3 on a node, do not attempt to start HDFS services such as the NameNode and DataNode. They have nothing to do with S3. S3 stands alone, and its configuration points to where it is running, how it is to be accessed, etc. For S3 to be available, the S3 jars should be made available to the services you wish to use it in.

Yes, you can make Hive/HBase work with S3, if S3 is configured as the fs.default.name (or fs.defaultFS in 2.x+). You can configure your core-site.xml with the right FS, and run regular "hadoop fs -ls /", etc. commands against that FS.

The library is jets3t: http://jets3t.s3.amazonaws.com/downloads.html and you'll need its jar on the HBase/Hive/etc. classpaths.

Let us know if this clears it up!

Harsh J
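One way to check the classpath advice above is to look inside the jars a service actually sees. The sketch below is a hypothetical helper (`jars_providing` is not a Hadoop or HBase tool) that scans a classpath string for jars containing a given class, such as the `org.jets3t.service.S3ServiceException` named in the ClassNotFoundException. It assumes the classpath is a plain list of paths joined with the platform separator and ignores directories and wildcard entries.

```python
import os
import zipfile


def jars_providing(class_name, classpath):
    """Return the jar files on `classpath` that contain `class_name`."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for path in classpath.split(os.pathsep):
        if not path.endswith(".jar") or not os.path.isfile(path):
            continue  # skip directories, wildcards and missing files
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                hits.append(path)
    return hits


if __name__ == "__main__":
    # Assumption: the classpath of interest has been exported as CLASSPATH.
    cp = os.environ.get("CLASSPATH", "")
    print(jars_providing("org.jets3t.service.S3ServiceException", cp))
```

An empty result for the regionserver's classpath would be consistent with the ClassNotFoundException reported earlier in the thread; a non-empty one suggests the jar is visible and the problem lies elsewhere.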
-
Re: Hadoop with S3 instead of local storage
Alok Kumar 2012-08-03, 07:26
Thank you Harsh.
That clears up my doubt about Hadoop with S3.

Q. Does HBase communicate with S3 directly, without using Hadoop?

I've put this task aside for a while and will post again.
I haven't got it working yet. The jets3t jar is present in the classpath.

Thanks,
Alok

HMaster is running.

Regionserver log:

2012-08-03 12:42:40,576 WARN org.jets3t.service.impl.rest.httpclient.RestS3Service: Response '/%2Fhbase%2F.logs%2Fslave-1%2C60020%2C1343977957962' - Unexpected response code 404, expected 200
2012-08-03 12:42:40,576 WARN org.jets3t.service.impl.rest.httpclient.RestS3Service: Response '/%2Fhbase%2F.logs%2Fslave-1%2C60020%2C1343977957962' - Received error response with XML message
2012-08-03 12:42:43,063 WARN org.jets3t.service.impl.rest.httpclient.RestS3Service: Response '/%2Fhbase%2F.logs%2Fslave-1%2C60020%2C1343977957962' - Unexpected response code 404, expected 200
2012-08-03 12:42:43,063 WARN org.jets3t.service.impl.rest.httpclient.RestS3Service: Response '/%2Fhbase%2F.logs%2Fslave-1%2C60020%2C1343977957962' - Received error response with XML message
2012-08-03 12:42:43,831 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: HLog configuration: blocksize=32 MB, rollsize=30.4 MB, enabled=true, optionallogflushinternal=1000ms
2012-08-03 12:42:43,840 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed initialization
2012-08-03 12:42:43,842 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.io.IOException: cannot get log writer
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:678)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:625)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:557)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:517)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:405)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:331)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateHLog(HRegionServer.java:1215)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1204)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:923)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:639)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:106)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:675)
    ... 10 more
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:626)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:601)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:442)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:87)
    ... 11 more
2012-08-03 12:42:43,847 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server slave-1,60020,1343977957962: Unhandled exception: cannot get log writer
java.io.IOException: cannot get log writer
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:678)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:625)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:557)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:517)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:405)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:331)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateHLog(HRegionServer.java:1215)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1204)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:923)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:639)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:106)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:675)
    ... 10 more
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:626)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:601)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:442)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:87)
    ... 11 more
2012-08-03 12:42:43,848 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2012-08-03 12:42:43,850 INFO or
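The 404 warnings in the log above carry a percent-encoded key, which is easier to reason about once decoded. The following sketch decodes it and splits the last path component into its parts. The function name `decode_wal_key` is hypothetical, and the reading of the last component as `host,port,startcode` with a start code in milliseconds since the epoch is an assumption based on how HBase usually names servers.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import unquote

# Key copied from the RestS3Service warnings in the regionserver log.
KEY = "/%2Fhbase%2F.logs%2Fslave-1%2C60020%2C1343977957962"


def decode_wal_key(key):
    """Decode the key and split its last component into host, port, start time."""
    path = unquote(key)
    host, port, startcode = path.rsplit("/", 1)[-1].split(",")
    # Assumption: the start code is milliseconds since the Unix epoch.
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    started = epoch + timedelta(milliseconds=int(startcode))
    return path, host, int(port), started


if __name__ == "__main__":
    print(decode_wal_key(KEY))
```

Under these assumptions the key points at the log directory of regionserver `slave-1` under `/hbase/.logs`. A 404 on a path that has not been written yet is plausibly just an existence check, which would leave the `createNonRecursive` exception as the failure that actually aborts the server.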
-
Re: Hadoop with S3 instead of local storage
Harsh J 2012-08-03, 07:30
Alok,
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem

This seems like a limitation imposed by HBase. Can you ask your question at [EMAIL PROTECTED] so that the right people can answer you?

Also, can you run HBase in standalone mode (no RSes)? I believe that's how it may work on S3.

Harsh J
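When a Java stack trace nests several exceptions, the last "Caused by:" entry is usually the one worth quoting, as was done above. A small sketch of pulling that entry out of a pasted trace follows. `root_cause` is a hypothetical helper, and the sample is a shortened excerpt of the regionserver log from this thread.

```python
# Shortened excerpt of the regionserver trace quoted in this thread.
TRACE = """java.io.IOException: cannot get log writer
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:678)
Caused by: java.io.IOException: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:675)
    ... 10 more
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class org.apache.hadoop.fs.s3.S3FileSystem
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:626)
    ... 11 more
"""

PREFIX = "Caused by: "


def root_cause(trace):
    """Return (exception class, message) of the last 'Caused by:' line, or None."""
    causes = [line.strip()[len(PREFIX):]
              for line in trace.splitlines()
              if line.strip().startswith(PREFIX)]
    if not causes:
        return None
    # The text before the first ": " is the exception class name.
    cls, _, message = causes[-1].partition(": ")
    return cls, message


if __name__ == "__main__":
    print(root_cause(TRACE))
```

Here the innermost cause is raised from `FileSystem.createNonRecursive` with `S3FileSystem` named in the message, which points at the filesystem class not supporting that call rather than at a credentials or classpath problem.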
|