-
Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Victor Sanchez 2012-06-27, 15:32
Hi,
I have a test cluster that runs RHEL6. I installed Cloudera Manager 4 (which includes CDH4), and I had installed Sqoop.

# sqoop version
Sqoop 1.4.1-cdh4.0.0
git commit id 44ef1bef07d93e3fcf79bdc1150de6c278ad7845
Compiled by jenkins on Mon Jun 4 17:43:14 PDT 2012

After all the installation and configuration, I ran into the problem of not being able to run sqoop import. I figured out that there is a bug in the MS SQL Connector for SQL Server 2008 R2 (https://issues.apache.org/jira/browse/SQOOP-480).

So I checked out the code:

svn co https://svn.apache.org/repos/asf/sqoop/trunk/ sqoop

Then I built the project by executing ant. As a result I got two jar files (inside the build folder):

sqoop-1.4.2-incubating-SNAPSHOT.jar
sqoop-test-1.4.2-incubating-SNAPSHOT.jar

After this I used these files to replace the ones in the existing Sqoop installation: I removed the jar files in /usr/lib/sqoop/ (sqoop-1.4.1-cdh4.0.0.jar and sqoop-test-1.4.1-cdh4.0.0.jar) and replaced them with the files above.

After that I get:

# sqoop version
Sqoop 1.4.2-incubating-SNAPSHOT
git commit id
Compiled by victor.sanchez on Wed Jun 27 10:33:01 EDT 2012

But when I tried to run list-tables, it fails like this:

# sqoop list-tables --connect 'jdbc:sqlserver://hadooptest01;username=victor;password=victor;database=hadoopDB_SQL'
12/06/27 16:18:29 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1937)
    at org.apache.sqoop.ConnFactory.addManagersFromFile(ConnFactory.java:152)
    at org.apache.sqoop.ConnFactory.loadManagersFromConfDir(ConnFactory.java:224)
    at org.apache.sqoop.ConnFactory.instantiateFactories(ConnFactory.java:83)
    at org.apache.sqoop.ConnFactory.<init>(ConnFactory.java:60)
    at com.cloudera.sqoop.ConnFactory.<init>(ConnFactory.java:36)
    at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:200)
    at org.apache.sqoop.tool.ListTablesTool.run(ListTablesTool.java:44)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
    at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)

Notice that if I put back the "old" jar files, sqoop list-tables works, but of course the incompatibility bug (https://issues.apache.org/jira/browse/SQOOP-480) is still there.

If anyone has an idea of how to update my current Sqoop installation with my manual build, I would appreciate any tip.

Thanks in advance!

/Victor

Victor Sanchez
Database Architect
Net Entertainment NE AB, Luntmakargatan 18, SE-111 37, Stockholm, SE
-
Re: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Cheolsoo Park 2012-06-27, 16:01
Hi Victor,
> at org.apache.sqoop.ConnFactory.addManagersFromFile(ConnFactory.java:152)

I suspect that the error you're seeing is a regression of SQOOP-505. Can you please show what the content of your connector file looks like? For example,

com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory=/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar

Thanks,
Cheolsoo
-
Re: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Cheolsoo Park 2012-06-27, 16:38
Hi Victor,
I was able to reproduce your error by having the following connector file in the managers.d dir (/etc/sqoop/conf/managers.d):

com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory

Can you please double-check whether you have any file in the managers.d directory that doesn't contain key-value pairs? If you do, that should be the problem.

Thanks,
Cheolsoo
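A quick way to run the double-check suggested here is to scan the connector directory for entries that are not key=value pairs. The following is only a minimal sketch in Python, not Sqoop's own parsing code: the function names are hypothetical, and skipping blank lines and lines starting with '#' is an assumption. The reported "String index out of range: -1" would be consistent with a lookup for the '=' separator finding nothing, but the thread does not show the Sqoop source.

```python
import os


def malformed_entries(lines):
    """Return (line number, text) for each entry that is not 'key=value'.

    Blank lines and lines starting with '#' are skipped; treating '#' as a
    comment marker is an assumption of this sketch, not confirmed Sqoop
    behaviour.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        # An entry needs a separator with something on both sides of it.
        if not separator or not key.strip() or not value.strip():
            bad.append((number, line))
    return bad


def scan_conf_dir(conf_dir):
    """Map file name -> malformed entries for each regular file in conf_dir."""
    report = {}
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                bad = malformed_entries(handle)
            if bad:
                report[name] = bad
    return report
```

Calling scan_conf_dir("/etc/sqoop/conf/managers.d") on the affected host would list every file and line that lacks the separator; an empty result means every entry at least has the key=value shape.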
-
RE: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Victor Sanchez 2012-06-28, 09:49
Hi Cheolsoo,
Well, as you mentioned, there was com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory inside /etc/sqoop/conf/managers.d/

I removed it and now I can actually connect and list the tables, but ...

$ sqoop list-tables --connect 'jdbc:sqlserver://hadooptest01;username=victor;password=victor;database=hadoopSQL01'
12/06/28 13:33:38 INFO manager.SqlManager: Using default fetchSize of 1000
Table1
Table2
Table3
...

If I try to import, I run into another issue.

$ sqoop import --connect 'jdbc:sqlserver://hadooptest01;username=victor;password=victor;database=hadoopSQL01' --table Table1 --target-dir /test/Table1
12/06/28 13:33:07 INFO manager.SqlManager: Using default fetchSize of 1000
12/06/28 13:33:07 INFO tool.CodeGenTool: Beginning code generation
12/06/28 13:33:08 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM Table1 AS t WHERE 1=0
12/06/28 13:33:08 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
Note: /tmp/sqoop-victor.sanchez/compile/5567c0bfbd9fd8af0ab8b0715c2245d3/Table1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/06/28 13:33:12 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-victor.sanchez/compile/5567c0bfbd9fd8af0ab8b0715c2245d3/Table1.jar
12/06/28 13:33:13 INFO mapreduce.ImportJobBase: Beginning import of Table1
12/06/28 13:33:13 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/28 13:33:14 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/06/28 13:33:16 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/06/28 13:33:16 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "hadooptest-01.mydomain:8021"
12/06/28 13:33:16 ERROR security.UserGroupInformation: PriviledgedActionException as:victor.sanchez (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
12/06/28 13:33:16 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1196)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1192)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1191)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1220)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:202)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:465)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:403)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
    at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)

I have checked that all the services are running (the JobTracker is up and I can actually see the history of jobs triggered from Hue) and that my user has all the permissions to write to the HDFS directory that is being used as the target.

After searching on Google for a while, I figured out that there might be another bug going on (but that one involves MapReduce on YARN, and I'm using MRv1, not YARN).

If you have any tip for fixing this, it will be highly appreciated!

/Victor
-
Re: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Cheolsoo Park 2012-06-28, 21:19
Hi Victor,
> 12/06/28 13:33:16 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "hadooptest-01.mydomain:8021"
> 12/06/28 13:33:16 ERROR security.UserGroupInformation: PriviledgedActionException as:victor.sanchez (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
> 12/06/28 13:33:16 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

The exception is thrown because Sqoop is assuming local mode while Hadoop is configured in cluster mode. Provided that you're running Hadoop in cluster mode, the question is now why Sqoop assumes local mode.

Would you mind providing the content of the following config files that CM4 generated for you in /etc/hadoop/conf?

- core-site.xml
- hdfs-site.xml
- mapred-site.xml

The Apache mailing list strips off any attached files, so you will have to copy-and-paste them into an email.

Thanks a lot,
Cheolsoo
-
RE: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Victor Sanchez 2012-06-29, 08:02
Hi,
Here are the files you requested.

$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2012-06-28T08:44:35.321Z-->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadooptest-01.mydomain:8020</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec, org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>DEFAULT</value>
  </property>
</configuration>

$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2012-06-28T08:44:35.315Z-->
<configuration>
  <property>
    <name>dfs.https.address</name>
    <value>hadooptest-01.mydomain:50470</value>
  </property>
  <property>
    <name>dfs.https.port</name>
    <value>50470</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadooptest-01.mydomain:50070</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>false</value>
  </property>
</configuration>

$ cat mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2012-06-28T08:44:35.319Z-->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value> hadooptest-01.mydomain:8021</value>
  </property>
  <property>
    <name>mapred.output.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.output.compression.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapred.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>64</value>
  </property>
  <property>
    <name>io.sort.record.percent</name>
    <value>0.05</value>
  </property>
  <property>
    <name>io.sort.spill.percent</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.submit.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>115</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value> -Xmx485373580</value>
  </property>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.8</value>
  </property>
</configuration>

Thanks for the help!

/Victor
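When reading files like these by eye it is easy to miss a property, so a small script can help confirm what the client configuration actually defines. The following is a minimal sketch in Python, assuming only the standard name/value layout shown in the files above; load_properties and values_with_padding are hypothetical helper names, and the sample is a two-property excerpt of the pasted mapred-site.xml. It reports whether mapreduce.framework.name (the property named in the error message) is set, and flags values with surrounding whitespace, such as the mapred.job.tracker value as pasted. Whether either observation explains the failure is not established in this thread.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict.

    Values are kept exactly as written so stray whitespace stays visible.
    """
    props = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue
        props[name.strip()] = prop.findtext("value") or ""
    return props


def values_with_padding(props):
    """Return the properties whose value has leading or trailing whitespace."""
    return {k: v for k, v in props.items() if v != v.strip()}


# Excerpt of the mapred-site.xml pasted in the message above.
SAMPLE = """<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value> hadooptest-01.mydomain:8021</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
  </property>
</configuration>"""

props = load_properties(SAMPLE)
print("mapreduce.framework.name:", props.get("mapreduce.framework.name", "<not set>"))
print("values with padding:", values_with_padding(props))
```

For the real files, the text passed to load_properties would come from reading /etc/hadoop/conf/mapred-site.xml and its siblings.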
-
Re: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server Connector
Cheolsoo Park 2012-06-30, 02:03
Hi Victor,
I can't find anything wrong in your config files. In fact, I tried to install CDH4 via CM like you did, but I couldn't reproduce your error.

I am wondering if you happen to have other config files in Sqoop's classpath. If you enable the --verbose option, Sqoop prints the classpath to the console. Do you see any conf directories other than "/etc/hadoop/conf" in the classpath? If you have conflicting config files, you can run into this kind of problem.

Thanks,
Cheolsoo
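The classpath check described here can also be done mechanically once the classpath string has been copied from the --verbose output. The following is a minimal sketch in Python, under the assumptions that the classpath is a single string separated by the platform path separator and that the interesting files are named *-site.xml; read_site_file and find_conflicts are hypothetical names, and the script is not part of Sqoop or Hadoop.

```python
import glob
import os
import xml.etree.ElementTree as ET


def read_site_file(path):
    """Return {name: value} for one Hadoop-style *-site.xml file."""
    props = {}
    for prop in ET.parse(path).getroot().findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def find_conflicts(classpath):
    """Map property name -> {directory: value} for every property that is
    defined with different values in different classpath directories."""
    seen = {}
    for entry in classpath.split(os.pathsep):
        if not os.path.isdir(entry):
            continue  # jars, wildcard entries and missing paths are skipped
        for path in sorted(glob.glob(os.path.join(entry, "*-site.xml"))):
            for name, value in read_site_file(path).items():
                seen.setdefault(name, {})[entry] = value
    # Keep only properties whose directories disagree on the value.
    return {n: v for n, v in seen.items() if len(set(v.values())) > 1}
```

An empty result from find_conflicts means that no two directories on the classpath disagree about a property; a non-empty one names the property and shows which directory supplies which value.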
-
RE: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server ConnectorVictor Sanchez 2012-07-02, 14:21
Hello Cheolsoo,
I was trying to dig more into the problem. So far I have just noticed that, since I'm using Hadoop version 2.0, there are some changes, at least in the way you make calls to HDFS. Plus there are some extra directories not present in previous versions. So I was wondering if that could be the problem.

$ hadoop dfs -ls /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 5 items
drwxrwxrwx - hue cloudAD 0 2012-06-28 12:41 /SQL_SQOOP
...

$ hdfs dfs -ls /
Found 5 items
drwxrwxrwx - hue cloudAD 0 2012-06-28 12:41 /SQL_SQOOP
...

$ ls -all /usr/lib/ | grep hadoop
drwxr-xr-x 10 root root 4.0K Jun 15 18:42 hadoop/
-rw-r--r-- 1 root root 1.7M Jun 21 18:11 hadoop-0.20-
drwxr-xr-x 12 root root 4.0K Jun 15 18:42 hadoop-0.20-mapreduce/
drwxr-xr-x 7 root root 4.0K Jun 15 18:35 hadoop-hdfs/
drwxr-xr-x 6 root root 4.0K Jun 15 18:37 hadoop-httpfs/
drwxr-xr-x 6 root root 4.0K Jun 15 18:39 hadoop-mapreduce/
drwxr-xr-x 7 root root 4.0K Jun 15 18:39 hadoop-yarn/

$ ls -all /usr/lib/hadoop
total 3.3M
drwxr-xr-x 10 root root 4.0K Jun 15 18:42 ./
dr-xr-xr-x. 40 root root 4.0K Jun 26 14:37 ../
drwxr-xr-x 2 root root 4.0K Jun 15 18:34 bin/
drwxr-xr-x 2 root root 4.0K Jun 15 18:42 client/
drwxr-xr-x 2 root root 4.0K Jun 15 18:42 client-0.20/
drwxr-xr-x 2 root root 4.0K Jun 15 19:07 cloudera/
drwxr-xr-x 2 root root 4.0K Jun 15 18:34 etc/
-rw-r--r-- 1 root root 17K Jun 5 02:09 hadoop-annotations-2.0.0-cdh4.0.0.jar
lrwxrwxrwx 1 root root 37 Jun 15 18:34 hadoop-annotations.jar -> hadoop-annotations-2.0.0-cdh4.0.0.jar
-rw-r--r-- 1 root root 43K Jun 5 02:09 hadoop-auth-2.0.0-cdh4.0.0.jar
lrwxrwxrwx 1 root root 30 Jun 15 18:34 hadoop-auth.jar -> hadoop-auth-2.0.0-cdh4.0.0.jar
-rw-r--r-- 1 root root 2.1M Jun 5 02:09 hadoop-common-2.0.0-cdh4.0.0.jar
-rw-r--r-- 1 root root 1.1M Jun 5 02:09 hadoop-common-2.0.0-cdh4.0.0-tests.jar
lrwxrwxrwx 1 root root 32 Jun 15 18:34 hadoop-common.jar -> hadoop-common-2.0.0-cdh4.0.0.jar
drwxr-xr-x 3 root root 4.0K Jun 15 19:07 lib/
drwxr-xr-x 2 root root 4.0K Jun 15 18:39 libexec/
drwxr-xr-x 2 root root 4.0K Jun 15 18:34 sbin/

$ ls -all /usr/lib/hadoop-hdfs/
total 5.2M
drwxr-xr-x 7 root root 4.0K Jun 15 18:35 ./
dr-xr-xr-x. 40 root root 4.0K Jun 26 14:37 ../
drwxr-xr-x 2 root root 4.0K Jul 2 15:25 bin/
drwxr-xr-x 2 root root 4.0K Jun 15 18:35 cloudera/
-rw-r--r-- 1 root root 3.8M Jun 5 02:09 hadoop-hdfs-2.0.0-cdh4.0.0.jar
-rw-r--r-- 1 root root 1.4M Jun 5 02:09 hadoop-hdfs-2.0.0-cdh4.0.0-tests.jar
lrwxrwxrwx 1 root root 30 Jun 15 18:35 hadoop-hdfs.jar -> hadoop-hdfs-2.0.0-cdh4.0.0.jar
drwxr-xr-x 2 root root 4.0K Jun 15 18:35 lib/
drwxr-xr-x 2 root root 4.0K Jun 15 18:35 sbin/
drwxr-xr-x 5 root root 4.0K Jun 15 18:35 webapps/

Here is my env configuration:

$ env
MSSQL_CONNECTOR_HOME=/usr/lib/sqoop-sqlserver-1.0
HOSTNAME=hadooptest01.mydom
HADOOP_HOME=/usr/lib/hadoop
SQOOP_CONF_DIR=/usr/lib/sqoop/conf
SQOOP_HOME=/usr/lib/sqoop
USER=victor.sanchez
PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/java/jdk1.6.0_31/:/usr/lib/pig/conf:/usr/lib/hadoop-mapreduce:/usr/lib/hadoop/bin:/usr/lib/sqoop:/usr/lib/sqoop/conf:/usr/lib/hive:/usr/lib/sqoop-sqlserver-1.0:/home/victor.sanchez/bin
HIVE_HOME=/usr/lib/hive
PIG_CONF_DIR=/usr/lib/pig/conf
JAVA_HOME=/usr/java/jdk1.6.0_31/
HOME=/home/victor.sanchez
HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

Here is what I get when running sqoop import with the -verbose option:

$ sqoop import --connect 'jdbc:sqlserver://hadooptest01.mydom;username=victordeploy;password=victordeploy;database=dmrs_gp_ftcl01' --table TestProduct --target-dir /test/TestProduct -verbose
12/07/02 18:00:46 DEBUG tool.BaseSqoopTool: Enabled debug logging.
12/07/02 18:00:46 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
12/07/02 18:00:46 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
12/07/02 18:00:46 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:sqlserver:
12/07/02 18:00:46 INFO manager.SqlManager: Using default fetchSize of 1000
12/07/02 18:00:46 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.SQLServerManager@4760a26f
12/07/02 18:00:46 INFO tool.CodeGenTool: Beginning code generation
12/07/02 18:00:46 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
12/07/02 18:00:47 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
12/07/02 18:00:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM TestProduct AS t WHERE 1=0
12/07/02 18:00:47 DEBUG orm.ClassWriter: selected columns:
12/07/02 18:00:47 DEBUG orm.ClassWriter: Name
12/07/02 18:00:47 DEBUG orm.ClassWriter: Value
12/07/02 18:00:47 DEBUG orm.ClassWriter: Writing source file: /tmp/sqoop-victor.sanchez/compile/0b892db46ba494825e8e42f970335d83/TestProduct.java
12/07/02 18:00:47 DEBUG orm.ClassWriter: Table name: TestProduct
12/07/02 18:00:47 DEBUG orm.ClassWriter: Columns: Name:12, Value:12,
12/07/02 18:00:47 DEBUG orm.ClassWriter: sourceFilename is TestProduct.java
12/07/02 18:00:47 DEBUG orm.CompilationManager: Found existing /tmp/sqoop-victor.sanchez/compile/0b892db46ba494825e8e42f970335d83/
12/07/02 18:00:47 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
12/07/02 18:00:47 DEBUG orm.CompilationManager: Adding source file: /tmp/sqoop-victor.sanchez/compile/0b892db46ba494825e8e42f970335d83/TestProduct.java
12/07/02 18:00:47 DEBUG orm.CompilationManager: Invoking javac with args:
12/07/02 18:00:47 DEBUG orm.CompilationManager: -sourcepath
12/07/02 18:00:47 DEBUG orm.CompilationManager: /tmp/sqoop-victor.sanchez/compile/0b892db46ba494825e8e42f970335d83/
12/07/02 18:00:47 DEBUG orm.CompilationManager: -d
12/07/02 18:00:47 DEBUG orm.CompilationManager: /tmp/sqoop-vic
-
Re: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server ConnectorCheolsoo Park 2012-07-02, 22:04
Hi Victor,
Thank you for providing all the information. I think I know what the problem is. HADOOP_MAPRED_HOME is set to "hadoop-mapreduce" (MR2) instead of "hadoop-0.20-mapreduce" (MR1):

    HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

As a result, MR2 jars are added to Sqoop's classpath instead of MR1 jars, and Sqoop ends up calling MR2 Hadoop at runtime. Even if you don't add YARN to your cluster in CM, both MR1 and MR2 are always installed, and HADOOP_MAPRED_HOME must point to the one that you're using. So in your case, it must be set to MR1:

    HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce

Now I am wondering if you have HBase installed on the machine where you're running Sqoop. If so, that would explain why HADOOP_MAPRED_HOME is set to MR2. If you look at "/usr/lib/sqoop/bin/configure-sqoop", it invokes "hbase classpath" if HBase is installed:

    if [ -e "$HBASE_HOME/bin/hbase" ]; then
      TMP_SQOOP_CLASSPATH=${SQOOP_CLASSPATH}:`$HBASE_HOME/bin/hbase classpath`
      SQOOP_CLASSPATH=${TMP_SQOOP_CLASSPATH}
    fi

In turn, "/usr/bin/hbase" executes the following line:

    . /etc/default/hadoop

And "/etc/default/hadoop" always sets HADOOP_MAPRED_HOME to MR2:

    export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

In fact, this is a packaging bug in CDH4 as far as I am concerned, because regardless of whether the user sets HADOOP_MAPRED_HOME to MR1 or MR2, it will be reset to MR2 by "/usr/bin/hbase". I am going to open an internal bug report for this.

As a workaround, you could edit the following line in "/etc/default/hadoop" from

    export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

to

    export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce

If you're going to use MR1 only, this shouldn't cause any problem. Please let me know if this solves your problem.

Thanks!
Cheolsoo
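The override described in this message can be reproduced without touching a real installation. The sketch below is a minimal illustration, assuming only standard shell behaviour: a temporary file stands in for "/etc/default/hadoop", and `write_defaults` and `value_after_sourcing` are hypothetical helper names, not part of Sqoop, HBase, or CDH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a one-line stand-in for /etc/default/hadoop.
# $1 = file to write, $2 = value to export for HADOOP_MAPRED_HOME.
write_defaults() {
  printf 'export HADOOP_MAPRED_HOME=%s\n' "$2" > "$1"
}

# Hypothetical helper: print the value of HADOOP_MAPRED_HOME seen by a shell
# that first sets it to $2 and then sources the defaults file $1.
# The subshell keeps the caller's environment untouched.
value_after_sourcing() {
  (
    export HADOOP_MAPRED_HOME="$2"
    . "$1"
    printf '%s\n' "$HADOOP_MAPRED_HOME"
  )
}

defaults_file="$(mktemp)"

# The user asks for MR1, but the defaults file exports MR2: MR2 wins.
write_defaults "$defaults_file" /usr/lib/hadoop-mapreduce
value_after_sourcing "$defaults_file" /usr/lib/hadoop-0.20-mapreduce
# prints /usr/lib/hadoop-mapreduce

# After the workaround (defaults file edited to MR1), MR1 survives.
write_defaults "$defaults_file" /usr/lib/hadoop-0.20-mapreduce
value_after_sourcing "$defaults_file" /usr/lib/hadoop-0.20-mapreduce
# prints /usr/lib/hadoop-0.20-mapreduce

rm -f "$defaults_file"
```

The point of the sketch is only that a sourced `export` replaces whatever the caller had set in that shell, which is why editing the defaults file is the suggested workaround.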
-
RE: Sqoop 1.4.2 checkout from trunk (installation problem) -sqoop 1.4.1 incompatible with MSSQL Server ConnectorVictor Sanchez 2012-07-03, 07:07
Hello Cheolsoo!
Good news, it works! I just replaced my env variable as you said:

    HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce

Now the sqoop import works without any problem.

I'm not using HBase. The mistake was mine: I guess I misunderstood some part of the configuration manual and set HADOOP_MAPRED_HOME to work with MR2.

Thank you very much for all the help with my case!

Cheers!
/Victor
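For anyone hitting the same symptom, a quick sanity check before running sqoop import might look like the sketch below. It relies only on the two CDH4 directory names quoted in this thread; `mapred_flavor` is a hypothetical helper name, and other distributions or versions may lay out their directories differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a HADOOP_MAPRED_HOME value as MR1 or MR2,
# based on the CDH4 directory names mentioned in this thread.
mapred_flavor() {
  # ${1%/} drops a single trailing slash so both spellings are accepted.
  case "${1%/}" in
    */hadoop-0.20-mapreduce) echo "MR1" ;;
    */hadoop-mapreduce)      echo "MR2" ;;
    *)                       echo "unknown" ;;
  esac
}

# Report what the current environment points at (prints "unknown" if unset).
mapred_flavor "${HADOOP_MAPRED_HOME:-}"
```

On an MR1-only cluster like the one discussed here, anything other than "MR1" would suggest the variable has been set or reset somewhere along the way.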