Re: Incremental import from PostgreSQL to Hive having issues
Roshan Pradeep 2012-04-13, 12:42
Hadoop - 0.20.2
Hive - 0.8.1

Thanks.

On Fri, Apr 13, 2012 at 5:03 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:

> can you tell us what is
> 1) hive version
> 2) hadoop version that you are using?
>
> On Fri, Apr 13, 2012 at 12:23 PM, Roshan Pradeep <[EMAIL PROTECTED]> wrote:
>
>> Hi
>>
>> I want to import the updated data from my source (PostgreSQL) to hive
>> based on a column (lastmodifiedtime) in postgreSQL
>>
>> *The command I am using*
>>
>> /app/sqoop/bin/sqoop import --hive-table users --connect
>> jdbc:postgresql:/<server_url>/<database> --table users --username XXXXXXX
>> --password YYYYYY --hive-home /app/hive --hive-import --incremental
>> lastmodified --check-column lastmodifiedtime
>>
>> *With the above command, I am getting the below error*
>>
>> 12/04/13 16:31:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/11ce8600a5656ed49e631a260c387692/users.jar
>> 12/04/13 16:31:21 INFO tool.ImportTool: Incremental import based on column "lastmodifiedtime"
>> 12/04/13 16:31:21 INFO tool.ImportTool: Upper bound value: '2012-04-13 16:31:21.865429'
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
>> 12/04/13 16:31:21 INFO mapreduce.ImportJobBase: Beginning import of users
>> 12/04/13 16:31:23 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory users already exists
>>     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
>>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>>     at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
>>     at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
>>     at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
>>     at org.apache.sqoop.manager.PostgresqlManager.importTable(PostgresqlManager.java:102)
>>     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
>>     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>>     at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>>     at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>>     at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
>>
>> According to the above, it identify the updated data from postgreSQL, but
>> it says output directory already exists. Could someone please help me to
>> correct this issue.
>>
>> Thanks.
>>
>
> --
> Nitin Pawar