Roshan Pradeep 2012-04-13, 06:53
Re: Incremental import from PostgreSQL to Hive having issues
Nitin Pawar 2012-04-13, 07:03
Can you tell us which versions you are using?
1) Hive version
2) Hadoop version

On Fri, Apr 13, 2012 at 12:23 PM, Roshan Pradeep <[EMAIL PROTECTED]> wrote:

> Hi
>
> I want to import the updated data from my source (PostgreSQL) to hive
> based on a column (lastmodifiedtime) in postgreSQL
>
> *The command I am using*
>
> /app/sqoop/bin/sqoop import --hive-table users --connect
> jdbc:postgresql:/<server_url>/<database> --table users --username XXXXXXX
> --password YYYYYY --hive-home /app/hive --hive-import --incremental
> lastmodified --check-column lastmodifiedtime
>
> *With the above command, I am getting the below error*
>
> 12/04/13 16:31:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/11ce8600a5656ed49e631a260c387692/users.jar
> 12/04/13 16:31:21 INFO tool.ImportTool: Incremental import based on column "lastmodifiedtime"
> 12/04/13 16:31:21 INFO tool.ImportTool: Upper bound value: '2012-04-13 16:31:21.865429'
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
> 12/04/13 16:31:21 INFO mapreduce.ImportJobBase: Beginning import of users
> 12/04/13 16:31:23 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory users already exists
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>     at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
>     at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
>     at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
>     at org.apache.sqoop.manager.PostgresqlManager.importTable(PostgresqlManager.java:102)
>     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
>     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>     at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>     at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>     at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
>
> According to the above, it identify the updated data from postgreSQL, but
> it says output directory already exists. Could someone please help me to
> correct this issue.
>
> Thanks.
>

--
Nitin Pawar
Roshan Pradeep 2012-04-13, 12:42
Nitin Pawar 2012-04-13, 13:13
Roshan Pradeep 2012-04-15, 22:56
Nitin Pawar 2012-04-16, 05:48
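
The messages shown here do not record a confirmed fix. The exception itself says that the job's HDFS output directory ("users", resolved relative to the submitting user's home directory) is left over from an earlier run, and Hadoop's FileOutputFormat refuses to write into an existing directory. As an untested sketch only, one way to sidestep that is to point the import at a directory that does not exist yet with --target-dir and to pass the previous upper bound with --last-value. The staging path and timestamp below are hypothetical placeholders, the connect string is written with "//" (the usual PostgreSQL JDBC form), and whether "--incremental lastmodified" combines cleanly with "--hive-import" depends on the Sqoop version, which the thread does not state. The script only prints the command so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Dry-run sketch: prints a Sqoop command instead of executing it.
# Assumptions (NOT from the thread): TARGET_DIR and LAST_VALUE are placeholders.

TARGET_DIR="/tmp/sqoop-staging/users_incremental"  # hypothetical, must not exist in HDFS yet
LAST_VALUE="2012-04-13 00:00:00"                   # hypothetical upper bound of the previous run

build_sqoop_cmd() {
  # Same options as the command quoted in the thread,
  # plus --last-value and --target-dir.
  echo "/app/sqoop/bin/sqoop import" \
       "--connect jdbc:postgresql://<server_url>/<database>" \
       "--table users --username XXXXXXX --password YYYYYY" \
       "--hive-home /app/hive --hive-import --hive-table users" \
       "--incremental lastmodified --check-column lastmodifiedtime" \
       "--last-value '$LAST_VALUE'" \
       "--target-dir $TARGET_DIR"
}

# Alternative (also an assumption): remove the stale directory first,
# e.g. "hadoop fs -rmr users" on Hadoop 1.x, then rerun the original command.

build_sqoop_cmd
```

Replace the placeholders in the printed command before running it. Choosing a fresh --target-dir leaves the existing "users" directory untouched, which is the more cautious of the two options if its contents are still needed.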