Hive, mail # user - Incremental import from PostgreSQL to Hive having issues


Roshan Pradeep 2012-04-13, 06:53
Re: Incremental import from PostgreSQL to Hive having issues
Nitin Pawar 2012-04-13, 07:03
Can you tell us which versions you are using?
1) Hive version
2) Hadoop version

On Fri, Apr 13, 2012 at 12:23 PM, Roshan Pradeep <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I want to import the updated data from my source (PostgreSQL) into Hive,
> based on a column (lastmodifiedtime) in PostgreSQL.
>
> *The command I am using*
>
> /app/sqoop/bin/sqoop import --hive-table users --connect
> jdbc:postgresql://<server_url>/<database> --table users --username XXXXXXX
> --password YYYYYY --hive-home /app/hive --hive-import --incremental
> lastmodified --check-column lastmodifiedtime
>
> *With the above command, I am getting the below error*
>
> 12/04/13 16:31:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/11ce8600a5656ed49e631a260c387692/users.jar
> 12/04/13 16:31:21 INFO tool.ImportTool: Incremental import based on column "lastmodifiedtime"
> 12/04/13 16:31:21 INFO tool.ImportTool: Upper bound value: '2012-04-13 16:31:21.865429'
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
> 12/04/13 16:31:21 INFO mapreduce.ImportJobBase: Beginning import of users
> 12/04/13 16:31:23 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory users already exists
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
>         at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
>         at org.apache.sqoop.manager.PostgresqlManager.importTable(PostgresqlManager.java:102)
>         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
>
> According to the log above, Sqoop identifies the updated data in PostgreSQL,
> but the job then fails because the output directory "users" already exists.
> Could someone please help me fix this?
>
> Thanks.
>

--
Nitin Pawar
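
The stack trace quoted above shows the job being rejected in FileOutputFormat.checkOutputSpecs because an output directory named "users" is already present. As a rough illustration only, the Python sketch below imitates that kind of pre-flight check on the local filesystem. The names check_output_specs, run_import and FileAlreadyExistsError are hypothetical stand-ins, not Hadoop or Sqoop APIs, and where the existing directory came from is not established in this thread.

```python
import os
import tempfile


class FileAlreadyExistsError(Exception):
    """Hypothetical local stand-in for Hadoop's FileAlreadyExistsException."""


def check_output_specs(output_dir):
    # Imitates the idea of the check seen in the stack trace:
    # refuse to start a job whose output directory is already there.
    if os.path.exists(output_dir):
        raise FileAlreadyExistsError(
            "Output directory %s already exists" % output_dir
        )


def run_import(output_dir):
    # Hypothetical "job": validate the output location, then create it.
    check_output_specs(output_dir)
    os.makedirs(output_dir)
    return output_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "users")
        run_import(target)          # first run: directory does not exist yet
        try:
            run_import(target)      # second run: directory is left over
        except FileAlreadyExistsError as err:
            print("second run rejected:", err)
```

In this sketch the first run creates the directory and the second run is rejected before doing any work, which matches the shape of the failure in the log: the import is refused at job submission, after the incremental bounds have already been computed.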
Roshan Pradeep 2012-04-13, 12:42
Nitin Pawar 2012-04-13, 13:13
Roshan Pradeep 2012-04-15, 22:56
Nitin Pawar 2012-04-16, 05:48