Hive user mailing list: Incremental import from PostgreSQL to Hive having issues


Roshan Pradeep 2012-04-13, 06:53
Re: Incremental import from PostgreSQL to Hive having issues
Can you tell us which versions you are using?
1) Hive
2) Hadoop

On Fri, Apr 13, 2012 at 12:23 PM, Roshan Pradeep <[EMAIL PROTECTED]> wrote:

> Hi
>
> I want to import the updated data from my source (PostgreSQL) into Hive
> based on a column (lastmodifiedtime) in PostgreSQL.
>
> *The command I am using*
>
> /app/sqoop/bin/sqoop import --hive-table users --connect
> jdbc:postgresql://<server_url>/<database> --table users --username XXXXXXX
> --password YYYYYY --hive-home /app/hive --hive-import --incremental
> lastmodified --check-column lastmodifiedtime
>
> *With the above command, I am getting the below error*
>
> 12/04/13 16:31:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/11ce8600a5656ed49e631a260c387692/users.jar
> 12/04/13 16:31:21 INFO tool.ImportTool: Incremental import based on column "lastmodifiedtime"
> 12/04/13 16:31:21 INFO tool.ImportTool: Upper bound value: '2012-04-13 16:31:21.865429'
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
> 12/04/13 16:31:21 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
> 12/04/13 16:31:21 INFO mapreduce.ImportJobBase: Beginning import of users
> 12/04/13 16:31:23 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory users already exists
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
>         at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
>         at org.apache.sqoop.manager.PostgresqlManager.importTable(PostgresqlManager.java:102)
>         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
>
> According to the above, it identifies the updated data in PostgreSQL, but
> then reports that the output directory already exists. Could someone please
> help me correct this issue?
>
> Thanks.
>

--
Nitin Pawar
Roshan Pradeep 2012-04-13, 12:42
Nitin Pawar 2012-04-13, 13:13
Roshan Pradeep 2012-04-15, 22:56
Nitin Pawar 2012-04-16, 05:48
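
Note on the FileAlreadyExistsException quoted above: the message names a relative HDFS path, "users", which matches the table name. One possible workaround, sketched here under assumptions and not taken from the thread, is to remove that directory before re-running the import. This assumes the directory is a leftover from an earlier Sqoop run and holds nothing that still needs to be loaded into Hive. The function name clear_output_dir and the HADOOP variable are hypothetical helpers introduced for this illustration.

```shell
#!/bin/sh
# Sketch: clear a stale Sqoop output directory before re-running an import.
# Assumption: the directory is a leftover from an earlier run and is safe
# to delete. Verify its contents first on a real cluster.

# HADOOP is overridable so the logic can be dry-run without a cluster.
HADOOP="${HADOOP:-hadoop}"

# Remove the given HDFS directory only if it exists.
clear_output_dir() {
    dir="$1"
    if "$HADOOP" fs -test -d "$dir"; then
        echo "removing stale output directory: $dir"
        # -rmr is the recursive remove in Hadoop releases of that period;
        # newer releases use "fs -rm -r" instead.
        "$HADOOP" fs -rmr "$dir"
    else
        echo "no stale output directory: $dir"
    fi
}
```

With this loaded, calling clear_output_dir users before the sqoop import command would remove the conflicting directory if it is present. Whether that is the right fix depends on the Hive, Hadoop, and Sqoop versions in use, which the reply above asks about.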