Would this help?
https://issues.apache.org/jira/browse/SQOOP-390

Thanks
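
For anyone comparing the counts in the job output quoted below, here is a minimal Python sketch that parses the "Exported" and "Migrated" log lines and reports how many more rows reached the destination than were read from the CSV. It is not part of Sqoop; the function name `surplus_rows` and the `LOG` constant are made up for illustration, and the two log lines are copied from the quoted output.

```python
import re

# Two log lines copied from the job output quoted in this thread.
LOG = """\
12/09/14 04:13:52 INFO mapreduce.ExportJobBase: Exported 4071315 records.
12/09/14 04:14:29 INFO manager.SqlManager: Migrated 5321315 records from daily_tmp to daily
"""


def surplus_rows(log: str) -> int:
    """Return (migrated - exported) as parsed from Sqoop export output.

    A positive result means the staging table held more rows than the
    job reported as exported, e.g. because of re-run map tasks.
    """
    exported = re.search(r"Exported (\d+) records", log)
    migrated = re.search(r"Migrated (\d+) records", log)
    if exported is None or migrated is None:
        raise ValueError("log does not contain both Exported and Migrated lines")
    return int(migrated.group(1)) - int(exported.group(1))


if __name__ == "__main__":
    print(surplus_rows(LOG))
```

A non-zero result on a run with failed map attempts would be consistent with the duplicate-row behaviour described in the thread.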
On Fri, Sep 14, 2012 at 1:12 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
> Hi Adarsh,
> it seems like a bug to me. Would you mind creating a JIRA issue for that?
>
> Jarcec
>
> On Fri, Sep 14, 2012 at 09:57:45AM +0530, Adarsh Sharma wrote:
> > Sure, please have a look at the command below:
> >
> > bin/sqoop job -- export --connect jdbc:postgresql://localhost/dbname
> > --export-dir /data/data.2012-09-08-00.csv --staging-table daily_tmp
> > --clear-staging-table --verbose --table daily --username abc --password
> > abc --input-fields-terminated-by '^A'
> >
> > I am also attaching the output of the job, but the lines below explain everything:
> > 12/09/14 04:13:52 INFO mapreduce.ExportJobBase: Transferred 396.1099 MB in 237.2008 seconds (1.6699 MB/sec)
> > 12/09/14 04:13:52 INFO mapreduce.ExportJobBase: Exported 4071315 records.
> > 12/09/14 04:13:52 INFO mapreduce.ExportJobBase: Starting to migrate data from staging table to destination.
> > 12/09/14 04:14:29 INFO manager.SqlManager: Migrated 5321315 records from daily_tmp to daily
> >
> > Total records in CSV: 4071315, records inserted: 5321315 (due to failed
> > and re-run map tasks)
> > 12/09/14 04:11:57 INFO mapred.JobClient: map 79% reduce 0%
> > 12/09/14 04:12:00 INFO mapred.JobClient: map 80% reduce 0% (Now map fails)
> > 12/09/14 04:12:01 INFO mapred.JobClient: map 75% reduce 0%
> > 12/09/14 04:12:12 INFO mapred.JobClient: map 76% reduce 0%
> >
> > Please let me know if any other information is required.
> >
> > Thanks
> >
> > On Thu, Sep 13, 2012 at 10:12 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Adarsh, can you re-run with the --verbose option enabled? Also,
> > > please paste in the entire Sqoop command used.
> > >
> > > Thanks, Kathleen
> > >
> > > On Thu, Sep 13, 2012 at 7:53 AM, Adarsh Sharma <[EMAIL PROTECTED]> wrote:
> > > > Hi all,
> > > >
> > > > I am using sqoop-1.4.2 with Cloudera Hadoop and doing some testing. We
> > > > need to export some tables from CSVs in HDFS.
> > > > Sqoop provides a staging-table mechanism so that data is written to the
> > > > main tables only if all map tasks succeed.
> > > >
> > > > While executing a Sqoop job on Hadoop, suppose a map task fails and
> > > > Hadoop re-attempts it, finishing after 3 attempts. This results in
> > > > duplicate records in the staging table: the job finishes, but more
> > > > rows are inserted than the CSVs contain. Below is the output:
> > > >
> > > > 12/09/13 14:46:55 INFO mapreduce.ExportJobBase: Exported 4071315 records.
> > > > 12/09/13 14:46:55 INFO mapreduce.ExportJobBase: Starting to migrate data from staging table to destination.
> > > > 12/09/13 14:47:29 INFO manager.SqlManager: Migrated 5391315 records from table1_tmp to table
> > > >
> > > > Is this a bug in Sqoop, and is there any fix or patch for it? Please
> > > > let me know.
> > > >
> > > >
> > > > Thanks
> > >
>
> > adarsh@1002:~/sqoop-1.4.2.bin__hadoop-0.20$ bin/sqoop j export --connect jdbc:postgresql://localhost/dbname --export-dir /data/data.2012-09-08-00.csv --staging-table daily_tmp --clear-staging-table --verbose --table daily --username abc --password abc --input-fields-terminated-by '^A'
> > 12/09/14 04:09:53 DEBUG tool.BaseSqoopTool: Enabled debug logging.
> > 12/09/14 04:09:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> > 12/09/14 04:09:53 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
> > 12/09/14 04:09:53 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
> > 12/09/14 04:09:53 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:postgresql:
> > 12/09/14 04:09:53 INFO manager.SqlManager: Using default fetchSize of 1000
> > 12/09/14 04:09:53 DEBUG sqoop.ConnFactory: Instantiated ConnManager