|
|
-
sqoop exporting from hdfs to postgresql, using --staging-table option
Weihua Zhu 2012-01-23, 23:10
Hi Guys,
Good afternoon! I have a question. I was trying to sqoop exporting from hdfs to postgresql, using --staging-table options due to transactions consideration. But it gives me error below. I am wondering if the staging_able is supported for GenericJdbcManager? if not, what kind of manager should I use? Thanks very much!
-Weihua
error message:
12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not support staging of data for export. Please retry without specifying the --staging-table option.
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Kathleen Ting 2012-01-24, 01:26
Hi Weihua - Unfortunately, the generic jdbc manager does not support staging. As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431 on your behalf. Regards, Kathleen On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: > Hi Guys, > > Good afternoon! > I have a question. I was trying to sqoop exporting from hdfs to > postgresql, using --staging-table options due to transactions > consideration. But it gives me error below. > I am wondering if the staging_able is supported for GenericJdbcManager? > if not, what kind of manager should I use? > Thanks very much! > > -Weihua > > error message: > > 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active > connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not > support staging of data for export. Please retry without specifying the > --staging-table option. > > >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Weihua Zhu 2012-01-25, 01:13
Hi Kathleen, Thanks very much for the help... Regards, -Weihua On Jan 23, 2012, at 5:26 PM, Kathleen Ting wrote: Hi Weihua - Unfortunately, the generic jdbc manager does not support staging. As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431 on your behalf. Regards, Kathleen On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: Hi Guys, Good afternoon! I have a question. I was trying to sqoop exporting from hdfs to postgresql, using --staging-table options due to transactions consideration. But it gives me error below. I am wondering if the staging_able is supported for GenericJdbcManager? if not, what kind of manager should I use? Thanks very much! -Weihua error message: 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not support staging of data for export. Please retry without specifying the --staging-table option.
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Srinivas Surasani 2012-01-25, 23:36
Hi Kathleen, Same issue with Teradata. Srinivas -- On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: > Hi Weihua - > > Unfortunately, the generic jdbc manager does not support staging. > As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf. > > Regards, Kathleen > > > On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: > >> Hi Guys, >> >> Good afternoon! >> I have a question. I was trying to sqoop exporting from hdfs to >> postgresql, using --staging-table options due to transactions >> consideration. But it gives me error below. >> I am wondering if the staging_able is supported for GenericJdbcManager? >> if not, what kind of manager should I use? >> Thanks very much! >> >> -Weihua >> >> error message: >> >> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active >> connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not >> support staging of data for export. Please retry without specifying the >> --staging-table option. >> >> >> >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Kathleen Ting 2012-01-26, 01:01
Srinivas, as it happens, the Cloudera Connector for Teradata supports staging tables. It is freely available here: https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download. Regards, Kathleen On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <[EMAIL PROTECTED]> wrote: > Hi Kathleen, > > Same issue with Teradata. > > > Srinivas -- > > > On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: > >> Hi Weihua - >> >> Unfortunately, the generic jdbc manager does not support staging. >> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf. >> >> Regards, Kathleen >> >> >> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: >> >>> Hi Guys, >>> >>> Good afternoon! >>> I have a question. I was trying to sqoop exporting from hdfs to >>> postgresql, using --staging-table options due to transactions >>> consideration. But it gives me error below. >>> I am wondering if the staging_able is supported for >>> GenericJdbcManager? if not, what kind of manager should I use? >>> Thanks very much! >>> >>> -Weihua >>> >>> error message: >>> >>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active >>> connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not >>> support staging of data for export. Please retry without specifying the >>> --staging-table option. >>> >>> >>> >> >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Srinivas Surasani 2012-01-26, 01:39
Yes, I looked into it. Thanks for the info. Regards, Srinivas -- On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: > Srinivas, as it happens, the Cloudera Connector for Teradata supports > staging tables. It is freely available here: > https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download> . > > Regards, Kathleen > > On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <[EMAIL PROTECTED]>wrote: > >> Hi Kathleen, >> >> Same issue with Teradata. >> >> >> Srinivas -- >> >> >> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: >> >>> Hi Weihua - >>> >>> Unfortunately, the generic jdbc manager does not support staging. >>> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf. >>> >>> Regards, Kathleen >>> >>> >>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: >>> >>>> Hi Guys, >>>> >>>> Good afternoon! >>>> I have a question. I was trying to sqoop exporting from hdfs to >>>> postgresql, using --staging-table options due to transactions >>>> consideration. But it gives me error below. >>>> I am wondering if the staging_able is supported for >>>> GenericJdbcManager? if not, what kind of manager should I use? >>>> Thanks very much! >>>> >>>> -Weihua >>>> >>>> error message: >>>> >>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The >>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager) >>>> does not support staging of data for export. Please retry without >>>> specifying the --staging-table option. >>>> >>>> >>>> >>> >> >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Srinivas Surasani 2012-01-26, 01:43
Kathleen, Any information on below request. Hi All, I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. I'm trying to export csv files from HDFS to Teradata, it works well with setting mapper to "1" ( with batch loading of 1000 records at a time ). when I tried increasing the number of mappers to more than one I'm getting the following error. Also, is it possible to configure batch size at the time of export ( from the command line)?? sqoop export --verbose --driver com.teradata.jdbc.TeraDriver --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx --password xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt --fields-terminated-by '|' -m 4 --batch 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : attempt_201112211106_68553_m_000001_2, Status : FAILED *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database] [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed due to DeadLock*. Srinivas -- On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: > Srinivas, as it happens, the Cloudera Connector for Teradata supports > staging tables. It is freely available here: > https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download> . > > Regards, Kathleen > > On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <[EMAIL PROTECTED]>wrote: > >> Hi Kathleen, >> >> Same issue with Teradata. >> >> >> Srinivas -- >> >> >> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: >> >>> Hi Weihua - >>> >>> Unfortunately, the generic jdbc manager does not support staging. >>> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf. >>> >>> Regards, Kathleen >>> >>> >>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: >>> >>>> Hi Guys, >>>> >>>> Good afternoon! >>>> I have a question. I was trying to sqoop exporting from hdfs to >>>> postgresql, using --staging-table options due to transactions >>>> consideration. But it gives me error below. >>>> I am wondering if the staging_able is supported for >>>> GenericJdbcManager? if not, what kind of manager should I use? >>>> Thanks very much! >>>> >>>> -Weihua >>>> >>>> error message: >>>> >>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The >>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager) >>>> does not support staging of data for export. Please retry without >>>> specifying the --staging-table option. >>>> >>>> >>>> >>> >> >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
abhijeet gaikwad 2012-01-28, 15:12
Hi Srinivas, Haven't played with Teradata Connector, but in general there are two properties that drive insertions (SQOOP Export) in a table - namely: 1. "sqoop.export.records.per.statement" : This property is used to specify the number of records/rows to be inserted using a single INSERT statement. Default value is 100. 2. "sqoop.export.statements.per.transaction" : This property is used to specify the number the insert statements before a commit is fired - which you can call batch size. Default value is 100. You can use -D hadoop argument to specify these properties at command line. E.g. -Dsqoop.export.statements.per.transaction=50 NOTE: Make sure you use this argument(-D) before using any of the SQOOP tool specific arguments. See SQOOP User Guide for more details. Thanks, Abhijeet Gaikwad On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote: > Kathleen, > > Any information on below request. > > Hi All, > > I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. > > I'm trying to export csv files from HDFS to Teradata, it works well with > setting mapper to "1" ( with batch loading of 1000 records at a time ). > when I tried increasing the number of mappers to more than one I'm getting > the following error. Also, is it possible to configure batch size at the > time of export ( from the command line)?? > > > sqoop export --verbose --driver com.teradata.jdbc.TeraDriver > --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx --password > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt > --fields-terminated-by '|' -m 4 --batch > > 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : > attempt_201112211106_68553_m_000001_2, Status : FAILED > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database] > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed > due to DeadLock*. > > Srinivas -- > > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED]>wrote: > >> Srinivas, as it happens, the Cloudera Connector for Teradata supports >> staging tables. It is freely available here: >> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download>> . >> >> Regards, Kathleen >> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani >> <[EMAIL PROTECTED]>wrote: >> >>> Hi Kathleen, >>> >>> Same issue with Teradata. >>> >>> >>> Srinivas -- >>> >>> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting >>> <[EMAIL PROTECTED]>wrote: >>> >>>> Hi Weihua - >>>> >>>> Unfortunately, the generic jdbc manager does not support staging. >>>> As a result, I've filed >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf. >>>> >>>> Regards, Kathleen >>>> >>>> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote: >>>> >>>>> Hi Guys, >>>>> >>>>> Good afternoon! >>>>> I have a question. I was trying to sqoop exporting from hdfs to >>>>> postgresql, using --staging-table options due to transactions >>>>> consideration. But it gives me error below. >>>>> I am wondering if the staging_able is supported for >>>>> GenericJdbcManager? if not, what kind of manager should I use? >>>>> Thanks very much! >>>>> >>>>> -Weihua >>>>> >>>>> error message: >>>>> >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The >>>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager) >>>>> does not support staging of data for export. Please retry without >>>>> specifying the --staging-table option. >>>>> >>>>> >>>>> >>>> >>> >> >
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Srinivas Surasani 2012-01-28, 16:46
Hi Abhijeet --, Thanks for the information. I have one more question. Is the exports is done always with one mapper? ( entering into table deadlocks if number of mappers set to more than one ). Also, FYI: I have observed the default number of rows inserted is 1000. Thanks, -- Srinvas On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <[EMAIL PROTECTED]>wrote: > Hi Srinivas, > Haven't played with Teradata Connector, but in general there are two > properties that drive insertions (SQOOP Export) in a table - namely: > > 1. "sqoop.export.records.per.statement" : This property is used to > specify the number of records/rows to be inserted using a single > INSERT statement. Default value is 100. > 2. "sqoop.export.statements.per.transaction" : This property is used > to specify the number the insert statements before a commit is fired - > which you can call batch size. Default value is 100. > > You can use -D hadoop argument to specify these properties at command > line. E.g. -Dsqoop.export.statements.per.transaction=50 > > NOTE: Make sure you use this argument(-D) before using any of the > SQOOP tool specific arguments. See SQOOP User Guide for more details. > > Thanks, > Abhijeet Gaikwad > > On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote: > > Kathleen, > > > > Any information on below request. > > > > Hi All, > > > > I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. > > > > I'm trying to export csv files from HDFS to Teradata, it works well with > > setting mapper to "1" ( with batch loading of 1000 records at a time ). > > when I tried increasing the number of mappers to more than one I'm > getting > > the following error. Also, is it possible to configure batch size at the > > time of export ( from the command line)?? > > > > > > sqoop export --verbose --driver com.teradata.jdbc.TeraDriver > > --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx --password > > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt > > --fields-terminated-by '|' -m 4 --batch > > > > 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% > > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : > > attempt_201112211106_68553_m_000001_2, Status : FAILED > > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database] > > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed > > due to DeadLock*. > > > > Srinivas -- > > > > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED] > >wrote: > > > >> Srinivas, as it happens, the Cloudera Connector for Teradata supports > >> staging tables. It is freely available here: > >> > https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download> >> . > >> > >> Regards, Kathleen > >> > >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani > >> <[EMAIL PROTECTED]>wrote: > >> > >>> Hi Kathleen, > >>> > >>> Same issue with Teradata. > >>> > >>> > >>> Srinivas -- > >>> > >>> > >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting > >>> <[EMAIL PROTECTED]>wrote: > >>> > >>>> Hi Weihua - > >>>> > >>>> Unfortunately, the generic jdbc manager does not support staging. > >>>> As a result, I've filed > >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf. > >>>> > >>>> Regards, Kathleen > >>>> > >>>> > >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> > wrote: > >>>> > >>>>> Hi Guys, > >>>>> > >>>>> Good afternoon! > >>>>> I have a question. I was trying to sqoop exporting from hdfs to > >>>>> postgresql, using --staging-table options due to transactions > >>>>> consideration. But it gives me error below. > >>>>> I am wondering if the staging_able is supported for > >>>>> GenericJdbcManager? if not, what kind of manager should I use? > >>>>> Thanks very much! > >>>>> > >>>>> -Weihua > >>>>> > >>>>> error message: > >>>>> > >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The > >>>>> active connection manager > (org.apache.sqoop.manager.GenericJdbcManager) > >>>>> does not support staging of data for export. Please retry without
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
abhijeet gaikwad 2012-01-28, 17:47
Hi Srinivas, Export with multiple mappers is allowed in SQOOP. I have exported data into Sql Server as well as MySql using multiple mappers. Regarding the issue you are facing, I have few questions: 1. How did you set the batch size, 1000 you are talking about. 2. Can you share SQOOP logs in detail? 3. Deadlock issue seems to have raised by Teradata. There is an equal probability that a Teradata admin will be able to resolve this issue. Thanks, Abhijeet Gaikwad On 28 Jan 2012 22:16, "Srinivas Surasani" <[EMAIL PROTECTED]> wrote: > Hi Abhijeet --, > > Thanks for the information. I have one more question. Is the exports is > done always with one mapper? ( entering into table deadlocks if number of > mappers set to more than one ). > Also, FYI: I have observed the default number of rows inserted is 1000. > > Thanks, > -- Srinvas > > On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <[EMAIL PROTECTED] > > wrote: > >> Hi Srinivas, >> Haven't played with Teradata Connector, but in general there are two >> properties that drive insertions (SQOOP Export) in a table - namely: >> >> 1. "sqoop.export.records.per.statement" : This property is used to >> specify the number of records/rows to be inserted using a single >> INSERT statement. Default value is 100. >> 2. "sqoop.export.statements.per.transaction" : This property is used >> to specify the number the insert statements before a commit is fired - >> which you can call batch size. Default value is 100. >> >> You can use -D hadoop argument to specify these properties at command >> line. E.g. -Dsqoop.export.statements.per.transaction=50 >> >> NOTE: Make sure you use this argument(-D) before using any of the >> SQOOP tool specific arguments. See SQOOP User Guide for more details. >> >> Thanks, >> Abhijeet Gaikwad >> >> On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote: >> > Kathleen, >> > >> > Any information on below request. >> > >> > Hi All, >> > >> > I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. >> > >> > I'm trying to export csv files from HDFS to Teradata, it works well with >> > setting mapper to "1" ( with batch loading of 1000 records at a time ). >> > when I tried increasing the number of mappers to more than one I'm >> getting >> > the following error. Also, is it possible to configure batch size at the >> > time of export ( from the command line)?? >> > >> > >> > sqoop export --verbose --driver com.teradata.jdbc.TeraDriver >> > --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx >> --password >> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt >> > --fields-terminated-by '|' -m 4 --batch >> > >> > 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% >> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : >> > attempt_201112211106_68553_m_000001_2, Status : FAILED >> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database] >> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed >> > due to DeadLock*. >> > >> > Srinivas -- >> > >> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED] >> >wrote: >> > >> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports >> >> staging tables. It is freely available here: >> >> >> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download>> >> . >> >> >> >> Regards, Kathleen >> >> >> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani >> >> <[EMAIL PROTECTED]>wrote: >> >> >> >>> Hi Kathleen, >> >>> >> >>> Same issue with Teradata. >> >>> >> >>> >> >>> Srinivas -- >> >>> >> >>> >> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting >> >>> <[EMAIL PROTECTED]>wrote: >> >>> >> >>>> Hi Weihua - >> >>>> >> >>>> Unfortunately, the generic jdbc manager does not support staging. >> >>>> As a result, I've filed >> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf. >> >>>> >> >>>> Regards, Kathleen >> >>>> >> >>>> >> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> >> wrote:
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Srinivas Surasani 2012-01-28, 17:58
Abhijeet -- 1) By default it is loading 1000 records and as you mentioned earlier this can be tweaked using "sqoop.export.records.per.statement ". 2) I just run select count(*) on teradata table and seen 1000 records inserted at one go. 3) I believe setting number of mappers > 1 works for only non-parallel databases. -- Srinivas On Sat, Jan 28, 2012 at 12:47 PM, abhijeet gaikwad <[EMAIL PROTECTED]>wrote: > Hi Srinivas, > Export with multiple mappers is allowed in SQOOP. I have exported data > into Sql Server as well as MySql using multiple mappers. > > Regarding the issue you are facing, I have few questions: > 1. How did you set the batch size, 1000 you are talking about. > 2. Can you share SQOOP logs in detail? > 3. Deadlock issue seems to have raised by Teradata. There is an equal > probability that a Teradata admin will be able to resolve this issue. > > Thanks, > Abhijeet Gaikwad > On 28 Jan 2012 22:16, "Srinivas Surasani" <[EMAIL PROTECTED]> wrote: > >> Hi Abhijeet --, >> >> Thanks for the information. I have one more question. Is the exports is >> done always with one mapper? ( entering into table deadlocks if number of >> mappers set to more than one ). >> Also, FYI: I have observed the default number of rows inserted is 1000. >> >> Thanks, >> -- Srinvas >> >> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad < >> [EMAIL PROTECTED]> wrote: >> >>> Hi Srinivas, >>> Haven't played with Teradata Connector, but in general there are two >>> properties that drive insertions (SQOOP Export) in a table - namely: >>> >>> 1. "sqoop.export.records.per.statement" : This property is used to >>> specify the number of records/rows to be inserted using a single >>> INSERT statement. Default value is 100. >>> 2. "sqoop.export.statements.per.transaction" : This property is used >>> to specify the number the insert statements before a commit is fired - >>> which you can call batch size. Default value is 100. >>> >>> You can use -D hadoop argument to specify these properties at command >>> line. E.g. -Dsqoop.export.statements.per.transaction=50 >>> >>> NOTE: Make sure you use this argument(-D) before using any of the >>> SQOOP tool specific arguments. See SQOOP User Guide for more details. >>> >>> Thanks, >>> Abhijeet Gaikwad >>> >>> On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote: >>> > Kathleen, >>> > >>> > Any information on below request. >>> > >>> > Hi All, >>> > >>> > I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. >>> > >>> > I'm trying to export csv files from HDFS to Teradata, it works well >>> with >>> > setting mapper to "1" ( with batch loading of 1000 records at a time ). >>> > when I tried increasing the number of mappers to more than one I'm >>> getting >>> > the following error. Also, is it possible to configure batch size at >>> the >>> > time of export ( from the command line)?? >>> > >>> > >>> > sqoop export --verbose --driver com.teradata.jdbc.TeraDriver >>> > --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx >>> --password >>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt >>> > --fields-terminated-by '|' -m 4 --batch >>> > >>> > 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% >>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : >>> > attempt_201112211106_68553_m_000001_2, Status : FAILED >>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata >>> Database] >>> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction >>> ABORTed >>> > due to DeadLock*. >>> > >>> > Srinivas -- >>> > >>> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED] >>> >wrote: >>> > >>> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports >>> >> staging tables. It is freely available here: >>> >> >>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download>>> >> . >>> >> >>> >> Regards, Kathleen >>> >> >>> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani >>> >> <[EMAIL PROTECTED]>wrote: >>
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
abhijeet gaikwad 2012-01-30, 07:03
As to what I understanf: It is loading 1000 records because there are 1000 records in your input file. We cannot relate this to batch size because firing a select count(*) query will give you the total records inserted in the table - not the batches in which they were inserted.
Cannot comment on the last statement (#3) in your mail below - specially in context with Teradata. Open for discussion :)
Thanks, Abhijeet Gaikwad
On Sat, Jan 28, 2012 at 11:28 PM, Srinivas Surasani <[EMAIL PROTECTED]>wrote:
> > Abhijeet -- > > 1) By default it is loading 1000 records and as you mentioned earlier this > can be tweaked using "sqoop.export.records.per.statement ". > 2) I just run select count(*) on teradata table and seen 1000 records > inserted at one go. > 3) I believe setting number of mappers > 1 works for only non-parallel > databases. > > -- Srinivas > > > On Sat, Jan 28, 2012 at 12:47 PM, abhijeet gaikwad <[EMAIL PROTECTED] > > wrote: > >> Hi Srinivas, >> Export with multiple mappers is allowed in SQOOP. I have exported data >> into Sql Server as well as MySql using multiple mappers. >> >> Regarding the issue you are facing, I have few questions: >> 1. How did you set the batch size, 1000 you are talking about. >> 2. Can you share SQOOP logs in detail? >> 3. Deadlock issue seems to have raised by Teradata. There is an equal >> probability that a Teradata admin will be able to resolve this issue. >> >> Thanks, >> Abhijeet Gaikwad >> On 28 Jan 2012 22:16, "Srinivas Surasani" <[EMAIL PROTECTED]> wrote: >> >>> Hi Abhijeet --, >>> >>> Thanks for the information. I have one more question. Is the exports is >>> done always with one mapper? ( entering into table deadlocks if number of >>> mappers set to more than one ). >>> Also, FYI: I have observed the default number of rows inserted is 1000. >>> >>> Thanks, >>> -- Srinvas >>> >>> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad < >>> [EMAIL PROTECTED]> wrote: >>> >>>> Hi Srinivas, >>>> Haven't played with Teradata Connector, but in general there are two >>>> properties that drive insertions (SQOOP Export) in a table - namely: >>>> >>>> 1. "sqoop.export.records.per.statement" : This property is used to >>>> specify the number of records/rows to be inserted using a single >>>> INSERT statement. Default value is 100. >>>> 2. "sqoop.export.statements.per.transaction" : This property is used >>>> to specify the number the insert statements before a commit is fired - >>>> which you can call batch size. Default value is 100. >>>> >>>> You can use -D hadoop argument to specify these properties at command >>>> line. E.g. -Dsqoop.export.statements.per.transaction=50 >>>> >>>> NOTE: Make sure you use this argument(-D) before using any of the >>>> SQOOP tool specific arguments. See SQOOP User Guide for more details. >>>> >>>> Thanks, >>>> Abhijeet Gaikwad >>>> >>>> On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote: >>>> > Kathleen, >>>> > >>>> > Any information on below request. >>>> > >>>> > Hi All, >>>> > >>>> > I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2. >>>> > >>>> > I'm trying to export csv files from HDFS to Teradata, it works well >>>> with >>>> > setting mapper to "1" ( with batch loading of 1000 records at a time >>>> ). >>>> > when I tried increasing the number of mappers to more than one I'm >>>> getting >>>> > the following error. Also, is it possible to configure batch size at >>>> the >>>> > time of export ( from the command line)?? >>>> > >>>> > >>>> > sqoop export --verbose --driver com.teradata.jdbc.TeraDriver >>>> > --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx >>>> --password >>>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt >>>> > --fields-terminated-by '|' -m 4 --batch >>>> > >>>> > 12/01/24 16:17:21 INFO mapred.JobClient: map 3% reduce 0% >>>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id : >>>> > attempt_201112211106_68553_m_000001_2, Status : FAILED >>>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata
|
|