Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Sqoop >> mail # user >> sqoop exporting from hdfs to postgresql, using --staging-table option


+
Weihua Zhu 2012-01-23, 23:10
+
Kathleen Ting 2012-01-24, 01:26
+
Weihua Zhu 2012-01-25, 01:13
+
Srinivas Surasani 2012-01-25, 23:36
+
Kathleen Ting 2012-01-26, 01:01
+
Srinivas Surasani 2012-01-26, 01:39
+
Srinivas Surasani 2012-01-26, 01:43
+
abhijeet gaikwad 2012-01-28, 15:12
+
Srinivas Surasani 2012-01-28, 16:46
Copy link to this message
-
Re: sqoop exporting from hdfs to postgresql, using --staging-table option
Hi Srinivas,
Export with multiple mappers is allowed in SQOOP. I have exported data into
Sql Server as well as MySql using multiple mappers.

Regarding the issue you are facing, I have few questions:
1. How did you set the batch size, 1000 you are talking about.
2. Can you share SQOOP logs in detail?
3. Deadlock issue seems to have raised by Teradata. There is an equal
probability that a Teradata admin will be able to resolve this issue.

Thanks,
Abhijeet Gaikwad
On 28 Jan 2012 22:16, "Srinivas Surasani" <[EMAIL PROTECTED]> wrote:

> Hi Abhijeet --,
>
>  Thanks for the information. I have one more question. Is the exports is
> done always with one mapper? ( entering into table deadlocks if number of
> mappers set to more than one ).
> Also, FYI: I have observed the default number of rows inserted is 1000.
>
> Thanks,
> -- Srinvas
>
> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <[EMAIL PROTECTED]
> > wrote:
>
>> Hi Srinivas,
>> Haven't played with Teradata Connector, but in general there are two
>> properties that drive insertions (SQOOP Export) in a table - namely:
>>
>> 1. "sqoop.export.records.per.statement" : This property is used to
>> specify the number of records/rows to be inserted using a single
>> INSERT statement. Default value is 100.
>> 2. "sqoop.export.statements.per.transaction" : This property is used
>> to specify the number the insert statements before a commit is fired -
>> which you can call batch size. Default value is 100.
>>
>> You can use -D hadoop argument to specify these properties at command
>> line. E.g. -Dsqoop.export.statements.per.transaction=50
>>
>> NOTE: Make sure you use this argument(-D) before using any of the
>> SQOOP tool specific arguments. See SQOOP User Guide for more details.
>>
>> Thanks,
>> Abhijeet Gaikwad
>>
>> On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote:
>> > Kathleen,
>> >
>> > Any information on below request.
>> >
>> > Hi All,
>> >
>> > I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
>> >
>> > I'm trying to export csv files from HDFS to Teradata, it works well with
>> > setting mapper to "1" ( with batch loading of 1000 records at a time ).
>> > when I tried increasing the number of mappers to more than one I'm
>> getting
>> > the following error. Also, is it possible to configure batch size at the
>> > time of export ( from the command line)??
>> >
>> >
>> >  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
>> > --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx
>> --password
>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
>> > --fields-terminated-by '|' -m 4 --batch
>> >
>> > 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
>> > attempt_201112211106_68553_m_000001_2, Status : FAILED
>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
>> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
>> > due to DeadLock*.
>> >
>> > Srinivas --
>> >
>> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED]
>> >wrote:
>> >
>> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>> >> staging tables. It is freely available here:
>> >>
>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
>> >> .
>> >>
>> >> Regards, Kathleen
>> >>
>> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>> >> <[EMAIL PROTECTED]>wrote:
>> >>
>> >>> Hi Kathleen,
>> >>>
>> >>> Same issue with Teradata.
>> >>>
>> >>>
>> >>> Srinivas --
>> >>>
>> >>>
>> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>> >>> <[EMAIL PROTECTED]>wrote:
>> >>>
>> >>>> Hi Weihua -
>> >>>>
>> >>>> Unfortunately, the generic jdbc manager does not support staging.
>> >>>> As a result, I've filed
>> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>> >>>>
>> >>>> Regards, Kathleen
>> >>>>
>> >>>>
>> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]>
>> wrote:
+
Srinivas Surasani 2012-01-28, 17:58
+
abhijeet gaikwad 2012-01-30, 07:03