Sqoop, mail # user - sqoop exporting from hdfs to postgresql, using --staging-table option


Re: sqoop exporting from hdfs to postgresql, using --staging-table option
abhijeet gaikwad 2012-01-28, 15:12
Hi Srinivas,
Haven't played with the Teradata Connector, but in general there are two
properties that drive insertions (Sqoop export) into a table:

1. "sqoop.export.records.per.statement" : This property is used to
specify the number of records/rows to be inserted using a single
INSERT statement. Default value is 100.
2. "sqoop.export.statements.per.transaction" : This property is used
to specify the number the insert statements before a commit is fired -
which you can call batch size. Default value is 100.

You can use Hadoop's -D argument to set these properties on the
command line. E.g. -Dsqoop.export.statements.per.transaction=50

NOTE: Make sure the -D argument comes before any of the Sqoop
tool-specific arguments. See the Sqoop User Guide for more details.
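
For example, a full invocation might look like the sketch below (the
connection string, credentials, table, and paths are placeholders, not
a tested setup):

  # all identifiers below are placeholders
  sqoop export \
    -Dsqoop.export.records.per.statement=100 \
    -Dsqoop.export.statements.per.transaction=50 \
    --connect jdbc:teradata://host/database=mydb \
    --username myuser --password mypass \
    --table mytable \
    --export-dir /user/me/data

With those values each INSERT carries 100 rows and a commit is issued
after every 50 statements, so a single transaction covers 5,000 rows.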

Thanks,
Abhijeet Gaikwad

On 1/26/12, Srinivas Surasani <[EMAIL PROTECTED]> wrote:
> Kathleen,
>
> Any information on below request.
>
> Hi All,
>
> I'm working on Hadoop CDH3 U0 and Sqoop CDH3 U2.
>
> I'm trying to export CSV files from HDFS to Teradata. It works well with
> the number of mappers set to 1 (batch loading 1000 records at a time),
> but when I increase the number of mappers beyond one I get the following
> error. Also, is it possible to configure the batch size at export time
> (from the command line)?
>
>
>   sqoop export --verbose --driver com.teradata.jdbc.TeraDriver \
>     --connect jdbc:teradata://xxxx/database=xxxx --username xxxxx \
>     --password xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt \
>     --fields-terminated-by '|' -m 4 --batch
>
> 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
> 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
> attempt_201112211106_68553_m_000001_2, Status : FAILED
> java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
> [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
> due to DeadLock.
>
> Srinivas --
>
> On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
>
>> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>> staging tables. It is freely available here:
>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
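>>
>> With a staging-capable manager in place, the export would look roughly
>> like the sketch below (table names and connection details are
>> placeholders, and any connector-specific arguments are omitted):
>>
>>   # all identifiers below are placeholders
>>   sqoop export \
>>     --connect jdbc:teradata://host/database=mydb \
>>     --username myuser --password mypass \
>>     --table mytable \
>>     --staging-table mytable_stage \
>>     --clear-staging-table \
>>     --export-dir /user/me/data
>>
>> Sqoop loads the data into mytable_stage first and then moves it to
>> mytable in a single transaction; --clear-staging-table empties the
>> staging table before the export begins.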
>>
>> Regards, Kathleen
>>
>> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Kathleen,
>>>
>>> Same issue with Teradata.
>>>
>>>
>>> Srinivas --
>>>
>>>
>>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>>> <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Weihua -
>>>>
>>>> Unfortunately, the generic jdbc manager does not support staging.
>>>> As a result, I've filed
>>>> https://issues.apache.org/jira/browse/SQOOP-431 on your behalf.
>>>>
>>>> Regards, Kathleen
>>>>
>>>>
>>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>>   Good afternoon!
>>>>>   I have a question. I was trying to run a Sqoop export from HDFS to
>>>>> PostgreSQL using the --staging-table option, for transaction safety,
>>>>> but it gives me the error below.
>>>>>   I am wondering: are staging tables supported by
>>>>> GenericJdbcManager? If not, what kind of manager should I use?
>>>>>   Thanks very much!
>>>>>
>>>>>  -Weihua
>>>>>
>>>>> error message:
>>>>>
>>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager)
>>>>> does not support staging of data for export. Please retry without
>>>>> specifying the --staging-table option.
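>>>>>
>>>>> (For context: a command of roughly the following shape triggers that
>>>>> check - the details are placeholders, not my exact command.
>>>>> Specifying --driver without --connection-manager is one common way
>>>>> to end up on GenericJdbcManager.)
>>>>>
>>>>>   # all identifiers below are placeholders
>>>>>   sqoop export \
>>>>>     --driver org.postgresql.Driver \
>>>>>     --connect jdbc:postgresql://host/db \
>>>>>     --username myuser --password mypass \
>>>>>     --table mytable \
>>>>>     --staging-table mytable_stage \
>>>>>     --export-dir /user/me/data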
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>