sqoop question - could not post - the message came back undelivered
Chalcy Raja 2012-03-29, 13:46
I am trying to do a Sqoop export (data from HDFS to a database). The table I am trying to export has 2 million rows and 20 fields. The sqoop command succeeds when I export anywhere from 10 to 95 rows. When I try anything more than 95 rows, the export fails with the following error.
By googling I gather that this is a DBMS limitation. Is there any way to configure around this error? I am surprised that it works for a few rows. Any help is appreciated.

Thanks,
Chalcy

12/03/29 09:00:59 INFO mapred.JobClient: Task Id : attempt_201203230811_0539_m_000000_0, Status : FAILED
java.io.IOException: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
    at com.cloudera.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:189)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect.
12/03/29 09:01:05 INFO mapred.JobClient: Task Id : attempt_201203230811_0539_m_000000_1, Status : FAILED
java.io.IOException: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
    at com.cloudera.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:189)

-----Original Message-----
From: Thiruvel Thirumoolan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 29, 2012 7:55 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Executing query and storing output on HDFS

This should help.
https://cwiki.apache.org/Hive/languagemanual-dml.html#LanguageManualDML-Writingdataintofilesystemfromqueries

On 3/29/12 4:48 PM, "Paul Ingles" <[EMAIL PROTECTED]> wrote:

>Hi,
>
>I'd like to be able to execute a Hive query and for the output to be
>stored in a path on HDFS (rather than immediately returned by the
>client). Ultimately I'd like to be able to do this to integrate some of
>our Hive statements and Cascading flows.
>
>Does anyone know if this is possible? I could have sworn it was but
>can't find any mention of some OUTPUT TO clause on the Hive Wiki.
>
>Many thanks,
>Paul
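
On the Sqoop export error quoted at the top of this message: the SQL Server message says a single request may carry at most 2100 parameters. Assuming the export batches several rows into one parameterized INSERT with one parameter per column per row, the number of rows per statement has to stay under 2100 divided by the column count. The sketch below is only that arithmetic; the function names are made up for illustration and are not part of Sqoop. The 95-row threshold reported above does not match 2100 / 20 exactly, so the real column count or some driver overhead may differ from what is assumed here.

```python
# Arithmetic sketch only. The 2100 figure comes from the error message above;
# "one parameter per column per row" is an assumption about how the batched
# INSERT is built, and the function names are hypothetical.
SQLSERVER_MAX_PARAMS = 2100


def params_used(num_columns: int, rows_per_statement: int) -> int:
    """Parameters needed by one multi-row INSERT under the stated assumption."""
    return num_columns * rows_per_statement


def max_rows_per_statement(num_columns: int, limit: int = SQLSERVER_MAX_PARAMS) -> int:
    """Largest row count per statement that does not exceed the parameter limit."""
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    return limit // num_columns


if __name__ == "__main__":
    columns = 20  # the table in the question is described as having 20 fields
    rows = max_rows_per_statement(columns)
    print("columns:", columns)
    print("max rows per statement:", rows)
    print("parameters used at that size:", params_used(columns, rows))
```

If the Sqoop version in use supports the `sqoop.export.records.per.statement` property (passed as a `-D` option before the other export arguments), setting it to a value comfortably below the computed maximum may avoid the error. It is worth leaving some headroom, since the driver may use a few parameters of its own.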