Sqoop, mail # user - Intermittent problems with sqoop using Oracle JDBC driver


Re: Intermittent problems with sqoop using Oracle JDBC driver
Jarek Jarcec Cecho 2013-07-15, 16:55
Hi Andre,
thank you for the blog post. Do you think that it would be helpful to put the additional information into the Sqoop Troubleshooting guide?

http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_oracle_connection_reset_errors

Jarcec

On Mon, Jul 15, 2013 at 01:12:27PM +1000, Andre Araujo wrote:
> Thanks, David.
>
> My blog post is pending revision and should be published soon. I'll post
> the final link once it is published.
> For the time being, please see below a copy of it without the formatting.
> What worked for me was a combination of two things:
>
>    - passing the parameter
>      -D mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom"
>      to sqoop
>    - setting the java.security.egd parameter in the HADOOP_OPTS variable,
>      so that it was passed to
>      "${HADOOP_COMMON_HOME}/bin/hadoop org.apache.sqoop.Sqoop"
>
> Regards,
> Andre
>
> -----------------------------------
>
> I’ve been using Sqoop to load data into HDFS from Oracle. I’m using version
> 1.4.3 of Sqoop, running on a Linux machine and using the Oracle JDBC driver
> with JDK 1.6.
>
> I was getting intermittent connection resets when trying to import data.
> After much troubleshooting, I eventually found the problem to be related to
> a known issue with the JDBC driver and found a way to work around it, which
> is described in this post.
>
>
> The problem
>
> I noticed that when I imported data at times when the machine running the
> sqoop client was mostly idle, everything ran just fine. However, when
> others started to work on the same machine and it became a bit busier, I
> would intermittently get the errors below:
>
> [araujo@client ~]$ sqoop import --connect jdbc:oracle:thin:user/pwd@host/orcl
> -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
> 13/07/12 09:35:39 INFO manager.SqlManager: Using default fetchSize of 1000
> 13/07/12 09:35:39 INFO tool.CodeGenTool: Beginning code generation
> 13/07/12 09:37:53 ERROR manager.SqlManager: Error executing statement:
> java.sql.SQLRecoverableException: IO Error: Connection reset
>     at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:467)
>     at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:546)
>     ...
> Caused by: java.net.SocketException: Connection reset
>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>     ... 24 more
>
> After some troubleshooting and googling, I found that the problem seemed to
> be related to the issue described in the following articles:
>
> http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/
> https://forums.oracle.com/message/3701989/
>
> Confirming the problem
>
> To ensure the problem was the same as the one described in the articles,
> and not something else intrinsic to Sqoop, I created a small Java program
> that simply connected to the database. I was able to reproduce the issue
> using it:
>
> [araujo@client TestConn]$ time java TestConn
> Exception in thread "main" java.sql.SQLRecoverableException: IO Error:
> Connection reset
> ...
> Caused by: java.net.SocketException: Connection reset
> ...
> ... 8 more
>
> real 1m20.481s
> user 0m0.491s
> sys 0m0.051s
>
> The workaround suggested in the articles also worked:
>
> [araujo@client TestConn]$ time java -Djava.security.egd=file:/dev/../dev/urandom TestConn
> Connection successful!
>
> real 0m0.419s
> user 0m0.498s
> sys 0m0.036s
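
The TestConn program itself is not included in the post. A minimal sketch of what such a connection test might look like follows. It assumes the Oracle JDBC driver jar is on the classpath, and it reuses the placeholder URL from the sqoop command above; the `connect` helper and the `DEFAULT_URL` constant are hypothetical names chosen for illustration, not the author's code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class TestConn {
    // Placeholder URL (assumption), in the same form as the sqoop --connect argument.
    static final String DEFAULT_URL = "jdbc:oracle:thin:user/pwd@host/orcl";

    // Opens a single connection and closes it again.
    // Any failure (such as "IO Error: Connection reset") propagates to the caller.
    static void connect(String url) throws SQLException {
        Connection conn = DriverManager.getConnection(url);
        try {
            System.out.println("Connection successful!");
        } finally {
            conn.close();
        }
    }

    public static void main(String[] args) throws SQLException {
        // An optional first argument overrides the placeholder URL.
        connect(args.length > 0 ? args[0] : DEFAULT_URL);
    }
}
```

Running it under `time`, with and without `-Djava.security.egd=file:/dev/../dev/urandom`, is one way to compare the two cases shown in the transcripts above.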
>
> Applying the fix to Sqoop
>
> It took me a while to figure out how to use the workaround above with
> Sqoop. Many attempts to specify the parameter on the Sqoop command line,
> in many different forms, didn’t work as expected.
>
> The articles mention that the java.security.egd parameter can be centrally
> set in the $JAVA_HOME/jre/lib/security/java.security file. Unfortunately,
> this didn’t work for me. Using strace, I confirmed that Sqoop was actually