Sqoop >> mail # user >> Intermittent problems with sqoop using Oracle JDBC driver


Andre Araujo 2013-07-12, 05:57
Andre Araujo 2013-07-12, 15:05
Jarek Jarcec Cecho 2013-07-12, 15:35
Andre Araujo 2013-07-15, 00:29
Andre Araujo 2013-07-15, 03:12

Re: Intermittent problems with sqoop using Oracle JDBC driver
Hi Andre,
thank you for the blog post. Do you think that it would be helpful to put the additional information into the Sqoop Troubleshooting guide?

http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_oracle_connection_reset_errors

Jarcec

On Mon, Jul 15, 2013 at 01:12:27PM +1000, Andre Araujo wrote:
> Thanks, David.
>
> My blog post is pending revision and should be published soon. I'll post
> the final link once it is.
> For the time being, please see below a copy of it without the formatting.
> What worked for me was a combination of two things:
>
>    - passing the parameter
>      -D mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom"
>      to sqoop
>    - setting the java.security.egd parameter in the HADOOP_OPTS variable,
>      so that it was passed to
>      "${HADOOP_COMMON_HOME}/bin/hadoop org.apache.sqoop.Sqoop"
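
A minimal sketch of how the two settings listed above might be combined in one shell session. It is not taken from the original post: EGD_OPT is a hypothetical helper variable, the connection string and query are the placeholders used in the example import further down, and placing the generic -D option directly after the tool name is an assumption based on how Sqoop handles generic Hadoop arguments.

```shell
#!/bin/sh
# Sketch only. EGD_OPT is a hypothetical helper variable, not a Sqoop or Hadoop setting.
EGD_OPT="-Djava.security.egd=file:/dev/../dev/urandom"

# 1. Client side: the sqoop launcher runs bin/hadoop, which is expected to add
#    HADOOP_OPTS to the JVM options of the client process.
#    Any value already present in HADOOP_OPTS is preserved.
HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }$EGD_OPT"
export HADOOP_OPTS

# 2. Task side: pass the same JVM option to the map tasks.
#    Generic -D options are placed right after the tool name ("import"),
#    before the tool-specific arguments (assumption, see lead-in).
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    -D mapred.child.java.opts="$EGD_OPT" \
    --connect jdbc:oracle:thin:user/pwd@host/orcl \
    -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
else
  # Dry run when sqoop is not installed: only show what would be exported.
  echo "sqoop not found on PATH; HADOOP_OPTS=$HADOOP_OPTS"
fi
```

Exporting HADOOP_OPTS in the session rather than editing a shared configuration file keeps the change local to the user running the import, which may be preferable on a machine shared with others.
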
>
> Regards,
> Andre
>
> -----------------------------------
>
> I’ve been using Sqoop to load data into HDFS from Oracle. I’m using version
> 1.4.3 of Sqoop, running on a Linux machine and using the Oracle JDBC driver
> with JDK 1.6.
>
> I was getting intermittent connection resets when trying to import data.
> After much troubleshooting, I eventually found the problem to be related to
> a known issue with the JDBC driver and found a way to work around it, which
> is described in this post.
>
>
> The problem
>
> I noticed that when I imported data at times when the machine running the
> sqoop client was mostly idle, everything would run just fine. However, when
> others started to work on the same machine and it became a bit busier, I
> would start to get the errors below intermittently:
>
> [araujo@client ~]$ sqoop import --connect jdbc:oracle:thin:user/pwd@host/orcl
> -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
> 13/07/12 09:35:39 INFO manager.SqlManager: Using default fetchSize of 1000
> 13/07/12 09:35:39 INFO tool.CodeGenTool: Beginning code generation
> 13/07/12 09:37:53 ERROR manager.SqlManager: Error executing statement:
> java.sql.SQLRecoverableException: IO Error: Connection reset
>         at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:467)
>         at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:546)
>         ...
> Caused by: java.net.SocketException: Connection reset
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         ... 24 more
>
> After some troubleshooting and googling, I found that the problem seemed to
> be related to the issue described in the following articles:
>
> http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/
> https://forums.oracle.com/message/3701989/
>
> Confirming the problem
>
> To ensure the problem was the same as the one described in the articles,
> and not something else intrinsic to Sqoop, I created a small Java program
> that simply connected to the database. I was able to reproduce the issue
> using it:
>
> [araujo@client TestConn]$ time java TestConn
> Exception in thread "main" java.sql.SQLRecoverableException: IO Error: Connection reset
> ...
> Caused by: java.net.SocketException: Connection reset
> ...
> ... 8 more
>
> real 1m20.481s
> user 0m0.491s
> sys 0m0.051s
>
> The workaround suggested in the articles also worked:
>
> [araujo@client TestConn]$ time java -Djava.security.egd=file:/dev/../dev/urandom TestConn
> Connection successful!
>
> real 0m0.419s
> user 0m0.498s
> sys 0m0.036s
>
> Applying the fix to Sqoop
>
> It took me a while to figure out how to use the workaround above with
> Sqoop. Many attempts to specify the parameter on the Sqoop command line,
> in many different forms, didn’t work as expected.
>
> The articles mention that the java.security.egd parameter can be centrally
> set in the $JAVA_HOME/jre/lib/security/java.security file. Unfortunately,
> this didn’t work for me. Using strace, I confirmed that Sqoop was actually

Andre Araujo 2013-07-15, 20:47
Jarek Jarcec Cecho 2013-07-15, 23:41
David Robson 2013-07-15, 03:31
Andre Araujo 2013-07-15, 12:34