Sqoop user mailing list: Need help and tips for the following issue: No data gets exported from Hadoop to MySQL using Sqoop.


Matthieu Labour 2012-10-10, 16:30
Jarek Jarcec Cecho 2012-10-10, 16:40
Matthieu Labour 2012-10-10, 17:16
Re: Need help and tips for the following issue: No data gets exported from Hadoop to MySQL using Sqoop.
It would be very helpful if you could send us the task log from one of the map tasks that Sqoop executes.

Shooting blindly here - Sqoop connects to your database directly from the map tasks. Given the connection issues, are you sure that you can connect to your database from all nodes in your cluster?
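One quick way to verify this might be a bash-only TCP probe run on every node that can host a map task (this is a sketch, not from the thread; "hostname" and 3306 below are placeholders for your MySQL host and port):

```shell
# Bash-only TCP probe using the /dev/tcp pseudo-device; no mysql
# client or nc needed on the worker nodes.
check_port() {
  # $1 = host, $2 = port; prints "open" or "closed"
  if timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}

# Run this on each node in the cluster:
check_port hostname 3306
```

If any node prints "closed", the map tasks scheduled there will fail exactly the way shown later in this thread.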

Jarcec

On Wed, Oct 10, 2012 at 01:16:03PM -0400, Matthieu Labour wrote:
> Hi Jarek
>
> Thank you so much for your help.
>
> Following your advice, I ran the following command:
> ~/sqoop-1.4.2.bin__hadoop-1.0.0/bin/sqoop export --connect
> jdbc:mysql://hostname:3306/analyticsdb --username username --password
> password --table ml_ys_log_gmt_test --export-dir
> hdfs:///mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> --input-fields-terminated-by='\t' --lines-terminated-by='\n' --verbose
>
> It seems to find the file to export. So that is good. In the log I see the
> following: (I am not sure why :0+52 gets appended)
> 2/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat:
> Paths:/mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01/part-m-00000:0+52
> Locations:ip-XX-XX-XX-XXX.ec2.internal:;
>
> However, it hangs forever after printing the following:
> 12/10/10 16:43:42 INFO mapred.JobClient:  map 0% reduce 0%
>
> Then it seems the JDBC connection eventually times out.
> 12/10/10 16:47:07 INFO mapred.JobClient: Task Id :
> attempt_201210101503_0019_m_000000_0, Status : FAILED
>
> Here is the log towards the end:
>
> 12/10/10 16:43:40 INFO mapred.JobClient: Default number of map tasks: 4
> 12/10/10 16:43:40 INFO mapred.JobClient: Default number of reduce tasks: 0
> 12/10/10 16:43:41 INFO mapred.JobClient: Setting group to hadoop
> 12/10/10 16:43:41 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Target numMapTasks=4
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Total input bytes=52
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: maxSplitSize=13
> 12/10/10 16:43:41 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Generated splits:
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat:
> Paths:/mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01/part-m-00000:0+52
> Locations:ip-XX-XX-XX-XXX.ec2.internal:;
> 12/10/10 16:43:41 INFO mapred.JobClient: Running job: job_201210101503_0019
> 12/10/10 16:43:42 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/10 16:47:07 INFO mapred.JobClient: Task Id :
> attempt_201210101503_0019_m_000000_0, Status : FAILED
> java.io.IOException:
> com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications
> link failure
>
> The last packet sent successfully to the server was 0 milliseconds ago. The
> driver has not received any packets from the server.
>         at
> org.apache.sqoop.mapreduce.ExportOutputFormat.getRecordWriter(ExportOutputFormat.java:79)
>         at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:635)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:760)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
> Communications link failure
>
>
>
>
> On Wed, Oct 10, 2012 at 12:40 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
>
> > Hi sir,
> > as far as I remember FileInputFormat is not doing recursive descent into
> > subdirectories when looking for input files. Would you mind trying to
> > export directory /mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> > to see if it will help? Something like
> >
> > sqoop export ... --export-dir
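
The "Communications link failure" above typically means the MySQL server either isn't listening on an externally reachable interface or hasn't granted the Sqoop user access from the worker nodes. A hedged sketch of the usual fixes (the database, user, and password names are taken from the sqoop command earlier in the thread; adjust to your setup):

```sql
-- On the MySQL host, first check my.cnf: a "bind-address = 127.0.0.1"
-- line would limit mysqld to localhost; the worker nodes then can't
-- connect at all. Then grant the Sqoop user access from remote hosts
-- ('%' matches any host; narrow it to your cluster's subnet if possible):
GRANT ALL PRIVILEGES ON analyticsdb.* TO 'username'@'%' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
```

On EC2 (as in this thread) the security group for the MySQL instance also has to allow inbound traffic on port 3306 from the Hadoop nodes.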
Matthieu Labour 2012-10-10, 18:06
Matthieu Labour 2012-10-10, 21:22
Jarek Jarcec Cecho 2012-10-10, 23:58
Matthieu Labour 2012-10-11, 14:39
Jarek Jarcec Cecho 2012-10-11, 15:38
Jarek Jarcec Cecho 2012-10-15, 21:40
Matthieu Labour 2012-10-17, 21:53
Jarek Jarcec Cecho 2012-10-17, 21:58
Matthieu Labour 2012-10-17, 22:01