Sqoop, mail # user - Need help and tips for the following issue: No data gets exported from Hadoop to MySQL using Sqoop.


Re: Need help and tips for the following issue: No data gets exported from Hadoop to MySQL using Sqoop.
Jarek Jarcec Cecho 2012-10-10, 17:27
It would be very helpful if you could send us the task log from one of the map tasks that Sqoop executes.
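On Hadoop 1.x, the per-attempt logs usually sit in the tasktracker's userlogs directory on the node that ran the attempt (a hedged pointer: the exact base path varies by install, and the attempt ID is the failed one from the output quoted below):

    # on the tasktracker node that ran the failed attempt; the base path
    # is an assumption -- check hadoop.log.dir for your install
    cat /var/log/hadoop/userlogs/attempt_201210101503_0019_m_000000_0/syslog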

Shooting blindly here: Sqoop connects to your database from the map tasks running on the worker nodes, not just from the machine where you launch the job. Given the connection issues, are you sure that you can connect to your database from all nodes in your cluster?
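For example, one quick check (a minimal sketch: the hostname, port, database, and credentials are the placeholders from the export command quoted below, and it assumes the mysql command-line client is installed on the nodes):

    # run on every worker node, not just the machine that submits the job
    mysql --host=hostname --port=3306 --user=username --password analyticsdb -e 'SELECT 1;'

If this hangs or fails on any node, map tasks scheduled there will hit the same communications link failure.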

Jarcec

On Wed, Oct 10, 2012 at 01:16:03PM -0400, Matthieu Labour wrote:
> Hi Jarek
>
> Thank you so much for your help.
>
> Following your advice, I ran the following command:
> ~/sqoop-1.4.2.bin__hadoop-1.0.0/bin/sqoop export --connect
> jdbc:mysql://hostname:3306/analyticsdb --username username --password
> password --table ml_ys_log_gmt_test --export-dir
> hdfs:///mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> --input-fields-terminated-by='\t' --lines-terminated-by='\n' --verbose
>
> It seems to find the file to export, so that is good. In the log I see the
> following (I am not sure why ":0+52" gets appended):
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat:
> Paths:/mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01/part-m-00000:0+52
> Locations:ip-XX-XX-XX-XXX.ec2.internal:;
>
> However, it hangs forever after printing the following:
> 12/10/10 16:43:42 INFO mapred.JobClient:  map 0% reduce 0%
>
> Then it seems the JDBC connection eventually times out.
> 12/10/10 16:47:07 INFO mapred.JobClient: Task Id :
> attempt_201210101503_0019_m_000000_0, Status : FAILED
>
> Here is the log towards the end:
>
> 12/10/10 16:43:40 INFO mapred.JobClient: Default number of map tasks: 4
> 12/10/10 16:43:40 INFO mapred.JobClient: Default number of reduce tasks: 0
> 12/10/10 16:43:41 INFO mapred.JobClient: Setting group to hadoop
> 12/10/10 16:43:41 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Target numMapTasks=4
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Total input bytes=52
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: maxSplitSize=13
> 12/10/10 16:43:41 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat: Generated splits:
> 12/10/10 16:43:41 DEBUG mapreduce.ExportInputFormat:
> Paths:/mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01/part-m-00000:0+52
> Locations:ip-XX-XX-XX-XXX.ec2.internal:;
> 12/10/10 16:43:41 INFO mapred.JobClient: Running job: job_201210101503_0019
> 12/10/10 16:43:42 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/10 16:47:07 INFO mapred.JobClient: Task Id :
> attempt_201210101503_0019_m_000000_0, Status : FAILED
> java.io.IOException:
> com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications
> link failure
>
> The last packet sent successfully to the server was 0 milliseconds ago. The
> driver has not received any packets from the server.
>         at
> org.apache.sqoop.mapreduce.ExportOutputFormat.getRecordWriter(ExportOutputFormat.java:79)
>         at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:635)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:760)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
> Communications link failure
>
>
>
>
> On Wed, Oct 10, 2012 at 12:40 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
>
> > Hi sir,
> > as far as I remember, FileInputFormat does not descend recursively into
> > subdirectories when looking for input files. Would you mind trying to
> > export the directory /mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> > to see if it helps? Something like
> >
> > sqoop export ... --export-dir
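A note on the ":0+52" suffix asked about above: it is standard Hadoop FileSplit notation, path:startOffset+length, so part-m-00000:0+52 just means a split starting at byte 0 and 52 bytes long. The maxSplitSize=13 in the DEBUG output is simply the 52 bytes of input divided across the 4 target map tasks (52 / 4 = 13), which matches the "Target numMapTasks=4" and "Total input bytes=52" lines. The suffix is expected and harmless; the actual failure here is the JDBC connection from the map tasks.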
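On the "Communications link failure ... last packet sent successfully to the server was 0 milliseconds ago" error itself: with MySQL Connector/J this usually means the TCP connection was never established at all. Given the ip-XX-XX-XX-XXX.ec2.internal task location above, an EC2 security group, the MySQL bind-address, or host-restricted grants are the usual suspects. A quick way to test the raw TCP path (a minimal sketch; assumes netcat is installed, and hostname is the placeholder from the export command):

    # from each worker node: is the MySQL port reachable at all?
    nc -vz hostname 3306

If the port is reachable but JDBC still fails, check that the MySQL account is allowed to connect from the workers' addresses (for example, a grant for 'username'@'%' rather than 'username'@'localhost').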