Sqoop, mail # user - Need help and tips for the following issue: No data gets exported from hadoop to mysql using sqoop.


Re: Need help and tips for the following issue: No data gets exported from hadoop to mysql using sqoop.
Jarek Jarcec Cecho 2012-10-10, 23:58
Hi sir,
I have actually zero experience with Amazon services, so I'm afraid I can't help you much in navigating to the map task logs. On a normal Hadoop cluster there is usually a service called "Job Tracker" that serves as the central place for MapReduce jobs. I expect you should be able to find this web service, or something similar, on your cluster. There you should see the jobs executed by Hadoop, and you should also be able to get to the individual task logs.

Following up on my previous blind shot: how is the MySQL user that you're using for Sqoop defined? I'm very interested in the host part of the user. For example, there are usually users like root@localhost or jarcec@'%'. If the host part (localhost or '%' in my examples) is too restrictive, your Hadoop nodes might not be able to connect to that MySQL box, which would result in connection failures.
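
One way to check this could look like the sketch below. It asks MySQL which account the server matched for the connection: USER() returns the name and client host you connected with, and CURRENT_USER() returns the account (including its host part) that the server actually used. The function name show_matched_account and the MYSQL_BIN variable are just names I made up for illustration, and the host, port, and user are placeholders. Ideally you would run it from one of the Hadoop nodes that execute map tasks, not only from the master.

```shell
#!/usr/bin/env bash
# Sketch only: show_matched_account and MYSQL_BIN are illustrative names,
# not part of Sqoop or MySQL. Host, port, and user are placeholders.
MYSQL_BIN="${MYSQL_BIN:-mysql}"

show_matched_account() {
  local db_host="$1" db_port="$2" db_user="$3"
  # -p makes the client prompt for the password interactively.
  # USER()         -> user@client-host as seen by the server
  # CURRENT_USER() -> the account (user@host pattern) that was matched
  "$MYSQL_BIN" -h "$db_host" -P "$db_port" -u "$db_user" -p \
    -e "SELECT USER(), CURRENT_USER();"
}
```

For example: show_matched_account hostname 3306 username. If this fails from a node with an "Access denied" or connection error, while it works from the master, the host part of the user (or network access from that node) is the first thing I would look at.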

Jarcec

On Wed, Oct 10, 2012 at 05:22:14PM -0400, Matthieu Labour wrote:
> Hi Jarcec
> If I use the postgresql jdbc connector and connect to one of our heroku
> machines then sqoop works
> ~/$SQOOP_ROOT/bin/sqoop export --connect
> jdbc:postgresql://ec2-XX-XX-XXX-XX.compute-1.amazonaws.com:database
> --username username --password password --table ml_ys_log_gmt_test
> --export-dir -export-dir
> =hdfs:///mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> --input-fields-terminated-by='\t'
> --lines-terminated-by='\n' --verbose --batch
>
> On Wed, Oct 10, 2012 at 2:06 PM, Matthieu Labour <[EMAIL PROTECTED]> wrote:
>
> >
> > Jarcek
> >
> > I am quite new to hadoop and amazon EMR. Where are those files located?
> >
> > Here is what I am doing:
> >
> > 1) I am using amazon elastic map reduce and I have created a New Job that
> > does not terminate and whose type is HBase
> >
> > 2) I get the job id
> > myaccount@ubuntu:~/elastic-mapreduce-cli$ ./elastic-mapreduce --list
> > --active
> > j-3EFP15LBJC8R4     RUNNING
> > ec2-XXX-XX-XXX-XX.compute-1.amazonaws.com         sqooping
> >    COMPLETED      Setup Hadoop Debugging
> >    COMPLETED      Start HBase
> >    COMPLETED      Setup Hive
> >    RUNNING        Setup Pig
> >
> > 3) I attach and run a step:
> > ./elastic-mapreduce -j j-3EFP15LBJC8R4 --jar
> > s3://elasticmapreduce/libs/script-runner/script-runner.jar --arg
> > s3://mybucket/sqoop/sqoop.sh
> >
> > 4) I ssh into the machine: ssh -i ~/.ec2/MYKEY.pem
> > [EMAIL PROTECTED]
> >
> > 5) tail -f /mnt/var/lib/hadoop/steps/6/stderr shows the mapreduce job
> > hanging
> > 12/10/10 17:46:58 DEBUG mapreduce.ExportInputFormat: Generated splits:
> > 12/10/10 17:46:58 DEBUG mapreduce.ExportInputFormat:
> > Paths:/mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01/part-m-00000:0+52
> > Locations:ip-10-77-70-192.ec2.internal:;
> > 12/10/10 17:46:58 INFO mapred.JobClient: Running job: job_201210101503_0024
> > 12/10/10 17:46:59 INFO mapred.JobClient:  map 0% reduce 0%
> >
> > 6) In /mnt/var/lib/hadoop/steps/6 there is the sqoop.sh script file with
> > ~/sqoop-1.4.2.bin__hadoop-1.0.0/bin/sqoop export --connect
> > jdbc:mysql://hostname:3306/analyticsdb --username username --password
> > password --table ml_ys_log_gmt_test --export-dir
> > =hdfs:///mnt/var/lib/hadoop/dfs/logs_sanitized_test/dt=2012-10-01
> > --input-fields-terminated-by='\t' --lines-terminated-by='\n' --verbose --batch
> >
> > On that same machine, same location ( /mnt/var/lib/hadoop/steps/6), the
> > following command works
> > mysql -h hostname -P 3306 -u username -p
> > password: password
> > Afterwards I can use the database, describe the table etc ....
> > Please note the mysql machine is running on Amazon RDS and I have
> > added ElasticMapReduce-master security group to RDS
> >
> > Thank you for your help
> >
> >
> > On Wed, Oct 10, 2012 at 1:27 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
> >
> >> It would be very helpful if you could send us the task log from one of
> >> the map tasks that Sqoop executes.
> >>
> >> Blindly shooting - Sqoop is connecting to your database from map tasks.