Re: SQOOP export from Hadoop Hive to MySQL
Hi Andy,
I'm glad to hear that Sqoop has started working for you!

There is a great article on the Cloudera blog [1] that describes what sorts of logs Hadoop generates and how to get to them. The task logs are described there as well.

Jarcec

Links:
1: http://blog.cloudera.com/blog/2009/09/apache-hadoop-log-files-where-to-find-them-in-cdh-and-what-info-they-contain/
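
As a rough pointer (assuming a default CDH4 install running MRv1; the exact locations depend on how the cluster is set up), the per-attempt task logs usually sit on the TaskTracker node that ran the attempt and are also browsable from the JobTracker web UI:

  # On the TaskTracker node (directory is an assumption based on default CDH packaging)
  ls /var/log/hadoop-0.20-mapreduce/userlogs/job_<jobid>/attempt_<attemptid>/
  # typically contains stdout, stderr and syslog for that map/reduce attempt

  # Or drill down from the JobTracker web UI (default port 50030):
  #   http://<jobtracker-host>:50030/  ->  job  ->  map tasks  ->  attempt  ->  logs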

On Thu, May 23, 2013 at 06:08:47AM -0700, Han Sen Tey wrote:
> Good morning Jarcec,
>
>    Thank you very much for your insight. I would never have attempted to remove all single quotes from my sqoop command. Upon removing all quotes, my Oozie workflow, which comprises pig, hive and sqoop actions, flows like a river.
>
>    export --verbose --connect jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table CHRYSLER_Anametrix_Aggregated --update-key VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession --update-mode allowinsert --export-dir /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1 --input-fields-terminated-by | --input-null-string \\N --input-null-non-string \\N
>
>
>    One question to help with my future debugging: where or how can I find the map task logs in CDH4?
>
> Thanks,
> Andy
>
>   
>
>
>
> ________________________________
>  From: Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; Han Sen Tey <[EMAIL PROTECTED]>
> Sent: Thursday, May 23, 2013 4:45:55 AM
> Subject: Re: SQOOP export from Hadoop Hive to MySQL
>  
>
> Hi Han,
> would you mind sharing the entire Sqoop log generated with the parameter --verbose as an attachment? It might also be beneficial to share the map task log.
>
> Oozie does not use bash or any other shell to execute Sqoop, so you should remove any escaping introduced specifically for the shell. Looking at your example, I think it should be something like:
>
>   ... --input-fields-terminated-by | --input-null-string \\N ...
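> 
> For comparison, the same flags as typed on a shell command line versus inside an Oozie <command> element (values taken from the commands in this thread; the quotes are only needed because the shell would otherwise interpret | and \\):
> 
>   shell CLI:       --input-fields-terminated-by '|' --input-null-string '\\N' --input-null-non-string '\\N'
>   Oozie <command>: --input-fields-terminated-by | --input-null-string \\N --input-null-non-string \\N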
>
> Jarcec
>
> On Tue, May 21, 2013 at 05:56:16AM -0700, Han Sen Tey wrote:
> > Greeting experts,
> >
> >    I have a table "CHRYSLER_Anametrix_Aggregated" in Hive and in MySQL.
> > Similarly, I can find this table in HDFS at
> > /user/hive/warehouse/chrysler_anametrix_aggregated.
> >
> >
> >    I can successfully execute the following Sqoop command from the namenode CLI:
> >
> >    sqoop export --verbose --connect
> > jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table
> > CHRYSLER_Anametrix_Aggregated --update-key
> > VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession
> > --update-mode allowinsert --export-dir
> > /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1
> > --input-fields-terminated-by '|' --input-null-string '\\N'
> > --input-null-non-string '\\N'
> >
> >    However, when I want to run the above as part of an Oozie workflow action ...
> >
> >
> >     <action name="sqoop-node">
> >         <sqoop xmlns="uri:oozie:sqoop-action:0.2">
> >             <job-tracker>${jobTracker}</job-tracker>
> >             <name-node>${nameNode}</name-node>
> >             <configuration>
> >                 <property>
> >                     <name>sqoop.connection.factories</name>
> >                     <value>com.cloudera.sqoop.manager.DefaultManagerFactory</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.job.queue.name</name>
> >                     <value>${queueName}</value>
> >                 </property>
> >                 <property>
> >                     <name>mapred.compress.map.output</name>
> >                     <value>true</value>
> >                 </property>
> >                 <property>
> >                     <name>oozie.service.WorkflowAppService.system.libpath</name>
> >                     <value>${nameNode}/user/oozie/share/lib/sqoop</value>
> >                 </property>
> >             </configuration>
> >             <command>export --verbose --connect
> > jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table