Re: SQOOP export from Hadoop Hive to MySQL
Good morning Jarcec,

   Thank you very much for your insight. I would never have attempted to remove all the single quotes from my sqoop command on my own. Upon removing all quotes from the command, my Oozie workflow, which comprises pig, hive and sqoop actions, flows like a river.

   export --verbose --connect jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table CHRYSLER_Anametrix_Aggregated --update-key VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession --update-mode allowinsert --export-dir /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1 --input-fields-terminated-by | --input-null-string \\N --input-null-non-string \\N
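
   For reference, the corresponding <command> element in the workflow (the same workflow.xml quoted below, otherwise unchanged) now reads roughly like this:

   <command>export --verbose --connect jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table CHRYSLER_Anametrix_Aggregated --update-key VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession --update-mode allowinsert --export-dir /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1 --input-fields-terminated-by | --input-null-string \\N --input-null-non-string \\N</command>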
   One question to help with my future debugging: where or how can I find the map task logs in CDH4?

Thanks,
Andy

  

________________________________
 From: Jarek Jarcec Cecho <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Han Sen Tey <[EMAIL PROTECTED]>
Sent: Thursday, May 23, 2013 4:45:55 AM
Subject: Re: SQOOP export from Hadoop Hive to MySQL
 

Hi Han,
would you mind sharing the entire Sqoop log generated with the --verbose parameter as an attachment? It might also be beneficial to share the map task log.

Oozie does not use bash or any other shell to execute Sqoop, so you should remove any escaping introduced specifically for the shell. Looking at your example, I think it should be something like:

  ... --input-fields-terminated-by | --input-null-string \\N ...
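
If the raw pipe character ever causes trouble with command splitting, another option (assuming your version of the sqoop action supports it; I have not tested it against this exact workflow) is to pass each token as a separate <arg> element instead of one <command>, for example:

  <arg>export</arg>
  <arg>--connect</arg>
  <arg>jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test</arg>
  ...
  <arg>--input-fields-terminated-by</arg>
  <arg>|</arg>
  <arg>--input-null-string</arg>
  <arg>\\N</arg>
  <arg>--input-null-non-string</arg>
  <arg>\\N</arg>

Oozie splits a <command> on whitespace, whereas each <arg> value is passed to Sqoop as-is.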

Jarcec

On Tue, May 21, 2013 at 05:56:16AM -0700, Han Sen Tey wrote:
> Greeting experts,
>
>    I have a table "CHRYSLER_Anametrix_Aggregated" in both Hive and MySQL.
> The same table can be found in HDFS at
> /user/hive/warehouse/chrysler_anametrix_aggregated.
>
>
>    I can successfully execute the following Sqoop command from the namenode CLI -
>
>    sqoop export --verbose --connect
> jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table
> CHRYSLER_Anametrix_Aggregated --update-key
> VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession
> --update-mode allowinsert --export-dir
> /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1
> --input-fields-terminated-by '|' --input-null-string '\\N'
> --input-null-non-string '\\N'
>
>    However, when I try to run the above as part of an Oozie workflow action ...
>
>
>     <action name="sqoop-node">
>         <sqoop xmlns="uri:oozie:sqoop-action:0.2">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>sqoop.connection.factories</name>
>                     <value>com.cloudera.sqoop.manager.DefaultManagerFactory</value>
>                 </property>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>                 <property>
>                     <name>mapred.compress.map.output</name>
>                     <value>true</value>
>                 </property>
>                 <property>
>                     <name>oozie.service.WorkflowAppService.system.libpath</name>
>                     <value>${nameNode}/user/oozie/share/lib/sqoop</value>
>                 </property>
>             </configuration>
>             <command>export --verbose --connect jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table CHRYSLER_Anametrix_Aggregated --update-key VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession --update-mode allowinsert --export-dir /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1 --input-fields-terminated-by '|' --input-null-string '\\N' --input-null-non-string '\\N'</command>
>         </sqoop>
>         <ok to="end"/>
>         <error to="fail"/>
>     </action>
>
>    I get the following error message. The "mysql-connector-java.5.1.24-bin.jar" driver is in the shared library at /user/oozie/share/lib/sqoop.
>
> [Map task log excerpt, truncated: DEBUG output from org.apache.sqoop.orm.ClassWriter dated 2013-05-21, the task classpath under /usr/lib/hadoop-0.20-mapreduce (hadoop-core-2.0.0-mr1-cdh4.2.0.jar and related libraries), counters such as MAP_INPUT_BYTES, deprecation warnings, and output from org.apache.hadoop.mapred.TaskLogsTruncater for job_201305131705_0182.]