Sqoop, mail # user - SQOOP export from Hadoop Hive to MySQL


Han Sen Tey 2013-05-21, 12:56
Re: SQOOP export from Hadoop Hive to MySQL
Jarek Jarcec Cecho 2013-05-23, 08:45
Hi Han,
Would you mind sharing the entire Sqoop log generated with the --verbose parameter as an attachment? It might also be beneficial to share the map task log.

Oozie does not use bash or any other shell to execute Sqoop, so you should remove any escaping that was added only for the shell. Looking at your example, I think it should be something like:

  ... --input-fields-terminated-by | --input-null-string \\N ...
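
Applied to the workflow quoted below, that change might look like the following sketch. It assumes that Oozie splits the <command> text on spaces and hands each token to Sqoop verbatim, so the single quotes around | and \\N are dropped. Every other option is copied unchanged from the original post. Removing the quotes from --input-null-non-string as well is an assumption by analogy, and the fragment has not been run against this cluster.

```xml
<!-- Sketch only: the same export as in the original post, with the
     shell-only single quotes removed. The command is kept on one line
     so the example does not rely on how line breaks are tokenized. -->
<command>export --verbose --connect jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table CHRYSLER_Anametrix_Aggregated --update-key VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession --update-mode allowinsert --export-dir /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1 --input-fields-terminated-by | --input-null-string \\N --input-null-non-string \\N</command>
```

With the quotes left in place, Sqoop would presumably receive the three-character string '|' as the field terminator rather than the single character |, which is the kind of mismatch the verbose log should make visible.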

Jarcec

On Tue, May 21, 2013 at 05:56:16AM -0700, Han Sen Tey wrote:
> Greeting experts,
>
>    I have a table "CHRYSLER_Anametrix_Aggregated" in Hive and in MySQL.
> Similarly I can find this table in hdfs --
> /user/hive/warehouse/chrysler_anametrix_aggregated.
>
>
>    I can successfully execute the following SQOOP command from namenode CLI -
>
>    sqoop export --verbose --connect
> jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table
> CHRYSLER_Anametrix_Aggregated --update-key
> VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession
> --update-mode allowinsert --export-dir
> /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1
> --input-fields-terminated-by '|' --input-null-string '\\N'
> --input-null-non-string '\\N'
>
>    However, when I want to run the above as part of Oozie workflow action ...
>
>
>     <action name="sqoop-node">
>         <sqoop xmlns="uri:oozie:sqoop-action:0.2">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>sqoop.connection.factories</name>
>                     <value>com.cloudera.sqoop.manager.DefaultManagerFactory</value>
>                 </property>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>                 <property>
>                     <name>mapred.compress.map.output</name>
>                     <value>true</value>
>                 </property>
>                 <property>
>                     <name>oozie.service.WorkflowAppService.system.libpath</name>
>                     <value>${nameNode}/user/oozie/share/lib/sqoop</value>
>                 </property>
>             </configuration>
>             <command>export --verbose --connect
> jdbc:mysql://ec2-23-22-223-47.compute-1.amazonaws.com:3306/test --table
> CHRYSLER_Anametrix_Aggregated --update-key
> VisitorID,VisitorDay,VisitorMonth,VisitorYear,VisitorSession
> --update-mode allowinsert --export-dir
>  /user/hive/warehouse/chrysler_anametrix_aggregated --username root -m 1
>  --input-fields-terminated-by '|' --input-null-string '\\N'
> --input-null-non-string '\\N'</command>
>         </sqoop>
>         <ok to="end"/>
>         <error to="fail"/>
>     </action>
>
>    I get the following error message. I have "mysql-connector-java.5.1.24-bin.jar" shared library - /user/oozie/share/lib/sqoop.
>
>    Hadoop version - 2.00 - CDH4.2.0
>    Sqoop version - 1.4.2. - CDH4.2.0
>
>
>    Can you experts please guide me  ... a rookie to Hadoop and its projects.
>
> 2013-05-21 00:48:35,135 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 2013-05-21 00:48:37,394 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
> 2013-05-21 00:48:37,408 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2013-05-21 00:48:38,042 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2013-05-21 00:48:38,058 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5dce1bea
> 2013-05-21 00:48:38,423 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://ip-10-147-191-48.ec2.internal:8020/user/ubuntu/oozie-oozi/0000042-130520131050057-oozie-oozi-W/sqoop-node--sqoop/input/dummy.txt:0+5
> 2013-05-21 00:48:38,453 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES
Han Sen Tey 2013-05-23, 13:08
Jarek Jarcec Cecho 2013-05-24, 08:56