Sqoop user mailing list: exporting data from sequence files back into an RDBMS


Thread:
  Eric Hernandez      2013-07-24, 21:54
  Abraham Elmahrek    2013-07-24, 22:16
  Eric Hernandez      2013-07-24, 22:28
  Abraham Elmahrek    2013-07-24, 22:51
  Eric Hernandez      2013-07-24, 23:19
  Eric Hernandez      2013-07-25, 00:17
Re: exporting data from sequence files back into an RDBMS
Hi Eric,
would you mind sharing your entire data flow with us? Starting with the exact Sqoop import command, then any Hive transformations you are doing, and finally the Sqoop export command?

Importing data into Hive using the SequenceFile format is not supported by Sqoop, so I would like to make sure that we are understanding your use case correctly.

Jarcec
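
For reference, the round trip Sqoop does support keeps the Hive-side data as delimited text. A minimal sketch, assuming the connection string and table from this thread and only standard Sqoop 1 flags (the export dir must match wherever your Hive warehouse actually puts the table):

# Import as plain text into Hive (SequenceFile import into Hive is not
# supported); '\001' matches Hive's default field delimiter.
sqoop import --connect 'jdbc:mysql://mysqlServer:3306/hadoop' \
  --username hadoop -P --table dbo_tablea \
  --hive-import --as-textfile \
  --fields-terminated-by '\001' -m 1

# Export the resulting text files back out; this mirrors the command in
# the log and works because the input is text, not a SequenceFile.
sqoop export --connect 'jdbc:mysql://mysqlServer:3306/hadoop' \
  --username hadoop -P --table dbo_tablea \
  --export-dir /hive/dbo_tablea -m 1 \
  --input-fields-terminated-by '\001'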

On Wed, Jul 24, 2013 at 05:17:30PM -0700, Eric Hernandez wrote:
> Here are my logs
>
> sqoop export --connect 'jdbc:mysql://mysqlServer:3306/hadoop' --username=hadoop -P --table=dbo_tablea --export-dir /hive/dbo_tablea -m 1 --input-fields-terminated-by  '\001'
> Enter password:
> 13/07/24 17:07:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> 13/07/24 17:07:58 INFO tool.CodeGenTool: Beginning code generation
> 13/07/24 17:07:58 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dbo_tablea` AS t LIMIT 1
> 13/07/24 17:07:58 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dbo_tablea` AS t LIMIT 1
> 13/07/24 17:07:58 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
> Note: /tmp/sqoop-erich/compile/5287b2ea7807ccef31ae33420fbbb7a0/dbo_tablea.java uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 13/07/24 17:08:00 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-erich/compile/5287b2ea7807ccef31ae33420fbbb7a0/dbo_tablea.jar
> 13/07/24 17:08:00 INFO mapreduce.ExportJobBase: Beginning export of dbo_tablea
> 13/07/24 17:08:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 13/07/24 17:08:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/07/24 17:08:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/07/24 17:08:03 INFO mapred.JobClient: Running job: job_201302261137_303267
> 13/07/24 17:08:04 INFO mapred.JobClient:  map 0% reduce 0%
> 13/07/24 17:08:20 INFO mapred.JobClient: Task Id : attempt_201302261137_303267_m_000000_0, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable
> at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:95)
> at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:38)
> at org.apache.sqoop.mapreduce.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:77)
> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:436)
> at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:66)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:75)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> at org.apache.hadoop.mapred.Child$4.run(Child.ja
> 13/07/24 17:08:30 INFO mapred.JobClient: Task Id : attempt_201302261137_303267_m_000000_1, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable
> at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:95)
> at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:38)
> at org.apache.sqoop.mapreduce.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:77)
> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:436)
> at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:66)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:75)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
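
The BytesWritable-to-LongWritable cast failure above is what one would expect if the export directory holds SequenceFiles: Sqoop's text export reader supplies LongWritable byte offsets as keys, while a SequenceFile supplies its own key type. A quick way to check what is actually in the directory (the file name below is a placeholder):

# List the files under the export directory from the failing command.
hadoop fs -ls /hive/dbo_tablea

# SequenceFiles begin with the magic bytes "SEQ"; if this prints SEQ,
# the data is a SequenceFile rather than '\001'-delimited text.
# (part-00000 is a hypothetical file name; use one from the listing.)
hadoop fs -cat /hive/dbo_tablea/part-00000 | head -c 3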
Eric Hernandez 2013-07-25, 01:11