-
Running Sqoop job from Oozie fails on create database
John Dasher 2012-11-27, 16:49
Hi,
I am attempting to run a Sqoop job from Oozie to load a table into Hive incrementally. The Oozie job fails with: "org.apache.hadoop.hive.ql.metadata.HiveException: javax.jdo.JDOFatalDataStoreException: Failed to create database '/var/lib/hive/metastore/metastore_db'"
We have Hive set up to store its metadata in a MySQL database, so I'm lost trying to find out where and why it's trying to create a database in Derby. Any pointers or information would be greatly appreciated.
Thank you,
John
We're using CDH4 (Free Edition):
Hadoop 2.0.0-cdh4.0.1
Oozie client build version: 3.1.3-cdh4.0.1
Sqoop 1.4.1-cdh4.0.1
Sqoop command and syslog below.
Sqoop command arguments :
job
--meta-connect
jdbc:hsqldb:hsql://hadoopdw4:16000/sqoop
--exec
sq_admin_users_hive
syslog logs
2012-11-27 14:40:13,395 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2012-11-27 14:40:13,617 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-11-27 14:40:14,048 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2012-11-27 14:40:14,049 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId
2012-11-27 14:40:14,757 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2012-11-27 14:40:14,763 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1ce3570c
2012-11-27 14:40:15,004 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available
2012-11-27 14:40:15,004 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library loaded
2012-11-27 14:40:15,011 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
2012-11-27 14:40:15,015 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2012-11-27 14:40:15,549 WARN org.apache.sqoop.tool.SqoopTool: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2012-11-27 14:40:15,950 WARN org.apache.sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2012-11-27 14:40:16,036 INFO org.apache.sqoop.manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2012-11-27 14:40:16,044 INFO org.apache.sqoop.tool.CodeGenTool: Beginning code generation
2012-11-27 14:40:16,598 INFO org.apache.sqoop.manager.SqlManager: Executing SQL statement: SELECT t.* FROM `admin_users` AS t LIMIT 1
2012-11-27 14:40:16,640 INFO org.apache.sqoop.manager.SqlManager: Executing SQL statement: SELECT t.* FROM `admin_users` AS t LIMIT 1
2012-11-27 14:40:16,660 INFO org.apache.sqoop.orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop-0.20-mapreduce
2012-11-27 14:40:16,661 INFO org.apache.sqoop.orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
2012-11-27 14:40:20,544 INFO org.apache.sqoop.orm.CompilationManager: Writing jar file: /tmp/sqoop-mapred/compile/7fef46c7a9af683cd26c7cf826f91b6e/admin_users.jar
2012-11-27 14:40:20,600 INFO org.apache.sqoop.tool.ImportTool: Incremental import based on column `updated_at`
2012-11-27 14:40:20,602 INFO org.apache.sqoop.tool.ImportTool: Lower bound value: '2012-11-26 21:12:01.0'
2012-11-27 14:40:20,602 INFO org.apache.sqoop.tool.ImportTool: Upper bound value: '2012-11-27 14:40:20.0'
2012-11-27 14:40:20,602 WARN org.apache.sqoop.manager.MySQLManager: It looks like you are importing from mysql.
2012-11-27 14:40:20,604 WARN org.apache.sqoop.manager.MySQLManager: This transfer can be faster! Use the --direct
2012-11-27 14:40:20,604 WARN org.apache.sqoop.manager.MySQLManager: option to exercise a MySQL-specific fast path.
2012-11-27 14:40:20,604 INFO org.apache.sqoop.manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2012-11-27 14:40:20,627 INFO org.apache.sqoop.mapreduce.ImportJobBase: Beginning import of admin_users
2012-11-27 14:40:20,698 WARN org.apache.sqoop.mapreduce.JobBase: SQOOP_HOME is unset. May not be able to find all job dependencies.
2012-11-27 14:40:21,017 WARN org.apache.hadoop.mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-11-27 14:40:21,493 INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `admin_users` WHERE ( `updated_at` >= '2012-11-26 21:12:01.0' AND `updated_at` < '2012-11-27 14:40:20.0' )
2012-11-27 14:40:21,942 INFO org.apache.hadoop.mapred.JobClient: Running job: job_201210230847_10616
2012-11-27 14:40:22,945 INFO org.apache.hadoop.mapred.JobClient: map 0% reduce 0%
2012-11-27 14:40:32,974 INFO org.apache.hadoop.mapred.JobClient: map 100% reduce 0%
2012-11-27 14:40:35,985 INFO org.apache.hadoop.mapred.JobClient: Job complete: job_201210230847_10616
2012-11-27 14:40:35,988 INFO org.apache.hadoop.mapred.JobClient: Counters: 23
2012-11-27 14:40:35,988 INFO org.apache.hadoop.mapred.JobClient: File System Counters
2012-11-27 14:40:35,997 INFO org.apache.hadoop.mapred.JobClient: FILE: Number of bytes read=0
2012-11-27 14:40:35,997 INFO org.apache.hadoop.mapred.JobClient: FILE: Number of bytes written=66125
2012-11-27 14:40:35,997 INFO org.apache.hadoop.mapred.JobClient: FILE: Number of read operations=0
2012-11-27 14:40:35,997 INFO org.apache.hadoop.mapred.JobClient: FILE: Number of large read operations=0
2012-11-27 14:40:35,998 INFO org.apache.hadoop.mapred.JobClient: FILE: Number of write operations=0
2012-11-27 14:40:35,998 INFO org.apache.hadoop.mapred.JobClient: HDFS: Number of bytes read=105
2012-11-27 14:40:35,998 INFO org.apache.hadoop.mapred.JobClient: HDFS: Number of bytes written=0
2012-11-2
-
Re: Running Sqoop job from Oozie fails on create database
Jarek Jarcec Cecho 2012-11-27, 17:04
Hi John,
Sqoop does not support Hive integration when run from Oozie. The recommended workaround is to first run the Sqoop import into a temporary directory (without the Hive import) and then, in a separate Hive action, load the data into Hive.
Jarcec
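For illustration only, the two steps of this workaround might look roughly like the sketch below. It is written in Python purely to assemble and print the two commands; it does not run Sqoop or Hive. The JDBC URL, the staging directory, and the Hive table name are hypothetical placeholders that do not come from this thread, and the sketch assumes a direct `import` call rather than the saved job from the original post. The Sqoop options used (`--target-dir`, `--incremental`, `--check-column`, `--last-value`) are standard import options, and the key point is simply that `--hive-import` is left out.

```python
import shlex

# Sketch of the two-step workaround: stage with Sqoop, then load with Hive.
# Hosts, paths, and the Hive table name are hypothetical placeholders.


def sqoop_staging_args(connect, table, staging_dir, check_column, last_value):
    """Arguments for a plain Sqoop import into HDFS (no --hive-import)."""
    return [
        "import",
        "--connect", connect,
        "--table", table,
        "--target-dir", staging_dir,
        "--incremental", "lastmodified",
        "--check-column", check_column,
        "--last-value", last_value,
    ]


def hive_load_statement(staging_dir, hive_table):
    """HiveQL that moves the staged files into an existing Hive table."""
    return "LOAD DATA INPATH '{}' INTO TABLE {};".format(staging_dir, hive_table)


STAGING_DIR = "/tmp/sqoop_staging/admin_users"  # hypothetical HDFS path

args = sqoop_staging_args(
    connect="jdbc:mysql://dbhost/appdb",  # hypothetical source database
    table="admin_users",
    staging_dir=STAGING_DIR,
    check_column="updated_at",
    last_value="2012-11-26 21:12:01.0",
)

# Step 1: what the Sqoop action (or a shell) would run.
print("sqoop " + " ".join(shlex.quote(a) for a in args))
# Step 2: what the separate Hive action would execute.
print(hive_load_statement(STAGING_DIR, "admin_users"))
```

Two caveats with this sketch: a plain Sqoop import writes comma-delimited text by default, so the Hive table's row format has to match whatever delimiters the import uses, and `LOAD DATA ... INTO TABLE` appends rows, so changed rows picked up by a `lastmodified` import would still need to be reconciled with the existing data in Hive.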