Re: Zero rows imported while doing Mysql to Hive import
Hi,

When I tried doing this MySQL-into-Hive import in 3 separate steps, it
worked!

Here is what I ran -

1) Import the MySQL table into HDFS with this command:
ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import
--connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1
--verbose --fields-terminated-by '\t' --lines-terminated-by '\n'

2) Create a table in Hive with this query (the same query that appears in
the --verbose log of the failing sqoop run):
hive> CREATE TABLE IF NOT EXISTS `Customers` ( `customerNumber` INT,
`customerName` STRING, `contactLastName` STRING, `contactFirstName` STRING,
`phone` STRING, `addressLine1` STRING, `addressLine2` STRING, `city`
STRING, `state` STRING, `postalCode` STRING, `country` STRING,
`salesRepEmployeeNumber` INT, `creditLimit` DOUBLE) COMMENT 'Imported by
sqoop on 2013/07/04 00:41:14' ROW FORMAT DELIMITED FIELDS TERMINATED BY
'\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.214 seconds

3) Load the data from HDFS with this query (again taken from the --verbose
log of the failing sqoop run):
hive> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
TABLE `Customers`;
Loading data to table default.customers
Table default.customers stats: [num_partitions: 0, num_files: 2, num_rows:
0, total_size: 15556, raw_data_size: 0]
OK
Time taken: 0.317 seconds

hive> select count(*) from Customers;
.......
Total MapReduce CPU Time Spent: 2 seconds 460 msec
OK
122
Time taken: 26.873 seconds, Fetched: 1 row(s)

So the table got created and populated properly. I now wonder why it is
not working when I use --hive-import.

Any pointers to debug this issue?
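
For reference, here is the single-step command that I would expect to be
equivalent to the three manual steps above (an untested sketch that reuses
the tab delimiters from step 1; my failing run, quoted below, used
--mysql-delimiters instead):

# untested sketch: combines steps 1-3 into a single sqoop invocation
./bin/sqoop import --connect jdbc:mysql://localhost/ClassicModels \
    --table Customers -m 1 \
    --hive-import --hive-home /root/siddharth/tools/hive-0.11.0-bin \
    --fields-terminated-by '\t' --lines-terminated-by '\n' --verbose
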
Thanks,
Siddharth
On Thu, Jul 4, 2013 at 12:21 PM, Siddharth Karandikar <
[EMAIL PROTECTED]> wrote:

> Hi Jarek,
>
> I have not re-configured Hive. I am using the default
> settings/locations. I am using --hive-home to tell sqoop where to find
> Hive.
>
> Here are the locations of my sqoop, Hive and Hadoop instances.
> Hadoop:    /root/siddharth/tools/hadoop-1.1.2
> Hive:    /root/siddharth/tools/hive-0.11.0-bin
> Sqoop:    /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0
>
>
> And here are a few more details after running it with --verbose.
>
> I am using the following command to import into Hive:
> ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop
> import --connect jdbc:mysql://localhost/ClassicModels -table Customers
> -m 1 --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import
> --verbose --mysql-delimiters
>
> Verbose output of above command:
> http://pastebin.com/TcYG8vkr
>
> After running this command, here is what I see in Hive and HDFS:
>
> HDFS
> ====
> ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls
> hdfs://localhost:9000/user/hive/warehouse/*
> Found 2 items
> -rw-r--r--   1 root supergroup          0 2013-07-04 00:41
> /user/hive/warehouse/customers/_SUCCESS
> -rw-r--r--   1 root supergroup      15569 2013-07-04 00:41
> /user/hive/warehouse/customers/part-m-00000
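>
> As a further sanity check (just an idea; I have not included that output
> here), I can peek at the first few imported rows directly:
>
> # sketch, not from the original run: inspect the imported data
> bin/hadoop fs -cat /user/hive/warehouse/customers/part-m-00000 | head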
>
>
> Hive (I am running Hive from its own directory so metadata should be
> accessible)
> ====
> ssk01:~/siddharth/tools/hive-0.11.0-bin # ./bin/hive
>
> Logging initialized using configuration in
>
> jar:file:/root/siddharth/tools/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
> Hive history
> file=/tmp/root/[EMAIL PROTECTED]
> hive> show databases;
> OK
> default
> Time taken: 8.035 seconds, Fetched: 1 row(s)
>
> hive> use default;
> OK
> Time taken: 0.018 seconds
>
> hive> show tables;
> OK
> Time taken: 4.175 seconds
> hive>
>
> The strange thing is that the table default.customers doesn't exist in
> Hive even though the sqoop output mentioned it.
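>
> One thing I still want to rule out (an assumption on my part, since I am
> on the default embedded Derby metastore): Derby creates its metastore_db
> directory relative to the current working directory, so the Hive session
> launched by sqoop and my interactive Hive CLI could be pointing at two
> different metastores. Something like this should show whether two copies
> exist:
>
> # sketch: look for per-working-directory Derby metastores
> ls -d /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0/metastore_db \
>       /root/siddharth/tools/hive-0.11.0-bin/metastore_db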
>
>
> Thanks,
> Siddharth
>
>
>
> On Wed, Jul 3, 2013 at 9:36 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> wrote:
> > Hi Siddharth,
> > using a directory in the LOAD DATA command is completely valid. You can
> > find more information about the command in the Hive documentation [1]. I
> > would