Sqoop, mail # user - Zero rows imported while doing Mysql to Hive import


Re: Zero rows imported while doing Mysql to Hive import
Siddharth Karandikar 2013-07-05, 13:47
Hi,

When I tried doing this MySQL-to-Hive import in 3 separate steps, it
worked!

Here is what I ran -

1) Import MySQL table in HDFS with command:
ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import
--connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1
--verbose --fields-terminated-by '\t' --lines-terminated-by '\n'

2) Create a table in Hive, using the same query found in the --verbose
log of the failing Sqoop run:
hive> CREATE TABLE IF NOT EXISTS `Customers` ( `customerNumber` INT,
`customerName` STRING, `contactLastName` STRING, `contactFirstName` STRING,
`phone` STRING, `addressLine1` STRING, `addressLine2` STRING, `city`
STRING, `state` STRING, `postalCode` STRING, `country` STRING,
`salesRepEmployeeNumber` INT, `creditLimit` DOUBLE) COMMENT 'Imported by
sqoop on 2013/07/04 00:41:14' ROW FORMAT DELIMITED FIELDS TERMINATED BY
'\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.214 seconds
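(For reference, the column types in that generated DDL follow Sqoop's
JDBC-to-Hive type mapping. The sketch below is illustrative only, not
Sqoop's actual code; the mapping table is an assumption inferred from the
DDL above.)

```python
# Hedged sketch: approximate the JDBC -> Hive type mapping that produces
# the CREATE TABLE above (INT -> INT, VARCHAR -> STRING, DOUBLE -> DOUBLE).
# Illustrative only; not Sqoop's real implementation.
JDBC_TO_HIVE = {
    "INTEGER": "INT",
    "VARCHAR": "STRING",
    "CHAR": "STRING",
    "DOUBLE": "DOUBLE",
    "DECIMAL": "DOUBLE",  # assumption: Sqoop 1.4.x widens DECIMAL to DOUBLE
}

def hive_ddl(table, columns):
    """Render a CREATE TABLE statement shaped like the one Sqoop generates."""
    cols = ", ".join(f"`{name}` {JDBC_TO_HIVE[jdbc]}" for name, jdbc in columns)
    return (f"CREATE TABLE IF NOT EXISTS `{table}` ({cols}) "
            "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
            "LINES TERMINATED BY '\\n' STORED AS TEXTFILE")

print(hive_ddl("Customers", [("customerNumber", "INTEGER"),
                             ("customerName", "VARCHAR")]))
```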

3) Load data from HDFS, using the same query found in the --verbose log
of the failing Sqoop run:
hive> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
TABLE `Customers`;
Loading data to table default.customers
Table default.customers stats: [num_partitions: 0, num_files: 2, num_rows:
0, total_size: 15556, raw_data_size: 0]
OK
Time taken: 0.317 seconds
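(A side note on the stats line above: num_rows: 0 does not mean the load
dropped data. My understanding, an assumption based on Hive's behavior
for text tables, is that LOAD DATA INPATH just moves the files into the
table directory without scanning them, so only the file count and byte
size get recorded. A toy sketch of that move-only behavior:)

```python
# Toy model (assumption, not Hive's code): LOAD DATA INPATH ... INTO TABLE
# is essentially a file move into the table's warehouse directory. File
# count and total size are cheap to record; row count would require a scan,
# so it stays at its unknown/zero default until a query or ANALYZE runs.
import os
import shutil
import tempfile

staging = tempfile.mkdtemp()    # stands in for /user/root/Customers
warehouse = tempfile.mkdtemp()  # stands in for /user/hive/warehouse/customers

with open(os.path.join(staging, "part-m-00000"), "w") as f:
    f.write("103\tAtelier graphique\n112\tSignal Gift Stores\n")

# "LOAD DATA INPATH ... INTO TABLE" ~ rename files into the table directory
for name in os.listdir(staging):
    shutil.move(os.path.join(staging, name), warehouse)

files = os.listdir(warehouse)
total_size = sum(os.path.getsize(os.path.join(warehouse, f)) for f in files)
print({"num_files": len(files), "total_size": total_size, "num_rows": 0})
```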

hive> select count(*) from Customers;
.......
Total MapReduce CPU Time Spent: 2 seconds 460 msec
OK
122
Time taken: 26.873 seconds, Fetched: 1 row(s)
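
(The part file Sqoop wrote is plain tab-delimited text, which is why the
manually created table reads it directly. A minimal parsing sketch; the
two sample rows below are made up, not taken from the real Customers
data:)

```python
# Hypothetical sample in the same '\t' / '\n' format Sqoop wrote; the
# actual part-m-00000 holds all 122 rows exported from MySQL.
import csv
import io

sample = ("103\tAtelier graphique\tSchmitt\tCarine\n"
          "112\tSignal Gift Stores\tKing\tJean\n")

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
print(len(rows))   # row count, analogous to SELECT COUNT(*) -> 2 here
print(rows[0][1])  # second column (customerName) -> Atelier graphique
```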

So the table got created and populated properly. I now wonder why it
does not work when I use --hive-import.
Any pointers for debugging this issue?
Thanks,
Siddharth
On Thu, Jul 4, 2013 at 12:21 PM, Siddharth Karandikar <
[EMAIL PROTECTED]> wrote:

> Hi Jarek,
>
> I have not re-configured Hive. I am using the default
> settings/locations. I am using --hive-home to tell Sqoop where to find
> Hive.
>
> Here are the locations of my sqoop, Hive and Hadoop instances.
> Hadoop:    /root/siddharth/tools/hadoop-1.1.2
> Hive:    /root/siddharth/tools/hive-0.11.0-bin
> Sqoop:    /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0
>
>
> And here are few more details after running it with verbose.
>
> I am using the following command to import into Hive:
> ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop
> import --connect jdbc:mysql://localhost/ClassicModels -table Customers
> -m 1 --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import
> --verbose --mysql-delimiters
>
> Verbose output of above command:
> http://pastebin.com/TcYG8vkr
>
> After running this command, here is what I see in Hive and HDFS.
>
> HDFS
> ====
> ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls
> hdfs://localhost:9000/user/hive/warehouse/*
> Found 2 items
> -rw-r--r--   1 root supergroup          0 2013-07-04 00:41
> /user/hive/warehouse/customers/_SUCCESS
> -rw-r--r--   1 root supergroup      15569 2013-07-04 00:41
> /user/hive/warehouse/customers/part-m-00000
>
>
> Hive (I am running Hive from its own directory so metadata should be
> accessible)
> ==========================================================
> ssk01:~/siddharth/tools/hive-0.11.0-bin # ./bin/hive
>
> Logging initialized using configuration in
>
> jar:file:/root/siddharth/tools/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
> Hive history
> file=/tmp/root/[EMAIL PROTECTED]
> hive> show databases;
> OK
> default
> Time taken: 8.035 seconds, Fetched: 1 row(s)
>
> hive> use default;
> OK
> Time taken: 0.018 seconds
>
> hive> show tables;
> OK
> Time taken: 4.175 seconds
> hive>
>
> The strange thing is that the table default.customers doesn't exist in
> Hive, even though the Sqoop output mentioned it.
>
>
> Thanks,
> Siddharth
>
>
>
> On Wed, Jul 3, 2013 at 9:36 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> wrote:
> > Hi Siddharth,
> > using a directory in the LOAD DATA command is completely valid. You can
> > find more information about the command in the Hive documentation [1]. I would