Sqoop >> mail # user >> Zero rows imported while doing Mysql to Hive import


Siddharth Karandikar 2013-07-03, 11:25
Jarek Jarcec Cecho 2013-07-03, 16:01
Siddharth Karandikar 2013-07-03, 11:41
Jarek Jarcec Cecho 2013-07-03, 16:06
Siddharth Karandikar 2013-07-04, 06:51
Siddharth Karandikar 2013-07-05, 13:47
Siddharth Karandikar 2013-07-05, 14:06
Han Sen Tey 2013-07-05, 14:23
Re: Zero rows imported while doing Mysql to Hive import
Hi Andy,
I have Sqoop installed in '/root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0'
and Hive in '/root/siddharth/tools/hive-0.11.0-bin'.

I used to run the following command from within Sqoop's directory:
./bin/sqoop  import --connect jdbc:mysql://localhost/ClassicModels -table
Customers -m 1 --hive-home /root/siddharth/tools/hive-0.11.0-bin
--hive-import --verbose --mysql-delimiters

This would create a temporary HDFS file from the given MySQL table and then
start Hive from that same directory to create the table and load the data.
Hive's default behavior is to look for metastore_db in the current directory
and, if it is not there, create and use a new one. The above command used to
run without errors. So I would then go into Hive's directory
'/root/siddharth/tools/hive-0.11.0-bin', run hive to check the table newly
created by Sqoop, and never find it there. The issue was that when Hive is
started from its own directory, it again checks for the existence of
metastore_db, which is not there (as it was created in Sqoop's directory), so
it creates a new one that has no reference to the previously created table.
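
The proper fix is to point Hive at one fixed metastore location instead of
the current directory. A minimal sketch of what that could look like in
Hive's conf/hive-site.xml (the Derby path below is just an example; use
whatever absolute path suits your setup):

<configuration>
  <property>
    <!-- Absolute path, so every Hive/Sqoop run shares the same metastore
         no matter which directory it is started from. -->
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/root/siddharth/metastore_db;create=true</value>
  </property>
</configuration>

With that in place, the hive-import and the interactive Hive shell both see
the same metastore_db.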

What could also have worked in this case is to copy metastore_db from
Sqoop's directory into Hive's directory after running the Sqoop command,
before starting the Hive shell.
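
For example, something along these lines (paths as above):

# Copy the Derby metastore created by the Sqoop run into Hive's directory
cp -r /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0/metastore_db \
      /root/siddharth/tools/hive-0.11.0-bin/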
-
Siddharth

On Fri, Jul 5, 2013 at 7:53 PM, Han Sen Tey <[EMAIL PROTECTED]> wrote:

> Good day Siddharth,
>
>    Can you kindly elaborate further on what the problem was and how you
> resolved it (configure the location of metastore_db in Hive?)? Thank you.
>
> Regards,
> Andy
>
>
>   ------------------------------
>  From: Siddharth Karandikar <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Friday, July 5, 2013 10:06:27 AM
> Subject: Re: Zero rows imported while doing Mysql to Hive import
>
> Hi Jarek,
>
> Problem solved. It was my error!
>
> I was rereading your response and looking closely into the Sqoop
> directory when I found a metastore_db directory there. Then I realized
> what you were talking about and what the problem was. My bad!
> As you mentioned, Sqoop was doing its work perfectly fine; it was Hive
> that did not know about the newly created table, due to the separate
> metastore_db.
>
> Now I have configured the location of metastore_db in Hive and
> everything works great!  :)
>
> Thanks Jarek.
>
>
> Regards,
> Siddharth
>
>
>
>
> On Fri, Jul 5, 2013 at 7:17 PM, Siddharth Karandikar <
> [EMAIL PROTECTED]> wrote:
>
> Hi,
>
> When I tried doing this import-mysql-into-hive in 3 separate steps, it
> worked!
>
> Here is what I ran -
>
> 1) Import MySQL table in HDFS with command:
> ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import
> --connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1
> --verbose --fields-terminated-by '\t' --lines-terminated-by '\n'
>
> 2) Create a table in Hive with this query (using the same query found in
> the --verbose log of the failing Sqoop run):
> hive> CREATE TABLE IF NOT EXISTS `Customers` ( `customerNumber` INT,
> `customerName` STRING, `contactLastName` STRING, `contactFirstName` STRING,
> `phone` STRING, `addressLine1` STRING, `addressLine2` STRING, `city`
> STRING, `state` STRING, `postalCode` STRING, `country` STRING,
> `salesRepEmployeeNumber` INT, `creditLimit` DOUBLE) COMMENT 'Imported by
> sqoop on 2013/07/04 00:41:14' ROW FORMAT DELIMITED FIELDS TERMINATED BY
> '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
> OK
> Time taken: 0.214 seconds
>
> 3) Load data from HDFS with this query (using the same query found in the
> --verbose log of the failing Sqoop run):
> hive> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
> TABLE `Customers`;
> Loading data to table default.customers
>
> Table default.customers stats: [num_partitions: 0, num_files: 2, num_rows:
> 0, total_size: 15556, raw_data_size: 0]
> OK
> Time taken: 0.317 seconds
>
> hive> select count(*) from Customers;
> .......
> Total MapReduce CPU Time Spent: 2 seconds 460 msec
> OK
> 122
> Time taken: 26.873 seconds, Fetched: 1 row(s)
>
>
> So the table got created and populated properly. I now wonder why it is
Jarek Jarcec Cecho 2013-07-08, 15:35
Jarek Jarcec Cecho 2013-07-08, 15:30