Sqoop >> mail # user >> Zero rows imported while doing Mysql to Hive import

Re: Zero rows imported while doing Mysql to Hive import
Hi Jarek,

I have not re-configured Hive. I am using the default
settings/locations. I am using --hive-home to tell sqoop where to find

Here are the locations of my sqoop, Hive and Hadoop instances.
Hadoop:    /root/siddharth/tools/hadoop-1.1.2
Hive:    /root/siddharth/tools/hive-0.11.0-bin
Sqoop:    /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0
Here are a few more details after running it with --verbose.

I am using the following command to import into Hive:
ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop
import --connect jdbc:mysql://localhost/ClassicModels -table Customers
-m 1 --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import
--verbose --mysql-delimiters

Verbose output of the above command:

After running this command, here is what I see in Hive and HDFS:

HDFS
====
ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls
Found 2 items
-rw-r--r--   1 root supergroup          0 2013-07-04 00:41
-rw-r--r--   1 root supergroup      15569 2013-07-04 00:41
Hive (I am running Hive from its own directory so metadata should be accessible)
==========================================================
ssk01:~/siddharth/tools/hive-0.11.0-bin # ./bin/hive

Logging initialized using configuration in
Hive history file=/tmp/root/[EMAIL PROTECTED]
hive> show databases;
Time taken: 8.035 seconds, Fetched: 1 row(s)

hive> use default;
Time taken: 0.018 seconds

hive> show tables;
Time taken: 4.175 seconds

The strange thing is that a table named default.customers doesn't exist
in Hive, even though the Sqoop output mentioned it.
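
As an aside on the earlier external-table problem quoted below (rows present, but every value NULL), here is a small Python sketch of the kind of parsing mismatch Jarek suspects. It is only an illustration under assumptions: the sample row, the three-column layout, and the helper names split_row and to_int are hypothetical and not taken from my data or from Hive's code. It relies on two facts: Hive's default field delimiter for text tables is Ctrl-A (\x01), while --mysql-delimiters makes Sqoop write comma-separated fields.

```python
# Sketch only: mimics how a delimited text row is cut into columns.
# The sample row and the column count are hypothetical, not from the thread.

HIVE_DEFAULT_DELIM = "\x01"   # Ctrl-A, Hive's default field delimiter for text tables
SQOOP_MYSQL_DELIM = ","       # field delimiter written by --mysql-delimiters


def split_row(line, delim, num_columns):
    """Split a line on delim; columns with no data become None (NULL)."""
    fields = line.rstrip("\n").split(delim)[:num_columns]
    return fields + [None] * (num_columns - len(fields))


def to_int(value):
    """Convert to int, yielding None (NULL) when the text is not a number."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


sample = "103,'Atelier graphique',44000\n"

# Table declared without a matching delimiter: the whole line lands in column 1.
wrong = split_row(sample, HIVE_DEFAULT_DELIM, 3)
# Table declared with a comma as the field terminator: the columns line up.
right = split_row(sample, SQOOP_MYSQL_DELIM, 3)

print(to_int(wrong[0]), wrong[1], wrong[2])          # None None None
print(to_int(right[0]), right[1], to_int(right[2]))  # 103 'Atelier graphique' 44000
```

If this is what happened with the external table, declaring the table with a field terminator that matches the files should make the columns line up. The naive split above does not handle commas inside quoted values, so it is not a full model of either tool.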

On Wed, Jul 3, 2013 at 9:36 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
> Hi Siddharth,
> Using a directory in the LOAD DATA command is completely valid. You can find more information about the command in the Hive documentation [1]. Since you are able to see the rows, just with incorrect values, I suspect your issue has more to do with parsing the data than with accessing it.
> Jarcec
> Links:
> 1: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
> On Wed, Jul 03, 2013 at 05:11:47PM +0530, Siddharth Karandikar wrote:
>> Hi,
>> While looking into Hive history file, I found this query.
>> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
>> TABLE `Customers`"
>> QUERY_ID="root_20130703050909_882c2484-e1c8-43a3-9eff-dd0f296fc560"
>> .....
>> The HDFS location mentioned in this query is a directory, not a CSV file.
>> This directory contains the part-* file(s) which hold actual data. I
>> don't know if Sqoop understands this directory structure and knows how
>> to read those multiple part-* files? Or is this an issue?
>> I was hit by a similar thing while creating an external table in Hive
>> whose location was such an HDFS directory (generated by a Sqoop
>> import) containing multiple part-* files. The Hive table got created,
>> but all the rows were NULL. That's why I started looking into the
>> --hive-import option available in Sqoop. But it looks like that is
>> also not working for me.
>> Am I missing something?
>> Thanks,
>> Siddharth
>> On Wed, Jul 3, 2013 at 4:55 PM, Siddharth Karandikar
>> <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> >
>> > I am facing some problems while importing a sample database from MySQL
>> > to Hive using Sqoop 1.4.3, Hive 0.11.0 and Hadoop 1.1.2 on a single
>> > node setup.
>> >
>> > While doing this, I always see the following message in the job logs -
>> > Table default.customers stats: [num_partitions: 0, num_files: 2,
>> > num_rows: 0, total_size: 15556, raw_data_size: 0]
>> >
>> > Job ends with success message -
>> > 13/07/03 05:09:30 INFO hive.HiveImport: Time taken: 0.74 seconds