Hive user mailing list: Data Loaded but Select returns nothing!


Re: Data Loaded but Select returns nothing!
Hi Kuldeep

The syntax doesn't look right to me. To load data manually into
partitions you need one LOAD statement per partition, so two statements
for two partitions:
LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHive_Alabama.csv' INTO
TABLE crime_managed_native PARTITION (State='Alabama');
LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHive_California.csv' INTO
TABLE crime_managed_native PARTITION (State='California');

Here the data for California and Alabama has to be in two separate
files, and each file needs to be loaded into its respective partition.
If the data is mixed up in one file, then you need to use a dynamic
partition insert.

Load the source data into a non-partitioned table first, and from there
load it into the actual partitioned table with a dynamic partition insert.

https://cwiki.apache.org/Hive/dynamicpartitions.html

A sample scenario is here
http://kickstarthadoop.blogspot.in/2011/06/how-to-speed-up-your-hive-queries-in.html

To ensure that both partitioning and bucketing work seamlessly in your
case, load the source data into a normal non-partitioned table, then
enable the required properties and load from there into the final
partitioned, bucketed table.
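
As a rough sketch of the flow described above: the staging table name
crime_staging is hypothetical, and the column order assumes that each CSV
row carries the five data columns followed by the state name, which the
thread does not confirm. The SET properties are the ones Hive releases of
this period use for dynamic partitioning and bucketing.

```sql
-- Sketch only. crime_staging is a hypothetical table name, and the column
-- order assumes the CSV ends with the state name; adjust to the real file.
USE learn;

CREATE TABLE IF NOT EXISTS crime_staging (
  NoState String,
  TypeofCrime String,
  Crime String,
  Year int,
  Count int,
  State String)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- LOAD only copies the file into the table directory; no rows are parsed
-- or routed at this point.
LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHive.csv' INTO TABLE crime_staging;

-- Let Hive derive the partition value from the data itself.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Needed because the target table is CLUSTERED BY ... SORTED BY ...
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;

-- The dynamic partition column (State) must come last in the SELECT list.
INSERT OVERWRITE TABLE crime_managed_native PARTITION (State)
SELECT NoState, TypeofCrime, Crime, Year, Count, State
FROM crime_staging;
```

Running SHOW PARTITIONS crime_managed_native; afterwards should list one
partition per distinct state found in the file, not only Alabama and
California.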

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Kuldeep Chitrakar <[EMAIL PROTECTED]>
Date: Mon, 30 Jul 2012 08:02:16
To: [EMAIL PROTECTED]<[EMAIL PROTECTED]>; [EMAIL PROTECTED]<[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: RE: Data Loaded but Select returns nothing!

Hi Bejoy,

I modified the load command as below:

LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHive.csv' INTO TABLE crime_managed_native PARTITION (State='Alabama',State='California');

Now the data is loaded. However, when I issue the command

SELECT * FROM crime_managed_native WHERE State='Alabama'

no records are returned (I do have Alabama records in the source file), whereas

SELECT * FROM crime_managed_native WHERE State='California'

returns only California records.

Does that mean only California records got inserted into the table? But I see that the entire file is stored under /user/hive/warehouse/learn.db/crime_managed_native/State=California

and there is no directory for State=Alabama.

Also, what happens to the rest of the records, which do not have Alabama or California as their state?

Do we have any documentation that covers partitioning in detail?

Thanks,
Kuldeep

From: Bejoy KS [mailto:[EMAIL PROTECTED]]
Sent: 30 July 2012 17:45
To: [EMAIL PROTECTED]
Subject: Re: Data Loaded but Select returns nothing!

Kuldeep

A couple of things I noticed here:

Your table is bucketed. When you load data into a bucketed table you need to enable

SET hive.enforce.bucketing=true;

Bucketing needs a MapReduce job, so you need to load the non-bucketed data into a normal table and from there load it into the bucketed table using 'INSERT OVERWRITE'.

Then another quick nit:
Your table is partitioned, so you need to load your data into some partition, but you have not specified a partition in the LOAD statement.
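
Put together, the two points might look like the sketch below for a single,
explicitly named partition. It assumes a hypothetical non-bucketed,
non-partitioned table called crime_staging that already holds the raw CSV
rows and exposes the state as an ordinary column named State; neither the
table name nor that column layout comes from the thread.

```sql
-- Sketch only: crime_staging is a hypothetical plain table that already
-- holds the raw CSV rows, with State as a regular column.
SET hive.enforce.bucketing=true;

-- INSERT runs a MapReduce job, so rows are hashed on Crime into the 8
-- buckets declared on the target table.
-- The partition is named explicitly, so State is filtered on, not selected.
INSERT OVERWRITE TABLE crime_managed_native PARTITION (State='Alabama')
SELECT NoState, TypeofCrime, Crime, Year, Count
FROM crime_staging
WHERE State = 'Alabama';
```

A plain LOAD DATA would only move the file under the partition directory,
without splitting it into buckets or checking which state each row belongs to.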
Regards
Bejoy KS

Sent from handheld, please excuse typos.
________________________________
From: Kuldeep Chitrakar <[EMAIL PROTECTED]>
Date: Mon, 30 Jul 2012 06:58:33 -0500
To: [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
Subject: Data Loaded but Select returns nothing!

Hi

I am trying to load a CSV file into a Hive table.

Everything works fine, but when I fire a "select * from tablename" command, it does not return anything.

--Create Table

CREATE TABLE IF NOT EXISTS learn.crime_managed_native (
NoState String,
TypeofCrime String,
Crime String,
Year int,
Count int)
PARTITIONED BY (State String)
CLUSTERED BY (Crime) SORTED BY (Year ASC) INTO 8 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

--Load Data

LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHive.csv' INTO TABLE crime_managed_native;

What could be the possible issue?

Thanks,
Kuldeep