Hive, mail # user - hi all

Re: hi all
Bejoy KS 2012-06-26, 08:14
Hi Shaik

At first glance: since you are using a dynamic partition insert, the partition column must be the last column in the SELECT query used in the INSERT OVERWRITE.
Modify your insert as follows:

INSERT OVERWRITE TABLE vender_part PARTITION (order_date) SELECT vender,supplier,quantity,order_date  FROM vender;

Once the insert job is complete, verify your partitions.

You can view the partitions of any table using

SHOW PARTITIONS <TableName>;
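
Put together, the steps above might look like the sketch below. It assumes dynamic partitioning is not already enabled in the session, so it sets the two standard Hive properties first; the table and column names and the date value are the ones from the original post. Treat it as an illustration under those assumptions rather than a tested fix for this dataset.

```sql
-- Assumption: dynamic partitioning is not yet enabled in this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict mode lets every partition column be dynamic (no static one required).
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column (order_date) comes last in the SELECT list.
INSERT OVERWRITE TABLE vender_part PARTITION (order_date)
SELECT vender, supplier, quantity, order_date
FROM vender;

-- List the partitions the insert created.
SHOW PARTITIONS vender_part;

-- Read back a single partition, using the date from the original post.
SELECT * FROM vender_part WHERE order_date = '2012-03-07';
```

If SHOW PARTITIONS lists entries such as order_date=2012-03-07, the filter in the last query should read only that partition's directory.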
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: shaik ahamed <[EMAIL PROTECTED]>
Date: Tue, 26 Jun 2012 13:12:28
Subject: hi all

Hi Users,
I created a Hive table with the syntax below:

CREATE EXTERNAL TABLE vender_part (vender string, supplier string, quantity int)
PARTITIONED BY (order_date string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

I then inserted 100 GB of data with the command below:

vender,supplier,order_date,quantity  FROM vender;
I then get the output below:

Vendor_1 Supplier_111 2012-03-07 4240   NULL    NULL
Vendor_1 Supplier_112 2012-03-07 1237   NULL    NULL
Vendor_1 Supplier_113 2012-03-07 2970   NULL    NULL
Vendor_1 Supplier_114 2012-03-07 4652   NULL    NULL
Vendor_1 Supplier_115 2012-03-07 7414   NULL    NULL
Vendor_1 Supplier_116 2012-03-07 2334   NULL    NULL
Vendor_1 Supplier_117 2012-03-07 10522  NULL    NULL
Vendor_1 Supplier_118 2012-03-07 1776   NULL    NULL
Vendor_1 Supplier_119 2012-03-07 8344   NULL    NULL
Vendor_1 Supplier_120 2012-03-07 10362  NULL    NULL
Vendor_1 Supplier_121 2012-03-07 4579   NULL    NULL
Vendor_1 Supplier_122 2012-03-07 8020   NULL    NULL
Vendor_1 Supplier_123 2012-03-07 3520   NULL    NULL
Vendor_1 Supplier_124 2012-03-07 9124   NULL    NULL

Please tell me whether the above output is correct, why the two
columns are NULL, and why there is a column with __HIVE_DEFAULT_PARTITION__.

Also, when I select from the partitioned table, retrieving the data
should take less time than before partitioning, right? That is not
happening for me.

Time taken for 100 GB of data: 2192.416 seconds

3. If I select from the partitioned table by order_date, I do not get any data.

select * from vender_part where order_date='2012-03-07';

hive> select * from vender_part  where order_date='2012-03-07';
Time taken: 2.801 seconds

Please reply to my questions above and help me go further, with a clear
explanation of what output to expect when we create the Hive table.
And why am I not getting the data for the partitioned table if I select the
Thanks in advance