Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Creating partitions causes Error in semantic analysis


Copy link to this message
-
RE: Creating partitions causes Error in semantic analysis
Hi Yongqiang,
 I am using the trunk code. I figured out what the problem was
INSERT OVERWRITE TABLE newtable
PARTITION (newdatestamp, myregion, myproperty)
SELECT
name,
age,
datestamp as newdatestamp,
region as myregion,
property as myproperty,
from oldtable where datestamp='20100525';

I need to specify the last 3 columns in the order of partitions, which I did not.

Meanwhile the dynamic partitioning produced a partition which was named "__HIVE_DEFAULT_PARTITION__". Is this created by default?

Thanks again for your help.
Viraj

-----Original Message-----
From: yongqiang he [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 16, 2010 5:46 PM
To: [EMAIL PROTECTED]
Subject: Re: Creating partitions causes Error in semantic analysis

Hive supports dynamic partition ( i think you need to use trunk code
for this feature.?).

here is an example:

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;

create table if not exists nzhang_part1 like srcpart;
create table if not exists nzhang_part2 like srcpart;
describe extended nzhang_part1;

from srcpart
insert overwrite table nzhang_part1 partition (ds, hr) select key,
value, ds, hr where ds <= '2008-04-08'
insert overwrite table nzhang_part2 partition(ds='2008-12-31', hr)
select key, value, hr where ds > '2008-04-08';

On Wed, Jun 16, 2010 at 4:07 PM, Viraj Bhat <[EMAIL PROTECTED]> wrote:
> Hi all,
>
>   I have a table known as "oldtable" which is partitioned by datestamp.
>
>
>
> The schema of the "oldtable" is:
>
>
>
> name string
>
> age bigint
>
> property string
>
> region string
>
> datestamp string
>
>
>
>
>
> I now need to create a new table which is based of this old table and
> partitioned by (datestamp, region, property)
>
>
>
> The DDL for the new table looks like:
>
>
>
> CREATE EXTERNAL TABLE newtable
>
> (
>
> newname string,
>
> newage bigint,
>
> )
>
>
>
> PARTITIONED BY (newdatestamp STRING, myregion STRING, myproperty STRING)
>
>
>
> STORED AS RCFILE
>
> LOCATION '/user/viraj/rcfile;
>
>
>
>
>
> When I try to populate this new table from my old table, I try to use
> partitioning which uses values of old columns.
>
>
>
> INSERT OVERWRITE TABLE newtable
>
> PARTITION (newdatestamp='20100525', region, property)
>
> SELECT
>
>   name,
>
>   age
>
> from oldtable where datestamp='20100525';
>
>
>
> The above statement causes an error and expects hardcoded values for region
> and property.
>
>
>
> FAILED: Error in semantic analysis: Partition column in the partition
> specification does not exist.
>
>
>
> How do I specify the partition information such that the new tables, takes
> values from "property" and "region" from the old table and uses it as
> partitions.
>
>
>
> Is there a better way to achieve the above instead of hard coding values for
> each and every partition?
>
>
>
> ======================================================>
> Addendum: If the above is possible, how can I define some conditions where I
> need to say, If region is not "us" or "asia", put it in another partition
> known as misc?
>
> ======================================================>
>
>
>
>
> Thanks Viraj