|
|
-
question..partitioned table...
Ronak Bhatt 2010-08-21, 01:41
for the following create table, I'm getting the error : FAILED: Error in semantic analysis: Column repeated in partitioning columns... any suggestion? thanks, ronak create table callbacks_part ( ID INT, BOOKMARK_STEP_ID INT, CLIENT_ID INT, TXN_ID STRING, SESSION_ID STRING, IP_ADDRESS STRING, METHOD STRING, PAGE_ID INT, RULE_NAME_ID INT, CALLBACK_TYPE_ID INT, PAGE_ELEMENT_ID INT, CALLBACK_SENSE_ID INT, CALLBACK_INFO STRING, NUM_OCCURANCES INT, UNIQUE_ID STRING, EXEC_DATE STRING, EXEC_DURATION FLOAT ) PARTITIONED BY(EXEC_DATE STRING) CLUSTERED BY(PAGE_ID) SORTED BY(PAGE_ID) INTO 32 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; *thanks, ronak* * * *408 504 4847* *My Blog : http://ronakbaps.posterous.com** * * *
-
Re: question..partitioned table...
Dilip Joseph 2010-08-21, 02:34
You don't need to specify "EXEC_DATE STRING," in your column definition as it is already specified by "PARTITIONED BY(EXEC_DATE STRING) Dilip On Fri, Aug 20, 2010 at 6:41 PM, Ronak Bhatt <[EMAIL PROTECTED]> wrote: > for the following create table, I'm getting the error : FAILED: Error in > semantic analysis: Column repeated in partitioning columns... > > any suggestion? > > thanks, ronak > > > create table callbacks_part > ( > ID INT, > BOOKMARK_STEP_ID INT, > CLIENT_ID INT, > TXN_ID STRING, > SESSION_ID STRING, > IP_ADDRESS STRING, > METHOD STRING, > PAGE_ID INT, > RULE_NAME_ID INT, > CALLBACK_TYPE_ID INT, > PAGE_ELEMENT_ID INT, > CALLBACK_SENSE_ID INT, > CALLBACK_INFO STRING, > NUM_OCCURANCES INT, > UNIQUE_ID STRING, > EXEC_DATE STRING, > EXEC_DURATION FLOAT > ) > PARTITIONED BY(EXEC_DATE STRING) > CLUSTERED BY(PAGE_ID) SORTED BY(PAGE_ID) INTO 32 BUCKETS > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > > > > > > *thanks, ronak* > * > * > *408 504 4847* > *My Blog : http://ronakbaps.posterous.com*> * > * > * > * > > -- _________________________________________ Dilip Antony Joseph http://csgrad.blogspot.comhttp://www.marydilip.info
-
Re: question..partitioned table...
Ronak Bhatt 2010-08-21, 02:58
I'm from oracle/sql background...the thinking there (in oracle) is - 1. one has a table with columns (c1, c2, c3) 2. define the partition on one of the columns (or combination of more than one) say c2 3. When data is inserted, it will be stored in right partition based on the value of c2. 4. when table is queries, if the WHERE predicate has partition column (c2 in this case), optimizer will go to that partition without user taking any specific action (e.g. without user specifying the partition) whereas, In the scenario below that I mentioned, I've a text file that has one of the fields as exec_date ....after some more research, I also learned that in HIVE partition cannot be on existing column of defined table e.g. I've to have a column name that is not part of table.. What I don't understand is that how do I write the queries so that I can utilize the partition if the partition column is not part of the table definition.. Any inputs would be greatly appreciated *thanks, ronak* * * *408 504 4847* *My Blog : http://ronakbaps.posterous.com** * * * On Fri, Aug 20, 2010 at 7:34 PM, Dilip Joseph <[EMAIL PROTECTED] > wrote: > You don't need to specify "EXEC_DATE STRING," in your column definition as > it is already specified by "PARTITIONED BY(EXEC_DATE STRING) > > Dilip > > > On Fri, Aug 20, 2010 at 6:41 PM, Ronak Bhatt <[EMAIL PROTECTED]> wrote: > >> for the following create table, I'm getting the error : FAILED: Error in >> semantic analysis: Column repeated in partitioning columns... >> >> any suggestion? >> >> thanks, ronak >> >> >> create table callbacks_part >> ( >> ID INT, >> BOOKMARK_STEP_ID INT, >> CLIENT_ID INT, >> TXN_ID STRING, >> SESSION_ID STRING, >> IP_ADDRESS STRING, >> METHOD STRING, >> PAGE_ID INT, >> RULE_NAME_ID INT, >> CALLBACK_TYPE_ID INT, >> PAGE_ELEMENT_ID INT, >> CALLBACK_SENSE_ID INT, >> CALLBACK_INFO STRING, >> NUM_OCCURANCES INT, >> UNIQUE_ID STRING, >> EXEC_DATE STRING, >> EXEC_DURATION FLOAT >> ) >> PARTITIONED BY(EXEC_DATE STRING) >> CLUSTERED BY(PAGE_ID) SORTED BY(PAGE_ID) INTO 32 BUCKETS >> ROW FORMAT DELIMITED >> FIELDS TERMINATED BY ',' >> STORED AS TEXTFILE; >> >> >> >> >> >> *thanks, ronak* >> * >> * >> *408 504 4847* >> *My Blog : http://ronakbaps.posterous.com*>> * >> * >> * >> * >> >> > > > -- > _________________________________________ > Dilip Antony Joseph > http://csgrad.blogspot.com> http://www.marydilip.info>
-
Re: question..partitioned table...
Dilip Joseph 2010-08-21, 07:45
The partition column is automatically included as part of the table definition, and hence can be used like any other column. So you can issue queries like SELECT c1, c2, c3 from callback_parts WHERE c2='foo'; for table defined as follows: create table callbacks_part ( c1 INT, c3 STRING ) PARTITIONED BY(c2 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; Dilip On Fri, Aug 20, 2010 at 7:58 PM, Ronak Bhatt <[EMAIL PROTECTED]> wrote: > I'm from oracle/sql background...the thinking there (in oracle) is - > > 1. one has a table with columns (c1, c2, c3) > 2. define the partition on one of the columns (or combination of more than > one) say c2 > 3. When data is inserted, it will be stored in right partition based on the > value of c2. > 4. when table is queries, if the WHERE predicate has partition column (c2 > in this case), optimizer will go to that partition without user taking any > specific action (e.g. without user specifying the partition) > > whereas, > > In the scenario below that I mentioned, I've a text file that has one of > the fields as exec_date ....after some more research, I also learned that in > HIVE partition cannot be on existing column of defined table e.g. I've to > have a column name that is not part of table.. > > What I don't understand is that how do I write the queries so that I can > utilize the partition if the partition column is not part of the table > definition.. > > Any inputs would be greatly appreciated > > > > *thanks, ronak* > * > * > *408 504 4847* > *My Blog : http://ronakbaps.posterous.com*> * > * > * > * > > > > On Fri, Aug 20, 2010 at 7:34 PM, Dilip Joseph < > [EMAIL PROTECTED]> wrote: > >> You don't need to specify "EXEC_DATE STRING," in your column definition as >> it is already specified by "PARTITIONED BY(EXEC_DATE STRING) >> >> Dilip >> >> >> On Fri, Aug 20, 2010 at 6:41 PM, Ronak Bhatt <[EMAIL PROTECTED]> wrote: >> >>> for the following create table, I'm getting the error : FAILED: Error in >>> semantic analysis: Column repeated in partitioning columns... >>> >>> any suggestion? >>> >>> thanks, ronak >>> >>> >>> create table callbacks_part >>> ( >>> ID INT, >>> BOOKMARK_STEP_ID INT, >>> CLIENT_ID INT, >>> TXN_ID STRING, >>> SESSION_ID STRING, >>> IP_ADDRESS STRING, >>> METHOD STRING, >>> PAGE_ID INT, >>> RULE_NAME_ID INT, >>> CALLBACK_TYPE_ID INT, >>> PAGE_ELEMENT_ID INT, >>> CALLBACK_SENSE_ID INT, >>> CALLBACK_INFO STRING, >>> NUM_OCCURANCES INT, >>> UNIQUE_ID STRING, >>> EXEC_DATE STRING, >>> EXEC_DURATION FLOAT >>> ) >>> PARTITIONED BY(EXEC_DATE STRING) >>> CLUSTERED BY(PAGE_ID) SORTED BY(PAGE_ID) INTO 32 BUCKETS >>> ROW FORMAT DELIMITED >>> FIELDS TERMINATED BY ',' >>> STORED AS TEXTFILE; >>> >>> >>> >>> >>> >>> *thanks, ronak* >>> * >>> * >>> *408 504 4847* >>> *My Blog : http://ronakbaps.posterous.com*>>> * >>> * >>> * >>> * >>> >>> >> >> >> -- >> _________________________________________ >> Dilip Antony Joseph >> http://csgrad.blogspot.com>> http://www.marydilip.info>> > > -- _________________________________________ Dilip Antony Joseph http://csgrad.blogspot.comhttp://www.marydilip.info
-
Re: question..partitioned table...
Ronak Bhatt 2010-08-21, 10:46
Thanks Dilip...this is useful...it is something different for a normal RDBMS experience... Is there any way to update the documentation with such explanation and may be an example? *thanks, ronak* * * On Sat, Aug 21, 2010 at 12:45 AM, Dilip Joseph < [EMAIL PROTECTED]> wrote: > The partition column is automatically included as part of the table > definition, and hence can be used like any other column. > > So you can issue queries like SELECT c1, c2, c3 from callback_parts WHERE > c2='foo'; for table defined as follows: > > create table callbacks_part > ( > c1 INT, > c3 STRING > ) > PARTITIONED BY(c2 STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > > Dilip > > > On Fri, Aug 20, 2010 at 7:58 PM, Ronak Bhatt <[EMAIL PROTECTED]> wrote: > >> I'm from oracle/sql background...the thinking there (in oracle) is - >> >> 1. one has a table with columns (c1, c2, c3) >> 2. define the partition on one of the columns (or combination of more than >> one) say c2 >> 3. When data is inserted, it will be stored in right partition based on >> the value of c2. >> 4. when table is queries, if the WHERE predicate has partition column (c2 >> in this case), optimizer will go to that partition without user taking any >> specific action (e.g. without user specifying the partition) >> >> whereas, >> >> In the scenario below that I mentioned, I've a text file that has one of >> the fields as exec_date ....after some more research, I also learned that in >> HIVE partition cannot be on existing column of defined table e.g. I've to >> have a column name that is not part of table.. >> >> What I don't understand is that how do I write the queries so that I can >> utilize the partition if the partition column is not part of the table >> definition.. >> >> Any inputs would be greatly appreciated >> >> >> >> *thanks, ronak* >> * >> * >> *408 504 4847* >> *My Blog : http://ronakbaps.posterous.com*>> * >> * >> * >> * >> >> >> >> On Fri, Aug 20, 2010 at 7:34 PM, Dilip Joseph < >> [EMAIL PROTECTED]> wrote: >> >>> You don't need to specify "EXEC_DATE STRING," in your column definition >>> as it is already specified by "PARTITIONED BY(EXEC_DATE STRING) >>> >>> Dilip >>> >>> >>> On Fri, Aug 20, 2010 at 6:41 PM, Ronak Bhatt <[EMAIL PROTECTED]>wrote: >>> >>>> for the following create table, I'm getting the error : FAILED: Error in >>>> semantic analysis: Column repeated in partitioning columns... >>>> >>>> any suggestion? >>>> >>>> thanks, ronak >>>> >>>> >>>> create table callbacks_part >>>> ( >>>> ID INT, >>>> BOOKMARK_STEP_ID INT, >>>> CLIENT_ID INT, >>>> TXN_ID STRING, >>>> SESSION_ID STRING, >>>> IP_ADDRESS STRING, >>>> METHOD STRING, >>>> PAGE_ID INT, >>>> RULE_NAME_ID INT, >>>> CALLBACK_TYPE_ID INT, >>>> PAGE_ELEMENT_ID INT, >>>> CALLBACK_SENSE_ID INT, >>>> CALLBACK_INFO STRING, >>>> NUM_OCCURANCES INT, >>>> UNIQUE_ID STRING, >>>> EXEC_DATE STRING, >>>> EXEC_DURATION FLOAT >>>> ) >>>> PARTITIONED BY(EXEC_DATE STRING) >>>> CLUSTERED BY(PAGE_ID) SORTED BY(PAGE_ID) INTO 32 BUCKETS >>>> ROW FORMAT DELIMITED >>>> FIELDS TERMINATED BY ',' >>>> STORED AS TEXTFILE; >>>> >>>> >>>> >>>> >>>> >>>> *thanks, ronak* >>>> * >>>> * >>>> *408 504 4847* >>>> *My Blog : http://ronakbaps.posterous.com*>>>> * >>>> * >>>> * >>>> * >>>> >>>> >>> >>> >>> -- >>> _________________________________________ >>> Dilip Antony Joseph >>> http://csgrad.blogspot.com>>> http://www.marydilip.info>>> >> >> > > > -- > _________________________________________ > Dilip Antony Joseph > http://csgrad.blogspot.com> http://www.marydilip.info>
|
|