Chunky Gupta
2012-11-05, 11:34
Dean Wampler
2012-11-05, 12:58
Chunky Gupta
2012-11-05, 14:19
Dean Wampler
2012-11-05, 14:33
Edward Capriolo
2012-11-05, 15:56
Mark Grover
2012-11-05, 17:38
Chunky Gupta
2012-11-06, 13:14
Mark Grover
2012-11-06, 17:08
Chunky Gupta
2012-11-07, 05:55
Mark Grover
2012-11-07, 06:28
Chunky Gupta
2012-11-07, 06:33
Mark Grover
2012-11-07, 06:52
Chunky Gupta
2012-11-07, 13:13
-
Alter table is giving error
Chunky Gupta 2012-11-05, 11:34
Hi,

I have a cluster set up on EC2 with Hadoop version 0.20.2 and Hive version 0.8.1 (I configured everything myself). I created a table using:

CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';

Now I am trying to recover the partitions using:

ALTER TABLE XXX RECOVER PARTITIONS;

but I am getting this error:

"FAILED: Parse Error: line 1:12 cannot recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"

The same steps work fine on a cluster set up on EMR with Hadoop version 1.0.3 and Hive version 0.8.1 (configured by EMR).

So is this a version issue, or am I missing some configuration changes in the EC2 setup? I have not been able to find an exact solution for this problem on the internet. Please help me.

Thanks,
Chunky.
-
Re: Alter table is giving error
Dean Wampler 2012-11-05, 12:58
RECOVER PARTITIONS is an enhancement added by Amazon to their version of Hive.

http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html

<shameless-plug>
Chapter 21 of Programming Hive discusses this feature and other aspects of using Hive in EMR.
</shameless-plug>

dean

--
Dean Wampler, Ph.D.
thinkbiganalytics.com
+1-312-339-1330
-
Re: Alter table is giving error
Chunky Gupta 2012-11-05, 14:19
Hi Dean,

I had a Hadoop and Hive cluster on EMR, with S3 storage containing logs that are updated daily and partitioned by date (dt), and I was using RECOVER PARTITIONS there. Now I want to move to EC2 and run my own Hadoop and Hive cluster. What is the alternative to RECOVER PARTITIONS in this case, if you have any idea?

One way I found is to add the partition for each date individually, so I would have to write a script that does this for all dates. Is there an easier way than this?

Thanks,
Chunky
-
Re: Alter table is giving error
Dean Wampler 2012-11-05, 14:33
Writing a script to add the external partitions individually is the only way I know of.

Sent from my rotary phone.
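
A minimal sketch of the kind of script Dean describes, in Python. It only generates one ALTER TABLE ... ADD PARTITION statement per day; it does not talk to Hive. The table name `logs`, the partition column `dt`, and the `dt=YYYY-MM-DD` directory layout under the table location are assumptions for illustration, since the thread hides the real names behind XXX and ZZZ.

```python
from datetime import date, timedelta


def add_partition_statements(table, base_location, start, end):
    """Yield one ALTER TABLE ... ADD PARTITION statement per day in [start, end].

    Assumes a single partition column named dt and directories named
    dt=YYYY-MM-DD under base_location (both hypothetical here).
    """
    base = base_location.rstrip("/")
    day = start
    while day <= end:
        dt = day.isoformat()  # formats as YYYY-MM-DD
        yield (
            f"ALTER TABLE {table} ADD PARTITION (dt='{dt}') "
            f"LOCATION '{base}/dt={dt}/';"
        )
        day += timedelta(days=1)


if __name__ == "__main__":
    # Hypothetical table name and date range; the bucket path is the
    # placeholder used in the original question.
    for stmt in add_partition_statements(
        "logs", "s3://my-location/data/", date(2012, 11, 5), date(2012, 11, 7)
    ):
        print(stmt)
```

The printed statements could be saved to a file and run with `hive -f`, or pasted into the Hive console.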
-
Re: Alter table is giving error
Edward Capriolo 2012-11-05, 15:56
Recover partitions should work the same way for different file systems.

Edward
-
Re: Alter table is giving error
Mark Grover 2012-11-05, 17:38
Chunky,

I have used the "recover partitions" command on EMR, and that worked fine.

However, take a look at https://issues.apache.org/jira/browse/HIVE-874. It seems the msck command in Apache Hive does the same thing. Try it out and let us know how it goes.

Mark
-
Re: Alter table is giving error
Chunky Gupta 2012-11-06, 13:14
Hi Mark,

I tried msck, but it is not working for me. I have written a Python script to add the partitions individually.

Thank you Edward, Mark and Dean.
Chunky.
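
Chunky's script is not shown in the thread. As a rough illustration of one piece such a script might need, the sketch below pulls the distinct `dt` values out of a list of object paths, so that a partition is added only for dates that actually have data. The path layout (`.../dt=YYYY-MM-DD/...`) and the sample file names are assumptions, not taken from the thread.

```python
import re

# Matches a dt=YYYY-MM-DD path component (hypothetical layout).
PARTITION_RE = re.compile(r"/dt=(\d{4}-\d{2}-\d{2})(?:/|$)")


def partition_values(paths):
    """Return the sorted, de-duplicated dt values found in the given paths."""
    found = set()
    for path in paths:
        match = PARTITION_RE.search(path)
        if match:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    # Made-up listing; in practice this could come from a storage listing.
    listing = [
        "s3://my-location/data/dt=2012-11-05/part-00000",
        "s3://my-location/data/dt=2012-11-05/part-00001",
        "s3://my-location/data/dt=2012-11-06/part-00000",
        "s3://my-location/data/_logs/history",
    ]
    print(partition_values(listing))
```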
-
Re: Alter table is giving error
Mark Grover 2012-11-06, 17:08
Glad to hear, Chunky.

Out of curiosity, what errors did you get when using msck?
-
Re: Alter table is giving error
Chunky Gupta 2012-11-07, 05:55
Hi Mark,

I didn't get any error. I ran this on the Hive console:

"msck table Table_Name;"

It said OK and showed the execution time as 1.050 sec. But when I checked the partitions for the table using

"show partitions Table_Name;"

it didn't show me any partitions.

Thanks,
Chunky.
-
Re: Alter table is giving error
Mark Grover 2012-11-07, 06:28
Chunky,

You should have run:

msck repair table <Table name>;

Sorry, I should have made that clear in my last reply. I have added an entry to the Hive wiki for the benefit of others:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions

Mark
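
To make the difference between the two commands concrete: as described in this thread, plain msck only checks the table, while msck repair table is the form meant to register missing partitions. Conceptually the check is a set difference between the partitions present on storage and those known to the metastore. The Python below is only a sketch of that idea with made-up data; it is not how Hive implements msck.

```python
def partitions_missing_from_metastore(on_storage, in_metastore):
    """Return partitions that exist on storage but are not registered.

    Both arguments are iterables of partition specs such as 'dt=2012-11-05'.
    """
    return sorted(set(on_storage) - set(in_metastore))


if __name__ == "__main__":
    # Hypothetical state: two directories exist, none are registered yet.
    on_storage = ["dt=2012-11-05", "dt=2012-11-06"]
    in_metastore = []
    for spec in partitions_missing_from_metastore(on_storage, in_metastore):
        print("not in metastore:", spec)
```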
-
Re: Alter table is giving error
Chunky Gupta 2012-11-07, 06:33
Hi Mark,

Sorry, I forgot to mention: I have also tried

msck repair table <Table name>;

and I got the same output as from plain msck. Do I need any other settings for this to work? I set up Hadoop and Hive from scratch on EC2.

Thanks,
Chunky.
-
Re: Alter table is giving error
Mark Grover 2012-11-07, 06:52
Chunky,
I just tried it myself. It turns out that the directory you are adding as a partition has to be empty for msck repair to work. This is obviously sub-optimal, and there is a JIRA in place (https://issues.apache.org/jira/browse/HIVE-3231) to fix it.

So, I'd suggest you keep an eye out for the next version for that fix to come in. In the meantime, run msck after you create your partition directory but before you populate it with data.

Mark
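For partitions whose directories already contain data, the scripted approach discussed earlier in the thread (adding each partition explicitly) still applies. Below is a minimal sketch of such a script, not the one Chunky actually wrote. It assumes the partition column is `dt` (as mentioned in the thread) and that the data sits in directories named `dt=YYYY-MM-DD` under the table location; the function name, the directory layout and the date range are hypothetical and should be adjusted to the real setup.

```python
from datetime import date, timedelta


def add_partition_statements(table, base_location, start, end):
    """Build one ALTER TABLE ... ADD PARTITION statement per day in [start, end]."""
    base = base_location.rstrip("/")
    statements = []
    day = start
    while day <= end:
        dt = day.isoformat()  # formatted as YYYY-MM-DD
        # Assumed layout: <base>/dt=<date>/ ; change this if the S3 paths differ.
        statements.append(
            "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION (dt='{d}') "
            "LOCATION '{b}/dt={d}/';".format(t=table, d=dt, b=base)
        )
        day += timedelta(days=1)
    return statements


if __name__ == "__main__":
    # Table name and location are the placeholders from the original post;
    # the date range is made up for illustration.
    for stmt in add_partition_statements(
        "XXX", "s3://my-location/data/", date(2012, 11, 5), date(2012, 11, 7)
    ):
        print(stmt)
```

The printed statements can be saved to a file and run with `hive -f`. Because of `IF NOT EXISTS`, re-running the script for an overlapping date range should skip partitions that are already registered.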
Mark Grover 2012-11-07, 06:52
-
Re: Alter table is giving error
Chunky Gupta 2012-11-07, 13:13
Okay Mark, I will be looking into this JIRA regularly.
Thanks again for helping.

Chunky.
Chunky Gupta 2012-11-07, 13:13
|