Hive, mail # user - Partitioning External table


Re: Partitioning External table
Ted Yu 2010-12-29, 14:56
Can you try using:
location 'dt=1/engine'
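
Spelled out in full, that suggestion might look like the sketch below. It assumes the dt='1' partition registered earlier in the thread has to be dropped before it can be re-added with a different location, and that a relative location is resolved against the table's location, as the quoted reply describes. Neither point is confirmed in this thread, and behavior may differ between Hive versions.

```sql
-- Assumption: the existing dt='1' partition must be removed first.
-- The table is EXTERNAL, so dropping the partition is expected to
-- leave the files on HDFS untouched.
ALTER TABLE tpartitions DROP PARTITION (dt='1');

-- Re-add the partition, pointing at the path that holds the sequence file.
ALTER TABLE tpartitions ADD PARTITION (dt='1') LOCATION 'dt=1/engine';

-- Check that the partition is registered and can be read.
SHOW PARTITIONS tpartitions;
SELECT count(value) FROM tpartitions WHERE dt='1';
```

If the relative path does not resolve as expected, an absolute location such as '/user/training/partitions/dt=1/engine' is the more conservative choice.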

Cheers

On Wed, Dec 29, 2010 at 1:12 AM, David Ginzburg <[EMAIL PROTECTED]> wrote:

> Hi,
> Thank you for the reply.
> I tried:
> ALTER TABLE tpartitions ADD PARTITION (dt='1') LOCATION
> '/user/training/partitions/';
> SHOW PARTITIONS tpartitions;
>
> OK
> dt=1
>
> But when I try to issue a select query, I get the following error:
>
> *hive>select count(value) from tpartitions where dt='1';
> Total MapReduce jobs = 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Job Submission failed with exception 'java.io.FileNotFoundException(File
> does not exist: hdfs://localhost:8022/user/training/partitions/dt=1/data)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.ExecDriver
>
> *Why is it looking for data file when my sequence file is located at
> /user/training/partitions/dt=1/engine, according to the partition
>
> > Date: Tue, 28 Dec 2010 11:25:50 -0500
> > Subject: Re: Partitioning External table
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
>
> >
> > On Tue, Dec 28, 2010 at 9:41 AM, David Ginzburg <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > I am trying to test creating an external table with partitions.
> > > My files on HDFS are:
> > >
> > > /user/training/partitions/dt=2/engine
> > > /user/training/partitions/dt=2/engine
> > >
> > > The engine files are sequence files, which I have managed to create
> > > externally and query when I have not used partitions.
> > >
> > > When I create the table with partitions using:
> > > hive> CREATE EXTERNAL TABLE tpartitions(value STRING)
> > > PARTITIONED BY (dt STRING)
> > > STORED AS INPUTFORMAT
> > > 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat' OUTPUTFORMAT
> > > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION
> > > '/user/training/partitions';
> > > OK
> > > Time taken: 0.067 seconds
> > >
> > > show partitions tpartitions;
> > > OK
> > > Time taken: 0.084 seconds
> > > hive> select * from tpartitions;
> > > OK
> > > Time taken: 0.139 seconds
> > >
> > > Can someone point out what I am doing wrong here?
> > >
> >
> > You need to explicitly add the partitions to the table. The location
> > specified for the partition will be appended to the location of the
> > table.
> >
> > http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Add_Partitions
> >
> > Something like this:
> > alter table tpartitions add partition (dt='2') location 'dt=2/engine';
> > alter table tpartitions add partition (dt='3') location 'dt=3/engine';
>