Hive >> mail # user >> Partitioning External table


Re: Partitioning External table
Can you try using:
location 'dt=1/engine'
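
Spelled out in full, that would be something along these lines. This is only a sketch: it reuses the table name, partition value, and paths from the quoted message, and the DROP is there on the assumption that the dt='1' partition registered earlier has to be removed before it can be re-added with a new location. Whether a relative location is resolved against the table location may depend on the Hive version, so the absolute form is shown as a commented-out alternative.

```sql
-- Assumption: dt='1' is already registered with the wrong location, so remove it first.
ALTER TABLE tpartitions DROP PARTITION (dt='1');

-- Re-add the partition, pointing it at the path that holds the sequence file.
ALTER TABLE tpartitions ADD PARTITION (dt='1') LOCATION 'dt=1/engine';

-- Alternative with an absolute path, if the relative form is not resolved as expected:
-- ALTER TABLE tpartitions ADD PARTITION (dt='1') LOCATION '/user/training/partitions/dt=1/engine';
```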

Cheers

On Wed, Dec 29, 2010 at 1:12 AM, David Ginzburg <[EMAIL PROTECTED]> wrote:

>  Hi,
> Thank you for the reply.
> I tried:
> ALTER TABLE tpartitions ADD PARTITION (dt='1') LOCATION '/user/training/partitions/';
> SHOW PARTITIONS tpartitions;
>
> OK
> dt=1
>
>
> But when I try to issue a select query, I get the following error:
>
> hive> select count(value) from tpartitions where dt='1';
> Total MapReduce jobs = 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Job Submission failed with exception 'java.io.FileNotFoundException(File
> does not exist: hdfs://localhost:8022/user/training/partitions/dt=1/data)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.ExecDriver
>
> Why is it looking for a data file when my sequence file is located at
> /user/training/partitions/dt=1/engine, according to the partition
>
> > Date: Tue, 28 Dec 2010 11:25:50 -0500
> > Subject: Re: Partitioning External table
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
>
> >
> > On Tue, Dec 28, 2010 at 9:41 AM, David Ginzburg <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > I am trying to test the creation of an external table using partitions.
> > > My files on HDFS are:
> > >
> > > /user/training/partitions/dt=2/engine
> > > /user/training/partitions/dt=2/engine
> > >
> > > The engine files are sequence files, which I managed to create externally
> > > and query when I was not using partitions.
> > >
> > > When I create the table with partitions using:
> > > hive> CREATE EXTERNAL TABLE tpartitions(value STRING) PARTITIONED BY (dt STRING)
> > > STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat'
> > > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> > > LOCATION '/user/training/partitions';
> > > OK
> > > Time taken: 0.067 seconds
> > >
> > > show partitions tpartitions;
> > > OK
> > > Time taken: 0.084 seconds
> > > hive> select * from tpartitions;
> > > OK
> > > Time taken: 0.139 seconds
> > >
> > > Can someone point out what I am doing wrong here?
> > >
> >
> > You need to explicitly add the partitions to the table. The location
> > specified for the partition will be appended to the location of the
> > table.
> >
> > http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Add_Partitions
> >
> > Something like this:
> > alter table tpartitions add partition (dt='2') location 'dt=2/engine';
> > alter table tpartitions add partition (dt='3') location 'dt=3/engine';
>
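
To see which path Hive actually resolved for a partition, the partition metadata can be inspected before running the query. A minimal sketch, assuming the tpartitions table from this thread and a dt='2' partition added as in the quoted reply; the location should appear in the detailed partition information that DESCRIBE EXTENDED prints.

```sql
-- List the partitions registered in the metastore for the table.
SHOW PARTITIONS tpartitions;

-- Print the metadata of one partition, including its storage location.
DESCRIBE EXTENDED tpartitions PARTITION (dt='2');

-- Query only that partition once the location looks right.
SELECT count(value) FROM tpartitions WHERE dt='2';
```

If the printed location does not match the directory that holds the sequence file, that mismatch would explain a FileNotFoundException like the one quoted above.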