Re: Hive Query having virtual column INPUT__FILE__NAME in where clause gives exception
Jitendra,
I am really not sure you can use virtual columns in a where clause. (I have
never tried it, so I may be wrong as well.)

Can you try executing your query as below?

select count(*), filename
from (select INPUT__FILE__NAME as filename from netflow) tmp
where filename = 'vzb.1351794600.0'
group by filename;

Please check the query syntax; I am giving an idea and have not verified
the query.
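One more thing worth checking (an assumption on my part, not something I
have verified): your own group-by output below shows INPUT__FILE__NAME as
the full HDFS URI (hdfs://192.168.0.224:9000/data/jk/vzb/vzb.1351794600.0),
not the bare file name, so an exact match on 'vzb.1351794600.0' may return
zero rows even once the filter works. Matching on the tail of the path
might be safer:

```sql
-- Hypothetical sketch: INPUT__FILE__NAME carries the full URI, so
-- compare against the end of the path rather than the bare file name.
select count(*)
from (select INPUT__FILE__NAME as filename from netflow) tmp
where filename like '%/vzb.1351794600.0';
```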
On Fri, Jun 14, 2013 at 4:57 PM, Jitendra Kumar Singh <
[EMAIL PROTECTED]> wrote:

> Hi Guys,
>
> Executing a Hive query with a filter on the virtual column
> INPUT__FILE__NAME results in the following exception.
>
> hive> select count(*) from netflow where INPUT__FILE__NAME='vzb.1351794600.0';
>
> FAILED: SemanticException java.lang.RuntimeException: cannot find field
> input__file__name from
> [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1d264bf5,
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d44d0c6
> ,
>
> .
>
> .
>
> .
>
>
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@7e6bc5aa
> ]
>
> This error is different from the one we get when the column name is wrong:
>
> hive> select count(*) from netflow where INPUT__FILE__NAM='vzb.1351794600.0';
>
> FAILED: SemanticException [Error 10004]: Line 1:35 Invalid table alias or
> column reference 'INPUT__FILE__NAM': (possible column names are: first,
> last, ....)
>
> But using this virtual column in the select clause works fine:
>
> hive> select INPUT__FILE__NAME from netflow group by INPUT__FILE__NAME;
>
> Total MapReduce jobs = 1
>
> Launching Job 1 out of 1
>
> Number of reduce tasks not specified. Estimated from input data size: 4
>
> In order to change the average load for a reducer (in bytes):
>
>   set hive.exec.reducers.bytes.per.reducer=<number>
>
> In order to limit the maximum number of reducers:
>
>   set hive.exec.reducers.max=<number>
>
> In order to set a constant number of reducers:
>
>   set mapred.reduce.tasks=<number>
>
> Starting Job = job_201306041359_0006, Tracking URL =
> http://192.168.0.224:50030/jobdetails.jsp?jobid=job_201306041359_0006
>
> Kill Command = /opt/hadoop/bin/../bin/hadoop job  -kill
> job_201306041359_0006
>
> Hadoop job information for Stage-1: number of mappers: 12; number of
> reducers: 4
>
> 2013-06-14 18:20:10,265 Stage-1 map = 0%,  reduce = 0%
>
> 2013-06-14 18:20:33,363 Stage-1 map = 8%,  reduce = 0%
>
> .
>
> .
>
> .
>
> 2013-06-14 18:21:15,554 Stage-1 map = 100%,  reduce = 100%
>
> Ended Job = job_201306041359_0006
>
> MapReduce Jobs Launched:
>
> Job 0: Map: 12  Reduce: 4   HDFS Read: 3107826046 HDFS Write: 55 SUCCESS
>
> Total MapReduce CPU Time Spent: 0 msec
>
> OK
>
> hdfs://192.168.0.224:9000/data/jk/vzb/vzb.1351794600.0
>
> Time taken: 78.467 seconds
>
> I am trying to create an external Hive table on data already present in
> HDFS, and the folder contains extra files that I want to ignore. This is
> similar to what is asked and suggested in the following Stack Overflow
> questions: how to make hive take only specific files as input from hdfs
> folder <http://stackoverflow.com/questions/16844758/how-to-make-hive-take-only-specific-files-as-input-from-hdfs-folder>
> and when creating an external table in hive can I point the location to
> specific files in a directory? <http://stackoverflow.com/questions/11269203/when-creating-an-external-table-in-hive-can-i-point-the-location-to-specific-fil>
>
> Any help would be appreciated. The full stack trace I am getting is as
> follows:
>
> 2013-06-14 15:01:32,608 ERROR ql.Driver
> (SessionState.java:printError(401)) - FAILED: SemanticException
> java.lang.RuntimeException: cannot find field input__
>
> org.apache.hadoop.hive.ql.parse.SemanticException:
> java.lang.RuntimeException: cannot find field input__file__name from
> [org.apache.hadoop.hive.serde2.object
>
>         at
> org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:122)
>
>         at
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)

Nitin Pawar