Hive, mail # user - Array index support non-constant expression


Re: Array index support non-constant expression
Navis류승우 2012-12-13, 00:06
Could you try it with CP/PPD disabled?

set hive.optimize.cp=false;
set hive.optimize.ppd=false;
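
For reference, hive.optimize.cp controls column pruning and hive.optimize.ppd controls
predicate pushdown. A quick way to test would be to set both in the same session and
re-run the failing query from the quoted report, e.g.:

set hive.optimize.cp=false;
set hive.optimize.ppd=false;

select c_poi.provider_str, c_poi.name
from (select darray(search_results, c.rank) as c_poi
      from nulf_search lateral view explode(search_clicks) clickTable as c) a
where c_poi.provider_str = 'POI';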

2012/12/13 java8964 java8964 <[EMAIL PROTECTED]>:
> Hi,
>
> I played with my query further, and found it very puzzling to explain the
> following behaviors:
>
> 1) The following query works:
>
> select c_poi.provider_str, c_poi.name from (select darray(search_results,
> c.rank) as c_poi from nulf_search lateral view explode(search_clicks)
> clickTable as c) a
>
> I get all the results from the above query without any problem.
>
> 2) The following query does NOT work:
>
> select c_poi.provider_str, c_poi.name from (select darray(search_results,
> c.rank) as c_poi from nulf_search lateral view explode(search_clicks)
> clickTable as c) a where c_poi.provider_str = 'POI'
>
> As soon as I add the where criteria on provider_str, or even add another
> level of subquery like the following:
>
> select
> ps, name
> from
> (select c_poi.provider_str as ps, c_poi.name as name from (select
> darray(search_results, c.rank) as c_poi from nulf_search lateral view
> explode(search_clicks) clickTable as c) a ) b
> where ps = 'POI'
>
> any kind of criteria I try to add on provider_str makes the hive MR jobs fail
> with the same error shown below.
>
> Any idea why this happened? Is it related to the data? But provider_str is
> just a simple String type.
>
> Thanks
>
> Yong
>
> ________________________________
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: RE: Array index support non-constant expression
> Date: Wed, 12 Dec 2012 12:15:27 -0500
>
>
> OK.
>
> I followed the hive source code of
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFArrayContains and wrote the
> UDF. It is quite simple.
>
> It works fine as I expected for simple cases, but when I try to run it in
> some complex queries, the hive MR jobs fail with some strange errors. What I
> mean is that it fails in the Hive code base; from the stack trace, I cannot see
> that this failure has anything to do with my custom code.
>
> I would appreciate it if someone could tell me what went wrong.
>
> For example, I created this UDF called darray, short for dynamic array,
> which supports a non-constant value as the index location of the array.
>
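> A minimal sketch of such a UDF against the standard GenericUDF API would look
> roughly like the following (the class and field names here are illustrative,
> not the actual darray source):
>
> import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
> import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
> import org.apache.hadoop.hive.ql.metadata.HiveException;
> import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
> import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
> import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
> import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
> import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
>
> // Illustrative sketch: return array[index] where index is any int expression.
> public class GenericUDFDArray extends GenericUDF {
>   private transient ListObjectInspector listOI;
>   private transient PrimitiveObjectInspector indexOI;
>
>   @Override
>   public ObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
>     if (args.length != 2) {
>       throw new UDFArgumentLengthException("darray(array, index) takes exactly two arguments");
>     }
>     listOI = (ListObjectInspector) args[0];
>     indexOI = (PrimitiveObjectInspector) args[1];
>     // The return type is whatever the array elements are (a struct in this thread).
>     return listOI.getListElementObjectInspector();
>   }
>
>   @Override
>   public Object evaluate(DeferredObject[] args) throws HiveException {
>     Object list = args[0].get();
>     Object index = args[1].get();
>     if (list == null || index == null) {
>       return null;
>     }
>     int i = PrimitiveObjectInspectorUtils.getInt(index, indexOI);
>     if (i < 0 || i >= listOI.getListLength(list)) {
>       return null;  // out-of-range index returns NULL instead of failing the row
>     }
>     return listOI.getListElement(list, i);
>   }
>
>   @Override
>   public String getDisplayString(String[] children) {
>     return "darray(" + children[0] + ", " + children[1] + ")";
>   }
> }
>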
> The following query works fine as I expected:
>
> hive> select c_poi.provider_str as provider_str, c_poi.name as name from
> (select darray(search_results, c.index_loc) as c_poi from search_table
> lateral view explode(search_clicks) clickTable as c) a limit 5;
> POI                         xxxx
> ADDRESS               some address
> POI                        xxxx
> POI                        xxxx
> ADDRESSS             some address
>
> Of course, in this case I only want rows with provider_str = 'POI' returned,
> filtering out any rows with provider_str != 'POI'. It sounds simple, so I
> changed the query to the following:
>
> hive> select c_poi.provider_str as provider_str, c_poi.name as name from
> (select darray(search_results, c.rank) as c_poi from search_table lateral
> view explode(search_clicks) clickTable as c) a where c_poi.provider_str = 'POI' limit 5;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Cannot run job locally: Input Size (= 178314025) is larger than
> hive.exec.mode.local.auto.inputbytes.max (= 134217728)
> Starting Job = job_201212031001_0100, Tracking URL = http://blevine-desktop:50030/jobdetails.jsp?jobid=job_201212031001_0100
> Kill Command = /home/yzhang/hadoop/bin/hadoop job
> -Dmapred.job.tracker=blevine-desktop:8021 -kill job_201212031001_0100
> 2012-12-12 11:45:24,090 Stage-1 map = 0%,  reduce = 0%
> 2012-12-12 11:45:43,173 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201212031001_0100 with errors
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
> I am only adding a WHERE condition, but to my surprise, the MR jobs generated
> by Hive failed. I am testing this in my local standalone cluster, which is