Hive >> mail # user >> Array index support non-constant expression


java8964 java8964 2012-12-11, 22:24
RE: Array index support non-constant expression

OK.
I followed the Hive source code for org.apache.hadoop.hive.ql.udf.generic.GenericUDFArrayContains and wrote the UDF. It is quite simple.
It works as I expected in simple cases, but when I run it in a more complex query, the Hive MR job fails with strange errors. The failure happens inside the Hive code base; from the stack trace, I cannot see that it has anything to do with my custom code.
I would appreciate it if someone could tell me what went wrong.
For example, I created a UDF called darray, which stands for dynamic array. It supports a non-constant value as the index into the array.
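To make the intended behaviour concrete, here is a minimal sketch in plain Java of the lookup that a UDF like darray would perform for each row. It is not the poster's code: the class name DynamicArrayIndex and the method elementAt are hypothetical, the zero-based index and the NULL result for a missing or out-of-range index are assumptions modelled on Hive's built-in array[n] operator, and a real implementation would wrap this logic in a GenericUDF and read its arguments through ObjectInspectors.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the poster's actual UDF.
final class DynamicArrayIndex {
    private DynamicArrayIndex() {}

    // Returns array[index], where index is computed per row rather than
    // being a constant. Assumption: null inputs and out-of-range indexes
    // yield null instead of throwing.
    static <T> T elementAt(List<T> array, Integer index) {
        if (array == null || index == null) {
            return null;
        }
        if (index < 0 || index >= array.size()) {
            return null;
        }
        return array.get(index);
    }

    public static void main(String[] args) {
        // Stand-ins for search_results and for the per-click index values.
        List<String> searchResults = Arrays.asList("POI", "ADDRESS", "POI");
        int[] clickedRanks = {0, 2, 5};
        for (int rank : clickedRanks) {
            System.out.println(rank + " -> " + elementAt(searchResults, rank));
        }
    }
}
```

Returning null rather than throwing keeps a single bad index from failing the whole map task, at the cost of hiding malformed data.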
The following query works as I expected:
hive> select c_poi.provider_str as provider_str, c_poi.name as name from (select darray(search_results, c.index_loc) as c_poi from search_table lateral view explode(search_clicks) clickTable as c) a limit 5;
POI                         xxxx
ADDRESS               some address
POI                        xxxx
POI                        xxxx
ADDRESSS             some address
Of course, in this case I only want the rows with provider_str = 'POI' returned, and any rows with provider_str != 'POI' filtered out. That sounds simple, so I changed the query to the following:
hive> select c_poi.provider_str as provider_str, c_poi.name as name from (select darray(search_results, c.rank) as c_poi from search_table lateral view explode(search_clicks) clickTable as c) a where c_poi.provider_str = 'POI' limit 5;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 178314025) is larger than hive.exec.mode.local.auto.inputbytes.max (= 134217728)
Starting Job = job_201212031001_0100, Tracking URL = http://blevine-desktop:50030/jobdetails.jsp?jobid=job_201212031001_0100
Kill Command = /home/yzhang/hadoop/bin/hadoop job  -Dmapred.job.tracker=blevine-desktop:8021 -kill job_201212031001_0100
2012-12-12 11:45:24,090 Stage-1 map = 0%,  reduce = 0%
2012-12-12 11:45:43,173 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201212031001_0100 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
I only added a WHERE clause, but to my surprise the MR job generated by Hive failed. I am testing this on my local standalone cluster, which runs the CDH3U3 release. When I checked the Hadoop userlog, here is what I got:
2012-12-12 11:40:22,421 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<_col0:bigint,_col1:string,_col2:string,_col3:string,_col4:string,_col5:string,_col6:boolean,_col7:boolean,_col8:boolean,_col9:boolean,_col10:boolean,_col11:boolean,_col12:string,_col13:string,_col14:struct<lat:double,lon:double,query_text_raw:string,query_text_normalized:string,query_string:string,llcountry:string,ipcountry:string,request_cnt:int,address:struct<country:string,state:string,zip:string,city:string,street:string,house:string>,categories_id:array<int>,categories_name:array<string>,lang_raw:string,lang_rose:string,lang:string,viewport:struct<top_lat:double,left_lon:double,bottom_lat:double,right_lon:double>>,_col15:struct<versions:int,physical_host:string,nose_request_id:string,client_type:string,ip:int,time_taken:int,user_agent:string,http_host:string,http_referrer:string,http_status:smallint,http_size:int,accept_language:string,md5:string,datacenter:string,tlv_map_data_version:string,tlv_devide_software_version:string,csid:int,rid:string,xncrid:string,cbfn:string,sources:array<struct<tm:bigint,tm_date:string,tm_time:string,md5:string,time_taken:int>>>,_col16:array<struct<provider_str:string,name:string,lat:double,lon:double,dyn:boolean,authoritative:boolean,search_center:boolean>>,_col17:array<struct<rank:int,action:int,tm:bigint,event:string,is_csid:boolean,is_rid:boolean,is_pbapi:boolean,is_nac:boolean>>,_col18:string,_col19:struct<rank:int,action:int,tm:bigint,event:string,is_csid:boolean,is_rid:boolean,is_pbapi:boolean,is_nac:boolean>>
2012-12-12 11:40:22,440 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessor
java8964 java8964 2012-12-12, 18:28
Navis류승우 2012-12-13, 00:06
java8964 java8964 2012-12-13, 01:43
Navis류승우 2012-12-13, 04:46