Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive, mail # user - Best way to create a view with index from explode ?

Viraj Bhat 2010-06-29, 21:05
Paul Yang 2010-06-29, 22:00
Copy link to this message
RE: Best way to create a view with index from explode ?
Viraj Bhat 2010-06-30, 00:05
Hi Paul,

 Thanks for your inputs. I was looking at the Generic explode UDTF in
your piggybank in the contrib. directory.

Here are questions:

1)       Does the explode have a contract that if we take in an array of
maps we return only the maps, not maps and another column (index) with

2)       Second do UDTF's have some restriction that I cannot have a
WHERE clause or select other columns. I want to write something like

select  bcookie, Myexplode(info) as (newcol, index) from myoldtable
where bcookie is not null;

3)       For adding the additional index column, do I need to override
the following method  [[public void process(Object[] o) throws
HiveException ]]

4)       Do you have other examples of writing UDTF's

Any insights would be appreciated.




From: Paul Yang [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 29, 2010 3:01 PM
Subject: RE: Best way to create a view with index from explode ?


Yes, the UDTF explode() is probably the easiest way to get multiple rows
from an array. Another way would be to write your own custom script and
use the transform clause. To generate the index of the info, you'll need
the field() UDF, but there is currently an issue with it:





From: Viraj Bhat [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 29, 2010 2:06 PM
Subject: Best way to create a view with index from explode ?


Hi all,

I have an input table which has 3 columns one of which is an array list,


bcookie, id, info [Schema: string, string,


Here is a sample row


1245, 1, {[myid#id2, myage#100], [myid#id3, myage#101], [myid#id3,


I wanted to explode the arraylist column and create an additional column
which represents the index of the info


bcookie, id, info, index

1245, 1, [myid#id2, myage#100], 0

1245, 1, [myid#id3, myage#101], 1

1245, 1, [myid#id3, myage#102] , 2


I was thinking of the lines of:

SELECT * FROM srctable LATERAL VIEW explode(info)... ;


Is UDTF the only way to achieve this?

Thanks Viraj


Paul Yang 2010-06-30, 23:38