Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive, mail # user - Best way to create a view with index from explode ?


+
Viraj Bhat 2010-06-29, 21:05
+
Paul Yang 2010-06-29, 22:00
Copy link to this message
-
RE: Best way to create a view with index from explode ?
Viraj Bhat 2010-06-30, 00:05
Hi Paul,

 Thanks for your inputs. I was looking at the Generic explode UDTF in
your piggybank in the contrib. directory.

Here are questions:

1)       Does the explode have a contract that if we take in an array of
maps we return only the maps, not maps and another column (index) with
it.

2)       Second do UDTF's have some restriction that I cannot have a
WHERE clause or select other columns. I want to write something like
this:

select  bcookie, Myexplode(info) as (newcol, index) from myoldtable
where bcookie is not null;

3)       For adding the additional index column, do I need to override
the following method  [[public void process(Object[] o) throws
HiveException ]]

4)       Do you have other examples of writing UDTF's

Any insights would be appreciated.

Viraj

 

________________________________

From: Paul Yang [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 29, 2010 3:01 PM
To: [EMAIL PROTECTED]
Subject: RE: Best way to create a view with index from explode ?

 

Yes, the UDTF explode() is probably the easiest way to get multiple rows
from an array. Another way would be to write your own custom script and
use the transform clause. To generate the index of the info, you'll need
the field() UDF, but there is currently an issue with it:

 

https://issues.apache.org/jira/browse/HIVE-1385

 

 

From: Viraj Bhat [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 29, 2010 2:06 PM
To: [EMAIL PROTECTED]
Subject: Best way to create a view with index from explode ?

 

Hi all,

I have an input table which has 3 columns one of which is an array list,

 

bcookie, id, info [Schema: string, string,
arraylist<map<string,string>>]

 

Here is a sample row

 

1245, 1, {[myid#id2, myage#100], [myid#id3, myage#101], [myid#id3,
myage#102]}

 

I wanted to explode the arraylist column and create an additional column
which represents the index of the info

 

bcookie, id, info, index

1245, 1, [myid#id2, myage#100], 0

1245, 1, [myid#id3, myage#101], 1

1245, 1, [myid#id3, myage#102] , 2

 

I was thinking of the lines of:

SELECT * FROM srctable LATERAL VIEW explode(info)... ;

 

Is UDTF the only way to achieve this?

Thanks Viraj

 

+
Paul Yang 2010-06-30, 23:38