Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Exploding Array of Struct in Hive


Copy link to this message
-
Re: Exploding Array of Struct in Hive
You basically need to do a nested-explode here.

SELECT user_id, product_id, prod_and_ts_split FROM (
SELECT  * FROM (select user_id, prod_and_ts.product_id as product_id,
prod_and_ts.timestamps as timestamps FROM testingtable2 LATERAL VIEW
explode(purchased_item) exploded_table as prod_and_ts) prod_and_ts
) tt2 LATERAL VIEW explode(split(timestamps, "#")) exploded_table2 as
prod_and_ts_split;

Thanks,
Vijay

On Wed, Jul 18, 2012 at 2:24 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
> CREATE TABLE IF NOT EXISTS TestingTable2
>
> (
>
> USER_ID BIGINT,
>
> PURCHASED_ITEM ARRAY<STRUCT<PRODUCT_ID: BIGINT,TIMESTAMPS:STRING>>
>
> ) ROW FORMAT
>
>  DELIMITED FIELDS TERMINATED BY '-'
>
>  collection items terminated by ','
>
>  map keys terminated by ':'
>
>  LINES TERMINATED BY '\n'
>
>  STORED AS TEXTFILE
>
>  LOCATION '/user/rjamal/output2';
>
>
>
> Below is the data in TestingTable2
>
>
>
> 1345653-110909316904:1341894546,221065796761:1341887508
>
>
>
> I can explode the above data by using this below query and it works fine for
> above data-
>
>
>
> SELECT  * FROM (select user_id, prod_and_ts.product_id as product_id,
> prod_and_ts.timestamps as timestamps FROM testingtable2 LATERAL VIEW
> explode(purchased_item) exploded_table as prod_and_ts) prod_and_ts;
>
>
>
> And I will get output like this which is fine-
>
>
>
> 1345653                                110909316904     1341894546
>
> 1345653                                221065796761     1341887508
>
>
>
> But in some cases I have data in the table like this, timestamp appended by
> pound sign for same product_id-
>
>
>
> 1345653-110909316904:1341894546#1341885695,221065796761:1341887508#1341885453
>
>
>
> And I need output like this for above data using the HiveQL query-
>
>
>
> 1345653                                110909316904     1341894546
>
> 1345653                                110909316904    1341885695
>
> 1345653                                221065796761     1341887508
>
> 1345653                                221065796761    1341885453
>
>
>
> Is this possible to do this somehow?
>
>
> Any suggestions will be appreciated.
>
>
>
> Raihan Jamal
>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB