Hive >> mail # user >> Dynamic columns in Hive Table - Best Design for the problem


Dynamic columns in Hive Table - Best Design for the problem
Dear All Hive Group Members,

I have the following requirement.

Input:

Ticket#|Date of booking|Price
100|20-Oct-13|54
100|21-Oct-13|56
100|22-Oct-13|54
100|23-Oct-13|55
100|27-Oct-13|60
100|30-Oct-13|47

101|10-Sep-13|12
101|13-Sep-13|14
101|20-Oct-13|6

Expected Output:

Ticket#|Initial|Delta1|Delta2|Delta3|Delta4|Delta5
100|20-Oct-13,54|21-Oct-13,2|22-Oct-13,0|23-Oct-13,1|27-Oct-13,6|30-Oct-13,-7
101|10-Sep-13,12|13-Sep-13,2|20-Oct-13,-6|||

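The number of columns in the expected output is dynamic: it depends on how many price changes a ticket has. Each delta is the price on that date minus the ticket's initial price (for example, 22-Oct-13 gives 54 - 54 = 0).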
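
To make the requirement concrete, here is a minimal sketch of the transformation in plain Python rather than Hive. It is only an illustration of the expected result, not a proposed Hive design. It assumes each delta is measured against the initial price (which is what the sample output shows) and that dates are always in DD-Mon-YY form; the names `pivot_price_deltas` and `sort_key` are made up for this example.

```python
from collections import defaultdict

# Month abbreviations mapped to numbers, so dates sort without locale settings.
MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def sort_key(date_text):
    # "20-Oct-13" -> (13, 10, 20), so bookings sort chronologically.
    day, month, year = date_text.split("-")
    return (int(year), MONTHS[month], int(day))


def pivot_price_deltas(lines):
    """Turn 'ticket|date|price' lines into one wide row per ticket."""
    bookings = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines in the input
        ticket, date_text, price = line.strip().split("|")
        bookings[ticket].append((date_text, int(price)))
    if not bookings:
        return []

    rows = {}
    for ticket, entries in bookings.items():
        entries.sort(key=lambda e: sort_key(e[0]))
        first_date, first_price = entries[0]
        # First cell holds the initial price; later cells hold the
        # difference from that initial price (assumption, see above).
        cells = [f"{first_date},{first_price}"]
        cells += [f"{d},{p - first_price}" for d, p in entries[1:]]
        rows[ticket] = cells

    # The widest ticket decides how many Delta columns the header needs.
    width = max(len(cells) for cells in rows.values())
    header = ["Ticket#", "Initial"] + [f"Delta{i}" for i in range(1, width)]
    out = ["|".join(header)]
    for ticket in sorted(rows):
        cells = rows[ticket]
        padding = [""] * (width - len(cells))  # empty cells for shorter tickets
        out.append("|".join([ticket] + cells + padding))
    return out


sample = [
    "100|20-Oct-13|54", "100|21-Oct-13|56", "100|22-Oct-13|54",
    "100|23-Oct-13|55", "100|27-Oct-13|60", "100|30-Oct-13|47",
    "101|10-Sep-13|12", "101|13-Sep-13|14", "101|20-Oct-13|6",
]
for row in pivot_price_deltas(sample):
    print(row)
```

The point of the sketch is that the header width is only known after all tickets have been read, which is exactly the part I am unsure how to express in a Hive table.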

1) What is the best design to solve the above problem in Hive? 
2) How do we implement it?

Please advise.

Regards,
Raj