Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Dynamic columns in Hive Table - Best Design for the problem


Copy link to this message
-
Dynamic columns in Hive Table - Best Design for the problem
Dear All Hive Group Members,

I have the following requirement.

Input:

Ticket#|Date of booking|Price
100|20-Oct-13|54

100|21-Oct-13|56
100|22-Oct-13|54
100|23-Oct-13|55
100|27-Oct-13|60
100|30-Oct-13|47

101|10-Sep-13|12
101|13-Sep-13|14
101|20-Oct-13|6
Expected Output:

Ticket#|Initial|Delta1|Delta2|Delta3|Delta4|Delta5
100|20-Oct-13,54|21-Oct-13,2|22-Oct-13,0|23-Oct-3,1|27-Oct-13,6|30-Oct-13,-7
101|10-Sep-13,12|13-Sep-13,2|20-Oct-13,-6|||

The number of columns in the expected output is a dynamic list depending on the number of price changes of a ticket.

1) What is the best design to solve the above problem in Hive? 
2) How do we implement it?

Please advise.

Regards,
Raj