Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Dynamic columns in Hive Table - Best Design for the problem


Copy link to this message
-
Re: Dynamic columns in Hive Table - Best Design for the problem
Matt,

Thanks for the suggestion. Can you please provide more details on what type of UDAF should I develop ? I have never worked on a UDAF earlier. But would like to explore it. Any tips on how to proceed.

Thanks,
Raj

On Saturday, December 28, 2013 2:47 PM, Matt Tucker <[EMAIL PROTECTED]> wrote:
 
It looks like you're essentially doing a pivot function. Your best bet is to write a custom UDAF or look at the windowing functions available in recent releases.
Matt
On Dec 28, 2013 12:57 PM, "Raj Hadoop" <[EMAIL PROTECTED]> wrote:

Dear All Hive Group Members,
>
>
>I have the following requirement.
>
>
>Input:
>
>
>Ticket#|Date of booking|Price
>100|20-Oct-13|54
>
>100|21-Oct-13|56
>100|22-Oct-13|54
>100|23-Oct-13|55
>100|27-Oct-13|60
>100|30-Oct-13|47
>
>
>101|10-Sep-13|12
>101|13-Sep-13|14
>101|20-Oct-13|6
>
>
>
>
>Expected Output:
>
>
>Ticket#|Initial|Delta1|Delta2|Delta3|Delta4|Delta5
>100|20-Oct-13,54|21-Oct-13,2|22-Oct-13,0|23-Oct-3,1|27-Oct-13,6|30-Oct-13,-7
>101|10-Sep-13,12|13-Sep-13,2|20-Oct-13,-6|||
>
>
>The number of columns in the expected output is a dynamic list depending on the number of price changes of a ticket.
>
>
>1) What is the best design to solve the above problem in Hive? 
>2) How do we implement it?
>
>
>Please advise.
>
>
>Regards,
>Raj
>
>
>
>
>
>
>
>
>
>