Re: Combine multiple row values based upon a condition.
If you really only need to consider adjacent rows, it might just be easier
to write a UDF or use streaming, where your code remembers the last record
seen and emits a new record if you want to do the join with the current
record.
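
A rough sketch of the streaming idea, in Python. Nothing here comes from the
original thread beyond the sample data: the function names (combine_adjacent,
group_offsets, run), the tab-separated "id, name, offset" input layout, and
the rule that two names are adjacent when the second starts at most one
character after the first one ends are all assumptions made for illustration.
It also assumes the rows arrive sorted by offset in a single stream, and it
drops rows that do not pair up (such as "Klaas"), since the desired output
only lists the combined name.

```python
import sys
from collections import OrderedDict


def combine_adjacent(rows, max_gap=1):
    """Yield (fullname, offset) for consecutive rows that sit next to each other.

    rows: iterable of (name, offset) tuples, assumed sorted by offset.
    Two rows are combined when the gap between the end of the previous
    name and the start of the current one is at most max_gap characters.
    """
    prev = None  # the last record seen that has not been paired yet
    for name, offset in rows:
        if prev is not None:
            prev_name, prev_offset = prev
            gap = offset - (prev_offset + len(prev_name))
            if 0 <= gap <= max_gap:
                yield prev_name + " " + name, prev_offset
                prev = None  # the current row is consumed by the pair
                continue
        prev = (name, offset)


def group_offsets(pairs):
    """Collect the offsets of each combined name, keeping first-seen order."""
    grouped = OrderedDict()
    for fullname, offset in pairs:
        grouped.setdefault(fullname, []).append(offset)
    return grouped


def run(stdin, stdout):
    """Read 'id<TAB>name<TAB>offset' lines and write 'fullname<TAB>offsets'."""
    rows = []
    for line in stdin:
        line = line.rstrip("\n")
        if not line:
            continue
        _id, name, offset = line.split("\t")
        rows.append((name, int(offset)))
    for fullname, offsets in group_offsets(combine_adjacent(rows)).items():
        stdout.write("%s\t%s\n" % (fullname, ",".join(str(o) for o in offsets)))

# When used as a streaming script, finish with: run(sys.stdin, sys.stdout)
```

Hive can pipe rows through a script like this with a TRANSFORM ... USING
clause. The rows would have to reach a single instance of the script in
offset order, otherwise adjacent rows could be split across processes and
never be compared.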

On Sat, Feb 2, 2013 at 1:21 PM, Martijn van Leeuwen <[EMAIL PROTECTED]> wrote:

> Hi all,
> I am new to Apache Hive and I am doing some tests to see if it fits my needs.
> One of the questions I have is whether it is possible to "peek" at the next
> row in order to find out if the values should be combined. Let me explain
> with an example.
> Let's say my data looks like this:
> Id  name     offset
> 1   Jan      100
> 2   Janssen  104
> 3   Klaas    150
> 4   Jan      160
> 5   Janssen  164
> And my output to another table should be this:
> Id  fullname     offsets
> 1   Jan Janssen  [ 100, 160 ]
> I would like to combine the name values from two rows where the offsets of
> the two rows are no more than 1 character apart.
> Is this type of data manipulation possible and, if it is, could someone
> point me in the right direction, hopefully with some explanation?
> Kind regards
> Martijn
*Dean Wampler, Ph.D.*