Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Combine multiple row values based upon a condition.


Copy link to this message
-
Re: Combine multiple row values based upon a condition.
If you really only need to consider adjacent rows, it might just be easier
to write a UDF or use streaming, where your code remembers the last record
seen and emits a new record if you want to do the join with the current
record.

On Sat, Feb 2, 2013 at 1:21 PM, Martijn van Leeuwen <[EMAIL PROTECTED]>wrote:

> Hi all,
>
> I new to Apache Hive and I am doing some test to see if it fits my needs,
> one of the questions I have if it is possible to "peek" for the next row in
> order to find out if the values should be combined. Let me explain by an
> example.
>
> Let say my data looks like this
>
> Id name offset
> 1 Jan 100
> 2 Janssen 104
> 3 Klaas 150
> 4 Jan 160
> 5 Janssen 164
>
> An my output to another table should be this
>
> Id fullname offsets
> 1 Jan Janssen [ 100, 160 ]
>
> I would like to combine the name values from two rows where the offset of
> the two rows are no more then 1 character apart.
>
> Is this type of data manipulation is possible and if it is could someone
> point me to the right direction hopefully with some explaination?
>
> Kind regards
> Martijn
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB