Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> how to combine some rows into 1 row in hive


Copy link to this message
-
Re: how to combine some rows into 1 row in hive
It's really more of a HiveQL trick than a SQL trick.  In SQL, one would express it as a subquery, which is more straightforward and readable, and has the additional advantage that you could eliminate the aggregation and assume P and N are not true simultaneously, which would nicely produce an error were that assumption ever violated.  HiveQL's lack of subqueries forces this trick to use aggregation to simulate a merge.
 

________________________________
 From: Stephen Sprague <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Sunday, June 23, 2013 1:20 PM
Subject: Re: how to combine some rows into 1 row in hive
  
so its not a Hive issue at all then instead its clever sql tricks.  good luck maintaining that and hopefully you got some comments in your code for the next sap who comes along and has the privilege of reading  it. :)  

On Sat, Jun 22, 2013 at 5:38 PM, zuohua zhang <[EMAIL PROTECTED]> wrote:

Thanks Michael! That worked without modification!
>
>
>
>On Sat, Jun 22, 2013 at 5:05 PM, Michael Malak <[EMAIL PROTECTED]> wrote:
>
>Or, the single-language (HiveQL) alternative might be (i.e. I haven't tested it):
>> 
>>select f1,
>>       f2,
>>       if(max(if(f3='P',f4,null)) is null,0,max(if(f3='P',f4,null))) pf4,       if(max(if(f3='P',f5,null)) is null,0,max(if(f3='P',f5,null))) pf5,
>>       if(max(if(f3='N',f4,null)) is null,0,max(if(f3='N',f4,null))) nf4,
>>       if(max(if(f3='N',f5,null)) is null,0,max(if(f3='N',f5,null))) nf5
>>from   mytable
>>group by f1, f2;
>> 
>>
>>
>> From: Edward Capriolo <[EMAIL PROTECTED]>
>>To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>Sent: Saturday, June 22, 2013 7:48 PM
>>Subject: Re: how to combine some rows into 1 row in hive
>>
>>
>>
>>Using hives streaming feature is a nice option for this as it is a fairly natural way to work with entire rows and return multiple columns
>>
>>select transform a,b,c,d  using /bin/pipeprogram as a, b ,c ,d
>>
>>
>>
>>You an also write a UDTF user defined Table function as well because this can return more then one column.
>>
>>
>>
>>On Sat, Jun 22, 2013 at 6:36 PM, zuohua zhang <[EMAIL PROTECTED]> wrote:
>>
>>I have the following table:
>>>f1 f2 f3 f4 f5
>>>a1 a2 P x1 x2
>>>a1 a2 N x3 x4
>>>a1 a3 N x5 x6
>>>a4 a6 P x7 x8
>>>
>>>
>>>i want to convert to below:
>>>f1 f2 pf4 pf5 nf4 nf5
>>>a1 a2 x1 x2 x3 x4
>>>a1 a3 0 0 x5 x6
>>>a4 a6 x7 x8 0 0
>>>
>>>
>>>basically, when f3="P", I want f4 f5 to be moved to pf4 pf5
>>>when f3="N", i want f4 f5 to be moved to nf4 nf5
>>>when there is no "P" or "N" rows for the record, I want to fill in zeros in the corresponding fields.
>>>
>>>
>>>how to do it?
>>>
>>>
>>>Thanks,
>>>Zuohua
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB