Thank you Selva
for the response. But the data is too huge to be handled by Excel, thus I
need to do it with hive.
On Thu, Oct 10, 2013 at 2:31 PM, Selvamohan Neethiraj <
[EMAIL PROTECTED]> wrote:
> If it is not so much data, you can use Excel's PivotTable to solve this
> specific requirement:
> 1. Select the date and plz columns (w/o header) and create pivotTable on a
> new WorkSheet
> 2. Drag the column name: 'Date' from the 'Pivot Table' Builder to the
> 'Column Label' section.
> 3. Drag the column name: 'PLZ' from the 'Pivot Table' Builder to the
> 'Row Labels'
> 4. Drag the column name: 'Date' from the 'Pivot Table' Builder to the
> 'Values' and ensure the selected summary function is 'Count'
> Hope, this helps,
> On Oct 10, 2013, at 7:53 AM, Panshul Whisper <[EMAIL PROTECTED]>
> I have a data manipulation query.
> I have my data in the following format:
> *Date PLZ Count*
> date1 plz1 count1
> date1 plz1 count2
> date1 plz1 count3
> date1 plz2 count4
> date1 plz2 count5
> date1 plz3 count6
> date1 plz3 count7
> date2 plz1 count8
> date2 plz1 count9
> date2 plz3 count10
> date2 plz3 count11
> date3 plz1 count12
> date3 plz1 count13
> date3 plz2 count14
> date3 plz2 count15
> date3 plz2 count16
> date4 plz1 count17
> date4 plz2 count18
> date4 plz3 count19
> With Hive queries I have managed to get the data into the following
> grouped and aggregated format:
> select plz, TO_DATE(time), sum(totalcount) from power_pad_part where
> yearfolder='2013' and monthfolder in ('01')
> and eco=0 and TO_DATE(time)>='Date1' and TO_DATE(time)<'Date4'
> and plz in ('plz1','plz2','plz3')
> group by plz, time
> *PLZ Date TotalCount*
> plz1 date1 TC1
> plz1 date2 TC2
> plz1 date3 TC3
> plz1 date4 TC4
> plz2 date1 TC5
> plz2 date3 TC6
> plz2 date4 TC7
> plz3 date1 TC8
> plz3 date2 TC9
> plz3 date4 TC10
> The above data is grouped by plz and date. There is no entry of a plz if
> it does not exist for a date.
> I used the following query to generate the above data:
> But I require the data to be in the following format:
> *PLZ Date1 Date2 Date3 Date4 ..... DateN .....*
> plz1 TC1 TC2 TC3 TC4 ..... TC N
> plz2 TC5 ------ TC6 TC7 ..... TC N
> plz3 TC8 TC9 ------ TC10 .... TC N
> This needs to be generated as 1 column for every date of the year.
> Any help to generate this format is welcome. Writing my own mapper and
> reduce and calling in Hive as function is to be my last option.
> Thanking You,
> Ouch Whisper