|
|
-
Converting rows into dynamic colums in Hive
richin.jain@... 2012-08-07, 20:57
Hi All,
One of my Query output looks like-
Alpha Beta Gamma 123 xyz 1.0 123 abc 0.5 123 pqr 1.3 123 def 2.1 456 xyz 0.1 456 abc 0.6 456 pqr 1.9 456 def 3.2 456 lmn 1.1 456 sdf 1.2
I want the output for the data visualization purpose to look like (basically taking rows from one table and making it column in another table)
Alpha xyz abc pqr def lmn sdf 123 1.0 0.5 1.3 2.1 456 0.1 0.6 1.9 3.2 1.1 1.2
Question - Can it be done in Hive? If not, any suggestions.
Thanks, Richin
+
richin.jain@... 2012-08-07, 20:57
-
Re: Converting rows into dynamic colums in Hive
Ashish Thusoo 2012-08-07, 21:04
you should be able to do this in hive using a group by on alpha and then using a combination of the max and if statement... something on the following lines
select alpha, max(abc), max(pqr), ... ( select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 'pqr', Gamma, NUL) as pqr, .... from table ) group by alpha
something on those lines...
Ashish
On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote:
> Hi All,**** > > ** ** > > One of my Query output looks like-**** > > ** ** > > *Alpha Beta Gamma* > > 123 xyz 1.0**** > > 123 abc 0.5**** > > 123 pqr 1.3**** > > 123 def 2.1**** > > *456 xyz 0.1* > > *456 abc 0.6* > > *456 pqr 1.9* > > *456 def 3.2* > > *456 lmn 1.1* > > *456 sdf 1.2* > > ** ** > > I want the output for the data visualization purpose to look like > (basically taking rows from one table and making it column in another table) > **** > > ** ** > > *Alpha xyz abc pqr def > lmn sdf * > > 123 1.0 0.5 1.3 > 2.1 **** > > 456 0.1 0.6 1.9 > 3.2 1.1 1.2**** > > ** ** > > Question – Can it be done in Hive? If not, any suggestions.**** > > ** ** > > Thanks,**** > > Richin**** > > ** ** > > ** ** >
+
Ashish Thusoo 2012-08-07, 21:04
-
RE: Converting rows into dynamic colums in Hive
richin.jain@... 2012-08-08, 19:56
Thanks Ashish, that gives an idea. But I am not sure about the outer select loop, I have to know all the values in Beta column beforehand to do a max on each value. Is there a better way?
Richin
From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] Sent: Tuesday, August 07, 2012 5:05 PM To: [EMAIL PROTECTED] Subject: Re: Converting rows into dynamic colums in Hive
you should be able to do this in hive using a group by on alpha and then using a combination of the max and if statement... something on the following lines
select alpha, max(abc), max(pqr), ... ( select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 'pqr', Gamma, NUL) as pqr, .... from table ) group by alpha
something on those lines...
Ashish On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: Hi All,
One of my Query output looks like-
Alpha Beta Gamma 123 xyz 1.0 123 abc 0.5 123 pqr 1.3 123 def 2.1 456 xyz 0.1 456 abc 0.6 456 pqr 1.9 456 def 3.2 456 lmn 1.1 456 sdf 1.2
I want the output for the data visualization purpose to look like (basically taking rows from one table and making it column in another table)
Alpha xyz abc pqr def lmn sdf 123 1.0 0.5 1.3 2.1 456 0.1 0.6 1.9 3.2 1.1 1.2
Question - Can it be done in Hive? If not, any suggestions.
Thanks, Richin
+
richin.jain@... 2012-08-08, 19:56
-
Re: Converting rows into dynamic colums in Hive
John Meagher 2012-08-08, 20:34
I don't think having dynamic columns is possible in Hive. I've always output from Hive a structure like your query output and used R to convert it into a dynamic column structure. On Wed, Aug 8, 2012 at 3:56 PM, <[EMAIL PROTECTED]> wrote: > Thanks Ashish, that gives an idea. > > But I am not sure about the outer select loop, I have to know all the values > in Beta column beforehand to do a max on each value. > > Is there a better way? > > > > Richin > > > > From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, August 07, 2012 5:05 PM > To: [EMAIL PROTECTED] > Subject: Re: Converting rows into dynamic colums in Hive > > > > you should be able to do this in hive using a group by on alpha and then > using a combination of the max and if statement... something on the > following lines > > select alpha, max(abc), max(pqr), ... > ( > select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 'pqr', > Gamma, NUL) as pqr, .... > from table > ) > group by alpha > > something on those lines... > > Ashish > > On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote: > > Hi All, > > > > One of my Query output looks like- > > > > Alpha Beta Gamma > > 123 xyz 1.0 > > 123 abc 0.5 > > 123 pqr 1.3 > > 123 def 2.1 > > 456 xyz 0.1 > > 456 abc 0.6 > > 456 pqr 1.9 > > 456 def 3.2 > > 456 lmn 1.1 > > 456 sdf 1.2 > > > > I want the output for the data visualization purpose to look like (basically > taking rows from one table and making it column in another table) > > > > Alpha xyz abc pqr def lmn > sdf > > 123 1.0 0.5 1.3 2.1 > > 456 0.1 0.6 1.9 3.2 > 1.1 1.2 > > > > Question – Can it be done in Hive? If not, any suggestions. > > > > Thanks, > > Richin > > > > > >
+
John Meagher 2012-08-08, 20:34
-
RE: Converting rows into dynamic colums in Hive
richin.jain@... 2012-08-09, 00:58
John,
What is R?
-----Original Message----- From: ext John Meagher [mailto:[EMAIL PROTECTED]] Sent: Wednesday, August 08, 2012 4:34 PM To: [EMAIL PROTECTED] Subject: Re: Converting rows into dynamic colums in Hive
I don't think having dynamic columns is possible in Hive. I've always output from Hive a structure like your query output and used R to convert it into a dynamic column structure. On Wed, Aug 8, 2012 at 3:56 PM, <[EMAIL PROTECTED]> wrote: > Thanks Ashish, that gives an idea. > > But I am not sure about the outer select loop, I have to know all the > values in Beta column beforehand to do a max on each value. > > Is there a better way? > > > > Richin > > > > From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, August 07, 2012 5:05 PM > To: [EMAIL PROTECTED] > Subject: Re: Converting rows into dynamic colums in Hive > > > > you should be able to do this in hive using a group by on alpha and > then using a combination of the max and if statement... something on > the following lines > > select alpha, max(abc), max(pqr), ... > ( > select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == > 'pqr', Gamma, NUL) as pqr, .... > from table > ) > group by alpha > > something on those lines... > > Ashish > > On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote: > > Hi All, > > > > One of my Query output looks like- > > > > Alpha Beta Gamma > > 123 xyz 1.0 > > 123 abc 0.5 > > 123 pqr 1.3 > > 123 def 2.1 > > 456 xyz 0.1 > > 456 abc 0.6 > > 456 pqr 1.9 > > 456 def 3.2 > > 456 lmn 1.1 > > 456 sdf 1.2 > > > > I want the output for the data visualization purpose to look like > (basically taking rows from one table and making it column in another > table) > > > > Alpha xyz abc pqr def lmn > sdf > > 123 1.0 0.5 1.3 2.1 > > 456 0.1 0.6 1.9 3.2 > 1.1 1.2 > > > > Question - Can it be done in Hive? If not, any suggestions. > > > > Thanks, > > Richin > > > > > >
+
richin.jain@... 2012-08-09, 00:58
-
Re: Converting rows into dynamic colums in Hive
John Meagher 2012-08-09, 14:10
R is another programming language geared to statistical analysis. It has some features that make it easy to write this sort of data manipulation. http://www.r-project.org/Here's a sample R function that does something similar to what you need (note: I'm not a real R programmer, so this might be ugly, but it works): abCountsToMatrix <- function(csvFilename, outputmatrixfile){ data <- read.csv(csvFilename, as.is=T, check.names=F, header=F) cat("Loaded", nrow(data), "rows.\n") idrows <- sort(unique(c(data[ , 1]))) idcols <- sort(unique(c(data[ , 2]))) cat("Found", length(idrows), " X ", length(idcols), ".") output <- array(0, c(length(idrows), length(idcols))) rownames(output) <- idrows colnames(output) <- idcols for(i in 1:nrow(data)){ if (i %% 100 == 0) { cat("On ", i, " of ", nrow(data), "\n") } rowidx <- which(idrows == data[i, 1]) colidx <- which(idcols == data[i, 2]) count <- data[i,3] output[rowidx, colidx] <- count } write.csv(output, file=outputmatrixfile, quote=F) } On Wed, Aug 8, 2012 at 8:58 PM, <[EMAIL PROTECTED]> wrote: > John, > > What is R? > > -----Original Message----- > From: ext John Meagher [mailto:[EMAIL PROTECTED]] > Sent: Wednesday, August 08, 2012 4:34 PM > To: [EMAIL PROTECTED] > Subject: Re: Converting rows into dynamic colums in Hive > > I don't think having dynamic columns is possible in Hive. I've always output from Hive a structure like your query output and used R to convert it into a dynamic column structure. > > > On Wed, Aug 8, 2012 at 3:56 PM, <[EMAIL PROTECTED]> wrote: >> Thanks Ashish, that gives an idea. >> >> But I am not sure about the outer select loop, I have to know all the >> values in Beta column beforehand to do a max on each value. >> >> Is there a better way? >> >> >> >> Richin >> >> >> >> From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] >> Sent: Tuesday, August 07, 2012 5:05 PM >> To: [EMAIL PROTECTED] >> Subject: Re: Converting rows into dynamic colums in Hive >> >> >> >> you should be able to do this in hive using a group by on alpha and >> then using a combination of the max and if statement... something on >> the following lines >> >> select alpha, max(abc), max(pqr), ... >> ( >> select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta =>> 'pqr', Gamma, NUL) as pqr, .... >> from table >> ) >> group by alpha >> >> something on those lines... >> >> Ashish >> >> On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote: >> >> Hi All, >> >> >> >> One of my Query output looks like- >> >> >> >> Alpha Beta Gamma >> >> 123 xyz 1.0 >> >> 123 abc 0.5 >> >> 123 pqr 1.3 >> >> 123 def 2.1 >> >> 456 xyz 0.1 >> >> 456 abc 0.6 >> >> 456 pqr 1.9 >> >> 456 def 3.2 >> >> 456 lmn 1.1 >> >> 456 sdf 1.2 >> >> >> >> I want the output for the data visualization purpose to look like >> (basically taking rows from one table and making it column in another >> table) >> >> >> >> Alpha xyz abc pqr def lmn >> sdf >> >> 123 1.0 0.5 1.3 2.1 >> >> 456 0.1 0.6 1.9 3.2 >> 1.1 1.2 >> >> >> >> Question - Can it be done in Hive? If not, any suggestions. >> >> >> >> Thanks, >> >> Richin >> >> >> >> >> >>
+
John Meagher 2012-08-09, 14:10
-
RE: Converting rows into dynamic colums in Hive
richin.jain@... 2012-08-09, 20:16
Thanks John. Is there a way to do this is excel since I am outputting the table data in csv format (using macros or something else)? Richin -----Original Message----- From: ext John Meagher [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 09, 2012 10:11 AM To: [EMAIL PROTECTED] Subject: Re: Converting rows into dynamic colums in Hive R is another programming language geared to statistical analysis. It has some features that make it easy to write this sort of data manipulation. http://www.r-project.org/Here's a sample R function that does something similar to what you need (note: I'm not a real R programmer, so this might be ugly, but it works): abCountsToMatrix <- function(csvFilename, outputmatrixfile){ data <- read.csv(csvFilename, as.is=T, check.names=F, header=F) cat("Loaded", nrow(data), "rows.\n") idrows <- sort(unique(c(data[ , 1]))) idcols <- sort(unique(c(data[ , 2]))) cat("Found", length(idrows), " X ", length(idcols), ".") output <- array(0, c(length(idrows), length(idcols))) rownames(output) <- idrows colnames(output) <- idcols for(i in 1:nrow(data)){ if (i %% 100 == 0) { cat("On ", i, " of ", nrow(data), "\n") } rowidx <- which(idrows == data[i, 1]) colidx <- which(idcols == data[i, 2]) count <- data[i,3] output[rowidx, colidx] <- count } write.csv(output, file=outputmatrixfile, quote=F) } On Wed, Aug 8, 2012 at 8:58 PM, <[EMAIL PROTECTED]> wrote: > John, > > What is R? > > -----Original Message----- > From: ext John Meagher [mailto:[EMAIL PROTECTED]] > Sent: Wednesday, August 08, 2012 4:34 PM > To: [EMAIL PROTECTED] > Subject: Re: Converting rows into dynamic colums in Hive > > I don't think having dynamic columns is possible in Hive. I've always output from Hive a structure like your query output and used R to convert it into a dynamic column structure. > > > On Wed, Aug 8, 2012 at 3:56 PM, <[EMAIL PROTECTED]> wrote: >> Thanks Ashish, that gives an idea. >> >> But I am not sure about the outer select loop, I have to know all the >> values in Beta column beforehand to do a max on each value. >> >> Is there a better way? >> >> >> >> Richin >> >> >> >> From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] >> Sent: Tuesday, August 07, 2012 5:05 PM >> To: [EMAIL PROTECTED] >> Subject: Re: Converting rows into dynamic colums in Hive >> >> >> >> you should be able to do this in hive using a group by on alpha and >> then using a combination of the max and if statement... something on >> the following lines >> >> select alpha, max(abc), max(pqr), ... >> ( >> select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == >> 'pqr', Gamma, NUL) as pqr, .... >> from table >> ) >> group by alpha >> >> something on those lines... >> >> Ashish >> >> On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote: >> >> Hi All, >> >> >> >> One of my Query output looks like- >> >> >> >> Alpha Beta Gamma >> >> 123 xyz 1.0 >> >> 123 abc 0.5 >> >> 123 pqr 1.3 >> >> 123 def 2.1 >> >> 456 xyz 0.1 >> >> 456 abc 0.6 >> >> 456 pqr 1.9 >> >> 456 def 3.2 >> >> 456 lmn 1.1 >> >> 456 sdf 1.2 >> >> >> >> I want the output for the data visualization purpose to look like >> (basically taking rows from one table and making it column in another >> table) >> >> >> >> Alpha xyz abc pqr def lmn >> sdf >> >> 123 1.0 0.5 1.3 2.1
+
richin.jain@... 2012-08-09, 20:16
-
RE: Converting rows into dynamic colums in Hive
richin.jain@... 2012-08-16, 14:39
You could do it using Pivot table in MS Excel. It's under the Insert tab, first option on the left. Richin -----Original Message----- From: Jain Richin (Nokia-LC/Boston) Sent: Thursday, August 09, 2012 4:16 PM To: [EMAIL PROTECTED] Subject: RE: Converting rows into dynamic colums in Hive Thanks John. Is there a way to do this is excel since I am outputting the table data in csv format (using macros or something else)? Richin -----Original Message----- From: ext John Meagher [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 09, 2012 10:11 AM To: [EMAIL PROTECTED] Subject: Re: Converting rows into dynamic colums in Hive R is another programming language geared to statistical analysis. It has some features that make it easy to write this sort of data manipulation. http://www.r-project.org/Here's a sample R function that does something similar to what you need (note: I'm not a real R programmer, so this might be ugly, but it works): abCountsToMatrix <- function(csvFilename, outputmatrixfile){ data <- read.csv(csvFilename, as.is=T, check.names=F, header=F) cat("Loaded", nrow(data), "rows.\n") idrows <- sort(unique(c(data[ , 1]))) idcols <- sort(unique(c(data[ , 2]))) cat("Found", length(idrows), " X ", length(idcols), ".") output <- array(0, c(length(idrows), length(idcols))) rownames(output) <- idrows colnames(output) <- idcols for(i in 1:nrow(data)){ if (i %% 100 == 0) { cat("On ", i, " of ", nrow(data), "\n") } rowidx <- which(idrows == data[i, 1]) colidx <- which(idcols == data[i, 2]) count <- data[i,3] output[rowidx, colidx] <- count } write.csv(output, file=outputmatrixfile, quote=F) } On Wed, Aug 8, 2012 at 8:58 PM, <[EMAIL PROTECTED]> wrote: > John, > > What is R? > > -----Original Message----- > From: ext John Meagher [mailto:[EMAIL PROTECTED]] > Sent: Wednesday, August 08, 2012 4:34 PM > To: [EMAIL PROTECTED] > Subject: Re: Converting rows into dynamic colums in Hive > > I don't think having dynamic columns is possible in Hive. I've always output from Hive a structure like your query output and used R to convert it into a dynamic column structure. > > > On Wed, Aug 8, 2012 at 3:56 PM, <[EMAIL PROTECTED]> wrote: >> Thanks Ashish, that gives an idea. >> >> But I am not sure about the outer select loop, I have to know all the >> values in Beta column beforehand to do a max on each value. >> >> Is there a better way? >> >> >> >> Richin >> >> >> >> From: ext Ashish Thusoo [mailto:[EMAIL PROTECTED]] >> Sent: Tuesday, August 07, 2012 5:05 PM >> To: [EMAIL PROTECTED] >> Subject: Re: Converting rows into dynamic colums in Hive >> >> >> >> you should be able to do this in hive using a group by on alpha and >> then using a combination of the max and if statement... something on >> the following lines >> >> select alpha, max(abc), max(pqr), ... >> ( >> select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == >> 'pqr', Gamma, NUL) as pqr, .... >> from table >> ) >> group by alpha >> >> something on those lines... >> >> Ashish >> >> On Tue, Aug 7, 2012 at 1:57 PM, <[EMAIL PROTECTED]> wrote: >> >> Hi All, >> >> >> >> One of my Query output looks like- >> >> >> >> Alpha Beta Gamma >> >> 123 xyz 1.0 >> >> 123 abc 0.5 >> >> 123 pqr 1.3 >> >> 123 def 2.1 >> >> 456 xyz 0.1 >> >> 456 abc 0.6 >> >> 456 pqr 1.9 >> >> 456 def 3.2 >> >> 456 lmn 1.1 >> >> 456 sdf 1.2 >> >> >> >> I want the output for the data visualization purpose to look like
+
richin.jain@... 2012-08-16, 14:39
-
RE: Converting rows into dynamic colums in Hive
Raghunath, Ranjith 2012-08-07, 21:00
You can do this using case statements and summing the values. The only item to remember here is that the values in the grid need to be numeric for the sum to work.
Thanks, Ranjith
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Tuesday, August 07, 2012 3:58 PM To: [EMAIL PROTECTED] Subject: Converting rows into dynamic colums in Hive
Hi All,
One of my Query output looks like-
Alpha Beta Gamma 123 xyz 1.0 123 abc 0.5 123 pqr 1.3 123 def 2.1 456 xyz 0.1 456 abc 0.6 456 pqr 1.9 456 def 3.2 456 lmn 1.1 456 sdf 1.2
I want the output for the data visualization purpose to look like (basically taking rows from one table and making it column in another table)
Alpha xyz abc pqr def lmn sdf 123 1.0 0.5 1.3 2.1 456 0.1 0.6 1.9 3.2 1.1 1.2
Question - Can it be done in Hive? If not, any suggestions.
Thanks, Richin
+
Raghunath, Ranjith 2012-08-07, 21:00
|
|