Pig, mail # user - Alias Confusion


Chun Yang 2012-07-06, 19:41
Chun Yang 2012-07-06, 19:55
Re: Alias Confusion
Russell Jurney 2012-07-06, 21:41
Don't duplicate relation names as column names.

Russell Jurney http://datasyndrome.com

On Jul 6, 2012, at 12:56 PM, Chun Yang <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I'm walking through a Pig script in Grunt, but I'm stuck on some
> issues with a nested foreach. I'm using Pig version 0.9.2.
>
> I'm trying to find the number of unique users from the 'users' bag in 'top100':
>
> grunt> describe top100
> top100: {name: chararray,licenses: long,instance: chararray,transactions:
> long,users: {(projected::userId: chararray)},runTimes: {(projected::runTime:
> double)}}
>
> grunt> uu = foreach top100 {
>>> uniqUsers = distinct users;
>>> generate uniqUsers as uniqUsers;
>>> }
> ERROR 1200: Pig script failed to parse:
> <line 132, column 9> Invalid scalar projection: uniqUsers : A column needs
> to be projected from a relation for it to be used as a scalar
>
> I realized that I had defined uniqUsers earlier, but I didn't think it would
> conflict inside the nested foreach block. The schema for uniqUsers is:
>
> grunt> describe uniqUsers
> uniqUsers: {key: chararray,uniqUsers: long}
>
> I tried a different alias for the distinct clause and it seems to work.
>
> grunt> uu = foreach top100 {
>>> un = distinct users;
>>> generate un as uniqUsers;
>>> }
> grunt> describe uu
> uu: {un: {(projected::userId: chararray)}}
> grunt> uu = foreach top100 {
>>> un = distinct users;
>>> generate COUNT(un) as uniqUsers;
>>> }
> grunt> describe uu
> uu: {uniqUsers: long}
>
> I was curious, so I tried the following, but I do not understand what the
> results are.
>
> grunt> u2 = foreach top100 {
>>> uniqUsers = distinct users;
>>> generate uniqUsers.key;
>>> }
> grunt> describe u2
> u2: {projected::userId: chararray}
>
> grunt> u3 = foreach top100 {
>>> uniqUsers = distinct users;
>>> generate uniqUsers.uniqUsers;
>>> }
> grunt> describe u3
> u3: {projected::userId: chararray}
>
> Specifically, what is actually in the result of u3? Why is it a chararray
> when uniqUsers.uniqUsers is a long? Why is the alias still
> projected::userId?
>
> Thanks for any help!
>
> -Chun
>
> P.S. Sorry for the double post; I accidentally hit a keyboard shortcut for
> Send.
>