Pig >> mail # user >> Alias Confusion


Chun Yang 2012-07-06, 19:41
Chun Yang 2012-07-06, 19:55
Re: Alias Confusion
Don't duplicate relation names as column names.
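
As a rough sketch of what that looks like for the script quoted below: the error message suggests that Pig resolved the nested alias uniqUsers to the relation of the same name defined earlier, so giving the nested alias and the output column names that match no existing relation or column avoids the clash. The names distinctUsers and numUniqUsers are made up for illustration; top100, users and name come from the describe output in the quoted message.

```pig
-- Sketch only: distinctUsers and numUniqUsers are hypothetical names,
-- chosen so they collide with no existing relation or column.
uu = FOREACH top100 {
    distinctUsers = DISTINCT users;   -- de-duplicate the users bag for each row
    GENERATE name, COUNT(distinctUsers) AS numUniqUsers;
};
```

Running describe uu afterwards is a quick way to check that the output schema has the column names and types you expect.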

Russell Jurney http://datasyndrome.com

On Jul 6, 2012, at 12:56 PM, Chun Yang <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I'm walking through a pig script in grunt, but I am getting stuck with some
> issues using nested foreach. I'm using Pig version 0.9.2
>
> I'm trying to find the number of unique users from a bag 'top100'
>
> grunt> describe top100
> top100: {name: chararray,licenses: long,instance: chararray,transactions:
> long,users: {(projected::userId: chararray)},runTimes: {(projected::runTime:
> double)}}
>
> grunt> uu = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers as uniqUsers;
> >> }
> ERROR 1200: Pig script failed to parse:
> <line 132, column 9> Invalid scalar projection: uniqUsers : A column needs
> to be projected from a relation for it to be used as a scalar
>
> I realized that I had defined uniqUsers earlier, but I didn't think it would
> conflict inside the nested foreach block. The schema for uniqUsers is:
>
> grunt> describe uniqUsers
> uniqUsers: {key: chararray,uniqUsers: long}
>
> I tried a different alias for the distinct clause and it seems to work.
>
> grunt> uu = foreach top100 {
> >> un = distinct users;
> >> generate un as uniqUsers;
> >> }
> grunt> describe uu
> uu: {un: {(projected::userId: chararray)}}
> grunt> uu = foreach top100 {
> >> un = distinct users;
> >> generate COUNT(un) as uniqUsers;
> >> }
> grunt> describe uu
> uu: {uniqUsers: long}
>
> I was curious, so I tried the following, but I do not understand what the
> results are.
>
> grunt> u2 = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers.key;
> >> }
> grunt> describe u2
> u2: {projected::userId: chararray}
>
> grunt> u3 = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers.uniqUsers;
> >> }
> grunt> describe u3
> u3: {projected::userId: chararray}
>
> Specifically, what is actually in the result of u3? Why is it a chararray
> when uniqUsers.uniqUsers is a long? Why is the alias still
> projected::userId?
>
> Thanks for any help!
>
> -Chun
>
> PS Sorry for the double post, I accidentally hit a keyboard shortcut for
> Send.
>