Pig >> mail # user >> Access only data from LEFT OUTER JOIN side of joined data without projection prefix


Re: Access only data from LEFT OUTER JOIN side of joined data without projection prefix
Basically, you need to transform the schema, not the data. The easiest way I can think of to do that is to use a UDF whose outputSchema function renames the columns. The exec call can then be a simple pass-through.

If you wanted to, you could also have it consolidate the join keys. You imply you would like to consolidate other columns as well (E::A::time in your example), but that is not valid. Since time is not a join key, it will not necessarily be the same in A and B.

Alan.
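
The renaming step described in the reply above could be sketched as follows. This is only an illustration on plain strings, not a Pig UDF: the class name JoinAliasRenamer and the method leftSideFields are hypothetical, and the sketch assumes the joined aliases arrive in the "relation::field" form that DESCRIBE prints. A real UDF would apply the same logic to the input schema inside its outputSchema function. Note that dropping the B:: columns, as the original question asks, also means exec would have to project the kept positions rather than pass the tuple through unchanged.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: decides which joined fields to keep and what to call them.
final class JoinAliasRenamer {
    // Separator Pig puts between the relation name and the field name after a JOIN.
    private static final String SEPARATOR = "::";

    // Returns position-in-joined-tuple -> unprefixed name, for every field
    // that belongs to the given relation. Insertion order is preserved.
    static Map<Integer, String> leftSideFields(List<String> joinedAliases, String relation) {
        String prefix = relation + SEPARATOR;
        Map<Integer, String> kept = new LinkedHashMap<>();
        for (int i = 0; i < joinedAliases.size(); i++) {
            String alias = joinedAliases.get(i);
            if (alias.startsWith(prefix)) {
                kept.put(i, alias.substring(prefix.length()));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Aliases as shown by DESCRIBE E in the quoted message below.
        List<String> aliases = Arrays.asList("A::id", "A::time", "B::id", "B::time");
        System.out.println(leftSideFields(aliases, "A")); // prints {0=id, 1=time}
    }
}
```

Because the mapping is computed from whatever aliases are present, it does not need to change when the number, names or types of the fields vary. The resulting names contain no "::", which is the property AvroStorage is reported to require in the question.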

On Jul 25, 2012, at 2:48 AM, Florian Zumkeller-Quast wrote:

> Hello,
> I got the following code:
>
> A = LOAD '$file1' USING AvroStorage();
> B = LOAD '$file2' USING AvroStorage();
> C = JOIN A BY id LEFT OUTER, B BY id;
> SPLIT C INTO D IF B::id IS NULL, E OTHERWISE;
>
> DESCRIBE shows the following data structure
>
> D: {A::id: long,A::time: int,B::id: long,B::time: int}
> E: {A::id: long,A::time: int,B::id: long,B::time: int}
>
> But I can't store D and E using AvroStorage because the field names contain
> "::", which is not an allowed character.
>
> I need a structure like
> F: {id: long,time: int}
> where id = E::A::id and time = E::A::time.
>
> The problem is: The number, name and type of fields may vary.
>
> So E might look like
> E: {A::id: long,A::time: int,A::fieldN1: int,B::id: long,B::time: int,B::fieldN1: int}
>
> Thus I can't use
>
> F = FOREACH … GENERATE …;
>
> because I don't want to write code for each file type unless I really
> need to.
>
> Can someone advise me on how to get the result I need?
>
> Thanks!
>
> With kind regards
> Florian Zumkeller-Quast
> --
> Developer
> ________________________________________________________
>
> ADITION technologies AG
> Schwarzwaldstraße 78b
> 79117 Freiburg
>
> http://www.adition.com
>
> T +49 / (0)761 / 88147 - 30
> F +49 / (0)761 / 88147 - 77
> SUPPORT +49  / (0)1805 - ADITION
>
> (Landline rate 14 ct/min; mobile rates at most 42 ct/min)
>
> Registered at the Amtsgericht Düsseldorf (district court) under HRB 54076
> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
> Chairman of the Supervisory Board: Daniel Raimer, attorney-at-law
> VAT ID No.: DE 218 858 434