Many operators, such as join and group by, are not implemented by a single physical operator. Their code is also spread across the code base, because each one has logical components and physical components. The logical component of join is in org.apache.pig.newplan.logical.relational.LOJoin (LOJoin.java). That gets translated into three physical operators: POLocalRearrange, POPackage, and POForEach. All of the physical operators are in the package org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.
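
If it helps to picture how three operators add up to a join, here is a toy sketch in plain Python. It is not Pig's code: the function names, the sample data, and the exact division of work between the steps are my own assumptions, loosely modelled on a rearrange step that tags rows by key, a package step that collects them into bags, and a foreach step that flattens the bags.

```python
from collections import defaultdict
from itertools import product


def local_rearrange(rows, key_index, input_index):
    """Tag every row with its join key and the index of the input it came from."""
    return [(row[key_index], input_index, row) for row in rows]


def package(tagged_rows, num_inputs):
    """Collect the tagged rows into one bag per input for each key."""
    groups = defaultdict(lambda: [[] for _ in range(num_inputs)])
    for key, input_index, row in tagged_rows:
        groups[key][input_index].append(row)
    return groups


def foreach_flatten(groups):
    """Cross the bags of each key; a key with an empty bag produces no rows (inner join)."""
    joined = []
    for key in sorted(groups):
        for combo in product(*groups[key]):
            joined.append(tuple(value for row in combo for value in row))
    return joined


# Hypothetical sample relations: (id, name) and (user_id, item).
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]

tagged = local_rearrange(users, 0, 0) + local_rearrange(orders, 0, 1)
joined = foreach_flatten(package(tagged, 2))
```

Keys 2 and 4 appear in only one input, so one of their bags is empty and they drop out of the result, which is the inner-join behaviour.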
On Oct 5, 2012, at 11:01 AM, Brian Stempin wrote:
> Thanks Russell -- That's really useful.
> Just for kicks and giggles: Where would I look in the code base to see how the JOIN keyword is implemented? I've found the built-in functions, but not the keywords (JOIN, GROUP, etc.). Perhaps that would give me some hints. Perhaps it'll show me that a UDF might not be the best option for my set of problems.
> Thanks once again,