Pig >> mail # dev >> Determining the group-by column


RE: Determining the group-by column
Cogroup has inner plans that compute the group-by attributes. Instead of
looking at the predecessor(s), you should navigate the inner plans of the
cogroup. Check out the visit(LOCogroup ...) method in
src/org/apache/pig/impl/logicalLayer/validators/TypeCheckingVisitor.java
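
To make the idea concrete, here is a rough, self-contained sketch of the structure involved. It does not use the real Pig classes: Op, Cogroup, groupByPlans and groupByColumns are all hypothetical stand-ins, and each inner plan is simplified to a flat list of operators. The point is only that the Project operators live in plans owned by the cogroup, keyed by input, and not in the outer plan that getPredecessors walks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for Pig's logical operators; NOT the real Pig API.
class InnerPlanSketch {

    /** Minimal operator: a name plus, for projections, the projected column indexes. */
    static class Op {
        final String name;
        final List<Integer> projection; // empty unless this operator is a Project

        Op(String name, Integer... cols) {
            this.name = name;
            this.projection = Arrays.asList(cols);
        }
    }

    /** A cogroup owns its inner plans: for each input, a list of plans (flat operator lists here). */
    static class Cogroup extends Op {
        final Map<Op, List<List<Op>>> groupByPlans = new LinkedHashMap<>();

        Cogroup() {
            super("Cogroup");
        }

        void addInput(Op input, List<Op> innerPlan) {
            groupByPlans.computeIfAbsent(input, k -> new ArrayList<>()).add(innerPlan);
        }
    }

    /** Walks the inner plans (not the outer plan) and collects projected columns per input name. */
    static Map<String, List<Integer>> groupByColumns(Cogroup cg) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<Op, List<List<Op>>> entry : cg.groupByPlans.entrySet()) {
            List<Integer> cols = new ArrayList<>();
            for (List<Op> plan : entry.getValue()) {
                for (Op op : plan) {
                    cols.addAll(op.projection);
                }
            }
            result.put(entry.getKey().name, cols);
        }
        return result;
    }

    public static void main(String[] args) {
        Op forEach = new Op("ForEach");
        Cogroup cg = new Cogroup();
        // The inner plan for the ForEach input is a single Project [0].
        cg.addInput(forEach, Arrays.asList(new Op("Project", 0)));
        System.out.println(groupByColumns(cg)); // prints {ForEach=[0]}
    }
}
```

In the real code base, the equivalent traversal is what the visit(LOCogroup ...) method mentioned above performs; the sketch only mirrors its shape under the stated assumptions.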

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Sunday, February 15, 2009 9:07 PM
To: [EMAIL PROTECTED]
Subject: Determining the group-by column

We are working on the Pig Logical Optimizer, and running into some
difficulty navigating the plan.

If we run explain on a query with a CoGroup, we get something like:

Cogroup
|    |
|    |-- Project [0]
|
|------ ForEach
            |  <etc>

What we want to do is determine that this particular Cogroup operates on
a projection of field 0.

If we create a new LogicalTransformer that is applied to Cogroup operators
and call mPlan.getPredecessors(ourCogroupOperator), we only get the ForEach.
Calling getSuccessors returns null (Cogroup is indeed the root).

How do we find the Project operator above? What is its relationship,
plan-wise, with the Cogroup operator?

Thanks a lot,

Dmitriy Ryaboy, Ashutosh Chauhan, Tejal Desai