RE: Determining the group-by column
Cogroup has inner plans that compute the group-by attributes. Instead of
looking at the predecessor(s), you should navigate the inner plans of
the cogroup. Check out the code in
(visit(LOCogroup ...) method)
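
To make the idea concrete, here is a minimal, self-contained sketch of
"navigate the inner plan instead of the predecessors". It does not use
Pig's classes: Project, InnerPlan, Cogroup and groupByColumns are
hypothetical stand-ins invented for this illustration, and the real
LOCogroup accessors may well look different. The only point is that the
group-by projections hang off the cogroup itself, keyed by input, rather
than appearing as nodes in the outer plan.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical stand-ins, not Pig's API.
class InnerPlanSketch {

    /** Stand-in for a projection operator such as "Project [0]". */
    static final class Project {
        final int column;
        Project(int column) { this.column = column; }
    }

    /** Stand-in for an inner plan, reduced to the projections it contains. */
    static final class InnerPlan {
        final List<Project> projections = new ArrayList<>();
        InnerPlan add(Project p) { projections.add(p); return this; }
    }

    /** Stand-in for a cogroup: it owns one list of inner plans per input. */
    static final class Cogroup {
        final Map<String, List<InnerPlan>> groupByPlans = new LinkedHashMap<>();

        void addInput(String inputName, InnerPlan plan) {
            groupByPlans.computeIfAbsent(inputName, k -> new ArrayList<>()).add(plan);
        }
    }

    /** Collects, for each input, the column indexes its inner plans project. */
    static Map<String, List<Integer>> groupByColumns(Cogroup cogroup) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<InnerPlan>> e : cogroup.groupByPlans.entrySet()) {
            List<Integer> columns = new ArrayList<>();
            for (InnerPlan plan : e.getValue()) {
                for (Project p : plan.projections) {
                    columns.add(p.column);
                }
            }
            result.put(e.getKey(), columns);
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the question: a cogroup over a ForEach, grouping on field 0.
        Cogroup cogroup = new Cogroup();
        cogroup.addInput("ForEach", new InnerPlan().add(new Project(0)));
        System.out.println(groupByColumns(cogroup)); // prints {ForEach=[0]}
    }
}
```

In this toy model the outer plan never sees the Project, which is
consistent with getPredecessors returning only the ForEach.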


-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Sunday, February 15, 2009 9:07 PM
Subject: Determining the group-by column

We are working on the Pig Logical Optimizer, and running into some
difficulty navigating the plan.

If we run explain on a query with a CoGroup, we get something like:

|    |
|    |-- Project [0]
|------ ForEach
            |  <etc>

What we want to do is determine that this particular Cogroup operates on
a projection of field 0.

If we create a new LogicalTransformer that is applied to Cogroup
and call

mPlan.getPredecessors(ourCogroupOperator), we only get the ForEach.
Calling getSuccessors returns null (Cogroup is
the root).

How do we find the Project operator above? What is its relationship,
plan-wise, with the Cogroup operator?

Thanks a lot,

Dmitriy Ryaboy, Ashutosh Chauhan, Tejal Desai