Re: Processing hierarchical information in Pig
Prash, you can just model this tree as a simple graph adjacency list:


For nodes with more than one child, you simply extend each row
horizontally, appending each additional child as another field.
Child/parent/descendant/ancestor queries are then straightforward
applications of a traversal on this graph (BFS would be a good choice).
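
A rough sketch of the idea, in plain Python rather than Pig Latin, just to
show the data layout and the traversal. The tab-separated row format, the
sample rows, and the helper names (load, bfs) are assumptions made for
illustration; they are not taken from the original thread.

```python
from collections import deque

# Hypothetical flat-file contents: one row per node, tab-separated.
# The first field is the node; the remaining fields are its children.
ROWS = """A1\tA2
A2\tA3
A3\tA4
"""

def load(text):
    """Build child and parent lookup tables from adjacency-list rows."""
    children, parents = {}, {}
    for line in text.splitlines():
        fields = line.split("\t")
        if not fields[0]:
            continue  # skip blank lines
        node, kids = fields[0], fields[1:]
        children.setdefault(node, []).extend(kids)
        for kid in kids:
            parents.setdefault(kid, []).append(node)
    return children, parents

def bfs(start, edges):
    """Return every node reachable from start, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

children, parents = load(ROWS)

child_of_a1 = children.get("A1", [])   # Child(A1)
parent_of_a4 = parents.get("A4", [])   # Parent(A4)
descendants_of_a1 = bfs("A1", children)  # Descendant(A1)
ancestors_of_a4 = bfs("A4", parents)     # Ancestor(A4)
```

Child and parent are single lookups; descendant and ancestor are the same
BFS run over the child table and the parent table respectively. A node with
several children would simply carry more fields on its row.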


On Wed, Feb 29, 2012 at 9:02 AM, prash987 prash987 <[EMAIL PROTECTED]> wrote:

> Hi All,
> How do I represent hierarchical information in a flat file and process it
> in Pig?
> Let’s say I have objects of type A.
> I want to have a tree representation with their parent-child
> relationships.
> In scenario 1:
> A1 points to A2; A2 points to A3; A3 points to A4; A4 points to Am and
> so on till An.
> Given the above definition, I want to be able to answer the following:
> Child(A1) = A2
> Parent(A4) = A3
> Descendant(A1) = A2, A3, A4, Am, … An
> Ancestor(A4) = A3, A2, A1
> Ancestor(An) = Am, … A4, A3, A2, A1
> Can this be represented in a text file and queried in Pig?
> Appreciate any pointers/suggestions.
> Thanks!