Pig, mail # user - Processing hierarchical information in Pig


Re: Processing hierarchical information in Pig
Norbert Burger 2012-02-29, 16:46
Prash, you can just model this tree as a simple graph adjacency list:

A1,A2
A2,A3
A3,A4
A4,Am
...

For a node with more than one child, simply extend its row horizontally
by appending the additional children.  Child, parent, descendant and
ancestor queries are then straightforward applications of a traversal on
this graph (BFS would be a good choice).
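
To make the traversal idea concrete, here is a minimal sketch in plain
Python rather than Pig Latin. It assumes each row is "node,child1,child2,..."
as in the adjacency list above, and that the data really is a tree (one
parent per node). The function names parse_adjacency, descendants and
ancestors are made up for illustration and are not part of Pig.

```python
from collections import deque


def parse_adjacency(lines):
    """Build child and parent lookups from rows of the form 'node,child1,child2,...'."""
    children = {}
    parent = {}
    for line in lines:
        fields = [f.strip() for f in line.split(",") if f.strip()]
        if not fields:
            continue  # skip blank rows
        node, kids = fields[0], fields[1:]
        children.setdefault(node, []).extend(kids)
        for kid in kids:
            parent[kid] = node  # assumption: each node has a single parent
    return children, parent


def descendants(children, start):
    """Breadth-first traversal: every node reachable below 'start', nearest first."""
    found = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for kid in children.get(node, []):
            if kid not in visited:
                visited.add(kid)
                found.append(kid)
                queue.append(kid)
    return found


def ancestors(parent, start):
    """Walk parent links upward from 'start', nearest ancestor first."""
    chain = []
    visited = {start}
    node = start
    while node in parent and parent[node] not in visited:
        node = parent[node]
        visited.add(node)  # guards against accidental cycles in the input
        chain.append(node)
    return chain


# Sample rows taken from the adjacency list in this message.
rows = ["A1,A2", "A2,A3", "A3,A4", "A4,Am"]
children, parent = parse_adjacency(rows)
```

With these lookups, Child(A1) is children["A1"], Parent(A4) is parent["A4"],
Descendant(A1) is descendants(children, "A1"), and Ancestor(A4) is
ancestors(parent, "A4"). Inside Pig, the same logic would most likely have
to live in a UDF or an external driver, since the traversal is iterative.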

Norbert

On Wed, Feb 29, 2012 at 9:02 AM, prash987 prash987 <[EMAIL PROTECTED]> wrote:

> Hi All,
> How do I represent hierarchical information in a flat file and process it
> in Pig?
>
> Let’s say I have objects of type A.
> I want to have a tree representation of their parent-child
> relationships.
>
> In scenario 1:
> A1 points to A2; A2 points to A3; A3 points to A4; A4 points to Am and
> so on till An.
> Given the above definition, I want to be able to answer the following:
>
> Child(A1) = A2
> Parent(A4) = A3
> Descendant(A1) = A2,A3,A4, Am… An
> Ancestor(A4) = A3,A2,A1
> Ancestor (An) = Am,…A4,A3,A2,A1
>
> Can this be represented in a text file and queried in Pig?
>
> Appreciate any pointers/suggestions.
> Thanks!
>