Pig >> mail # user >> explode operation


explode operation
Hi Guys,

I came across a use case that seems to require an 'explode' operation,
which, to my knowledge, is not currently available.
That is, given a tuple (x,y,z), 'explode' would generate the tuples
(x), (y), (z).

E.g., consider a relation that contains an arbitrary number of
different identifier columns, say, social security id, student id,
etc.  We want to compute the set of all distinct identifiers.  Assume
that the number of identifier columns is large and that they are
intermingled with other columns that should be projected out; this
rules out, for example, a solution using 'SPLIT'.

To be concrete, if X = {(..., 2, 4, ..., 3), (..., 2,,...,5)} is such
a relation, then the answer we want is
Y={2,3,4,5}.
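
To pin down the semantics I have in mind, here is a minimal sketch in
plain Python (not Pig). The names explode, distinct_identifiers and
id_columns are made up for illustration, and the sample rows are
hypothetical stand-ins for the elided columns above; I am also
assuming that empty fields should simply be skipped.

```python
def explode(row):
    """Turn one tuple (x, y, z) into the one-field tuples (x,), (y,), (z,)."""
    return [(value,) for value in row]


def distinct_identifiers(relation, id_columns):
    """Collect the distinct non-empty values found in the identifier columns."""
    seen = set()
    for row in relation:
        # Project out the non-identifier columns first, then explode the rest.
        projected = tuple(row[i] for i in id_columns)
        for (value,) in explode(projected):
            # Assumption: an empty field (None here) contributes nothing.
            if value is not None:
                seen.add(value)
    return seen


# Hypothetical data: column 0 is a non-identifier column, columns 1-3 are ids.
X = [("a", 2, 4, 3), ("b", 2, None, 5)]
Y = distinct_identifiers(X, id_columns=[1, 2, 3])
print(sorted(Y))  # [2, 3, 4, 5]
```

So 'explode' followed by a distinct over the resulting single-field
tuples would give the desired set.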

Any suggestions?

Thanks,

stan