Pig >> mail # user >> count duplicate entries


count duplicate entries
Hi,
I have data in HDFS like this:

id1,field1,field2
1,2,3
1,2,3
1,2,4
1,2,5
I want to find the number of unique entries using Pig.
Here, the number of unique entries is 3 (since 1,2,3 appears twice).

How do I find this?

Thanks
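
One possible approach, as a rough sketch rather than a definitive answer: remove duplicate rows with DISTINCT, then group everything into a single bag and count it. The input path and the relation names (raw, data_rows, uniq, grouped, cnt) are made up for illustration, and the FILTER step assumes the header line id1,field1,field2 is actually stored in the file; drop it if the header is not part of the data.

```pig
-- Hypothetical HDFS path; replace with the real location of the data.
raw = LOAD '/path/to/data' USING PigStorage(',')
      AS (id1:chararray, field1:chararray, field2:chararray);

-- Assumption: the header line is part of the file, so filter it out.
data_rows = FILTER raw BY id1 != 'id1';

-- Keep one copy of each row: (1,2,3), (1,2,4), (1,2,5) for the sample above.
uniq = DISTINCT data_rows;

-- Put all remaining rows into one group and count them.
grouped = GROUP uniq ALL;
cnt = FOREACH grouped GENERATE COUNT(uniq) AS unique_rows;

DUMP cnt;
```

For the sample data this should report 3. Note that COUNT skips tuples whose first field is null; COUNT_STAR can be used instead if such rows should be counted too.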