Pig, mail # user - What is the best way to do counting in pig?


Sheng Guo 2012-07-02, 18:42
Jonathan Coveney 2012-07-02, 20:20
Re: What is the best way to do counting in pig?
Sheng Guo 2012-07-02, 20:34
No. I am trying to find out how many records (rows) are in the 'm_skill_group' table.
(The limit statement is actually not necessary.)

Thanks!
On Mon, Jul 2, 2012 at 1:20 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Is your goal to have the 10 largest rows by member_id?
>
> 2012/7/2 Sheng Guo <[EMAIL PROTECTED]>
>
> > Hi all,
> >
> > I used to use the following pig script to do the counting of the records.
> >
> > m_skill_group = group m_skills_filter by member_id;
> > grpd = group m_skill_group all;
> > cnt = foreach grpd generate COUNT(m_skill_group);
> >
> > cnt_filter = limit cnt 10;
> > dump cnt_filter;
> >
> >
> > but sometimes, when the number of records gets larger, it takes a lot of
> > time and hangs and/or dies.
> > I thought counting should be simple enough, so what is the best way to
> > do counting in Pig?
> >
> > Thanks!
> >
> > Sheng
> >
>
Jonathan Coveney 2012-07-02, 20:41
Subir S 2012-07-02, 20:51
Sheng Guo 2012-07-02, 21:32
Jonathan Coveney 2012-07-02, 21:31
Subir S 2012-07-02, 21:35
Ruslan Al-Fakikh 2012-07-03, 10:03
Jonathan Coveney 2012-07-03, 16:56
Sheng Guo 2012-07-09, 22:54
Haitao Yao 2012-07-10, 02:28
Jonathan Coveney 2012-07-10, 04:26
Haitao Yao 2012-07-10, 05:06
Haitao Yao 2012-07-10, 08:20
Haitao Yao 2012-07-11, 06:50
Jonathan Coveney 2012-07-11, 16:53
Thejas Nair 2012-07-12, 00:56
Haitao Yao 2012-07-12, 02:56
Jonathan Coveney 2012-07-12, 02:58
Haitao Yao 2012-08-09, 14:51