Pig, mail # user - simple script generating 'too many counters' error


Lauren Blau 2013-04-04, 13:25
Dmitriy Ryaboy 2013-04-04, 20:54
Lauren Blau 2013-04-04, 23:00
Lauren Blau 2013-04-04, 23:01
Lauren Blau 2013-04-05, 14:13

Re: simple script generating 'too many counters' error
Bill Graham 2013-04-05, 14:47
How many mappers and reducers do you have? Skimming the Rank code, it looks
like it creates at least N counters per task, which would be a scalability
bug.
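As a back-of-the-envelope illustration of the scaling described above (the built-in counter count and the one-counter-per-task figure are illustrative assumptions based on the reading of the Rank code, not measured values), counters that grow with task count push even a modestly sized job past a 240-counter ceiling:

```python
# Rough sketch of per-task counter scaling. Numbers are assumptions:
# Hadoop jobs carry roughly 20 framework/FileSystem counters, and the
# rank job is assumed to add one counter per task.
BUILTIN_COUNTERS = 20
COUNTER_LIMIT = 240   # the configured ceiling from the error message

def total_counters(num_tasks, counters_per_task=1):
    """Total job counters if each task registers its own counter."""
    return BUILTIN_COUNTERS + num_tasks * counters_per_task

# A job with ~230 map tasks already exceeds the limit:
print(total_counters(230))                   # 250
print(total_counters(230) > COUNTER_LIMIT)   # True
```

So the limit is hit not because the script is complex, but because the counter count scales with the size of the job.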

On Friday, April 5, 2013, Lauren Blau wrote:

> this is definitely caused by the RANK operator. Is there some way to reduce
> the number of counters generated by this operator when working with large
> data?
> thanks
>
> On Thu, Apr 4, 2013 at 7:01 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
>
> > I can think of only 2 things that have changed since this script last ran
> > successfully. Switched to using the range specification of the schema for
> > a2, and the input data has grown considerably.
> >
> > Lauren
> >
> >
> > On Thu, Apr 4, 2013 at 7:00 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >
> >> no
> >>
> >>
> >> On Thu, Apr 4, 2013 at 4:54 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> >>
> >>> Do you have any special properties set?
> >>> Like the pig.udf.profile one maybe..
> >>> D
> >>>
> >>>
> >>> On Thu, Apr 4, 2013 at 6:25 AM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >>>
> >>> > I'm running a simple script to add a sequence_number to a relation,
> >>> sort
> >>> > the result and store to a file:
> >>> >
> >>> > a0 = load '<filename>' using PigStorage('\t','-schema');
> >>> > a1 = rank a0;
> >>> > a2 = foreach a1 generate col1 .. col16 , rank_a0 as sequence_number;
> >>> > a3 = order a2 by sequence_number;
> >>> > store a3 into 'outputfile' using PigStorage('\t','-schema');
> >>> >
> >>> > I get the following error:
> >>> > org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many
> >>> > counters: 241 max=240
> >>> >     at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:61)
> >>> >     at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:68)
> >>> >     at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:174)
> >>> >     at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:278)
> >>> >     at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:303)
> >>> >     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
> >>> >     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
> >>> >     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:951)
> >>> >     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)
> >>> >
> >>> >
> >>> > we aren't able to raise our counter limit any higher (policy), and I
> >>> > don't understand why such a simple script should need so many counters
> >>> > anyway.
> >>> > running Apache Pig version 0.11.1-SNAPSHOT (r: unknown)
> >>> > compiled Mar 22 2013, 10:19:19
> >>> >
> >>> > Can someone help?
> >>> >
> >>> > Thanks,
> >>> > Lauren
> >>> >
> >>>
> >>
> >>
> >
>
--
Sent from Gmail Mobile
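For reference, on clusters where policy does allow it, the ceiling checked by Limits.java in the trace above is typically governed by the mapreduce.job.counters.max property on Hadoop 2.x (older MRv1-era clusters used mapreduce.job.counters.limit). Whether a client-side override takes effect depends on how the cluster enforces the limit, so treat this as a sketch, not a guaranteed fix; the script name is hypothetical:

```shell
# Sketch: raise the per-job counter ceiling for a single Pig run.
# Property name assumes Hadoop 2.x (MRv2). Many clusters enforce the
# limit server-side, in which case a client-side -D override is ignored.
pig -Dmapreduce.job.counters.max=500 myscript.pig
```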
Lauren Blau 2013-04-05, 15:40