Pig user mailing list: simple script generating 'too many counters' error


Thread:
Lauren Blau 2013-04-04, 13:25
Dmitriy Ryaboy 2013-04-04, 20:54
Lauren Blau 2013-04-04, 23:00
Lauren Blau 2013-04-04, 23:01
Lauren Blau 2013-04-05, 14:13

Re: simple script generating 'too many counters' error
How many mappers and reducers do you have? Skimming the Rank code, it looks
like it creates at least N counters per task, which would be a scalability
bug.
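
If the Rank implementation really does allocate counters per task to compute the global offsets, the counter total tracks the task count, so one mitigation would be to run the job with fewer, larger tasks. A minimal sketch, assuming Pig's default split combination is in effect; the property values are illustrative, not something tested on this cluster:

    -- Sketch only: assumes the counter count grows with the number of tasks.
    -- pig.maxCombinedSplitSize merges small input splits, so fewer map tasks
    -- (and, under that assumption, fewer rank counters) are created.
    set pig.maxCombinedSplitSize '1073741824';  -- aim for roughly 1 GB per map task
    set default_parallel 10;                    -- cap reduce tasks in later stages

    a0 = load '<filename>' using PigStorage('\t','-schema');
    a1 = rank a0;
    a2 = foreach a1 generate col1 .. col16, rank_a0 as sequence_number;
    a3 = order a2 by sequence_number;
    store a3 into 'outputfile' using PigStorage('\t','-schema');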

On Friday, April 5, 2013, Lauren Blau wrote:

> this is definitely caused by the RANK operator. Is there some way to reduce
> the number of counters generated by this operator when working with large
> data?
> thanks
>
> On Thu, Apr 4, 2013 at 7:01 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
>
> > I can think of only two things that have changed since this script last ran
> > successfully: I switched to using the range specification of the schema for
> > a2, and the input data has grown considerably.
> >
> > Lauren
> >
> >
> > On Thu, Apr 4, 2013 at 7:00 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >
> >> no
> >>
> >>
> >> On Thu, Apr 4, 2013 at 4:54 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> >>
> >>> Do you have any special properties set?
> >>> Like the pig.udf.profile one maybe..
> >>> D
> >>>
> >>>
> >>> On Thu, Apr 4, 2013 at 6:25 AM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >>>
> >>> > I'm running a simple script to add a sequence_number to a relation,
> >>> > sort the result and store to a file:
> >>> >
> >>> > a0 = load '<filename>' using PigStorage('\t','-schema');
> >>> > a1 = rank a0;
> >>> > a2 = foreach a1 generate col1 .. col16 , rank_a0 as sequence_number;
> >>> > a3 = order a2 by  sequence_number;
> >>> > store a3 into 'outputfile' using PigStorage('\t','-schema');
> >>> >
> >>> > I get the following error:
> >>> > org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 241 max=240
> >>> >     at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:61)
> >>> >     at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:68)
> >>> >     at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:174)
> >>> >     at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:278)
> >>> >     at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:303)
> >>> >     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
> >>> >     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
> >>> >     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:951)
> >>> >     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)
> >>> >
> >>> >
> >>> > we aren't able to up our counters any higher (policy) and I don't
> >>> > understand why I should need so many counters for such a simple script
> >>> > anyway?
> >>> > running Apache Pig version 0.11.1-SNAPSHOT (r: unknown)
> >>> > compiled Mar 22 2013, 10:19:19
> >>> >
> >>> > Can someone help?
> >>> >
> >>> > Thanks,
> >>> > Lauren
> >>> >
> >>>
> >>
> >>
> >
>
--
Sent from Gmail Mobile
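
For reference, the "max=240" in the trace is the Hadoop counters cap (mapreduce.job.counters.max in Hadoop 2, mapred.job.counters.limit in Hadoop 1), which defaults to 120 in stock Hadoop 2 and appears to have been raised to 240 on this cluster. A per-script override is sketched below purely as an illustration of where the number comes from; whether it takes effect depends on how the cluster enforces the limit, and this thread notes that policy forbids raising it here:

    -- Sketch only: property name is the Hadoop 2 counter cap; the value is illustrative.
    -- Counter limits are often enforced cluster-side as well, so this may be ignored.
    set mapreduce.job.counters.max '500';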
Lauren Blau 2013-04-05, 15:40