Pig >> mail # user >> Any reason a bunch of nearly-identical jobs would suddenly stop working?


Re: Any reason a bunch of nearly-identical jobs would suddenly stop working?

Also, reading some uncompressed data off the same cluster using
PigStorage shows a failure to even read the data in the first place :|
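
A minimal version of that PigStorage sanity check might look like the following sketch;
the path here is hypothetical, and the comma delimiter is only assumed to match the
LzoTokenizedLoader(',') call used later in the thread:

    # one-off load/dump of a small uncompressed file using Pig's built-in loader;
    # the path is a placeholder, not one of the real /rawfiles directories
    pig -e "raw = LOAD '/rawfiles/some/uncompressed/part' USING PigStorage(','); dump raw;"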

-K

On Tue, Mar 08, 2011 at 09:24:18PM -0500, Kris Coward wrote:
>
> None of the nodes have more than 20% utilization on any of their disks;
> so it must be the cluster figuring that it can get away with this sort
> of thing when the sysadmin's not around to set it straight.. clearly a
> cluster of redundant/load-sharing sysadmins is also needed :)
>
> -K
>
> On Tue, Mar 08, 2011 at 03:24:50PM -0800, Dmitriy Ryaboy wrote:
> > Check task logs. I am guessing you ran out of either hdfs or local disk on
> > the nodes.
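
A rough sketch of how to check both suspects, assuming the stock Hadoop command-line
tools of this era (the local directory below is only a placeholder for whatever
mapred.local.dir points at on the worker nodes):

    # per-datanode HDFS capacity, used space, and space remaining
    hadoop dfsadmin -report
    # space consumed under HDFS /tmp, where the failing job's temporary output lands
    hadoop fs -du /tmp
    # local disk on a tasktracker node (path is a placeholder for mapred.local.dir)
    df -h /var/lib/hadoop/mapred/local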
> >
> > Also, never let your sysadmin go on vacation, that's what makes things
> > break! :)
> >
> > D
> >
> > On Tue, Mar 8, 2011 at 2:53 PM, Kris Coward <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > So I queued up a batch of jobs last night to run overnight (and into the
> > > day a bit, owing to a bottleneck on the scheduler the way that things
> > > are currently implemented), made sure they were running correctly, went
> > > to sleep, and when I woke up in the morning, they were failing all over
> > > the place.
> > >
> > > Since each of these jobs was basically the same pig script being run with
> > > a different set of parameters, I tried re-running it with the
> > > parameters that it had run (successfully) with the night before, and it
> > > also failed. So I started whittling away at steps to try and find the
> > > origin of the failure, until I was even getting a failure loading the
> > > initial data, and dumping it out. Basically, I've reduced things to a
> > > matter of
> > >
> > > apa = LOAD
> > > '/rawfiles/08556ecf5c6841d59eb702e9762e649a/{1296432000,1296435600,1296439200,1296442800,1296446400,1296450000,1296453600,1296457200,1296460800,1296464400,1296468000,1296471600,1296475200,1296478800,1296482400,1296486000,1296489600,1296493200,1296496800,1296500400,1296504000,1296507600,1296511200,1296514800}/*/apa'
> > > USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS
> > > (timestamp:long, type:chararray, appkey:chararray, uid:chararray,
> > > uniq:chararray, shortUniq:chararray, profUid:chararray, addr:chararray,
> > > ref:chararray);
> > > dump apa;
> > >
> > > and after getting all the happy messages from the loader like:
> > >
> > > 2011-03-08 21:48:46,454 [Thread-12] INFO
> > > com.twitter.elephantbird.pig.load.LzoBaseLoadFunc - Got 117 LZO slices in
> > > total.
> > > 2011-03-08 21:48:48,044 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 0% complete
> > > 2011-03-08 21:50:17,612 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 100% complete
> > >
> > > It went straight to:
> > >
> > > 2011-03-08 21:50:17,612 [main] ERROR
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 1 map reduce job(s) failed!
> > > 2011-03-08 21:50:17,662 [main] ERROR
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Failed to produce result in:
> > > "hdfs://master.hadoop:9000/tmp/temp-2121884028/tmp-268519128"
> > > 2011-03-08 21:50:17,664 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Failed!
> > > 2011-03-08 21:50:17,668 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > > ERROR 1066: Unable to open iterator for alias apa
> > > Details at logfile: /home/kris/pig_1299620898192.log
> > >
> > > And looking at the stack trace in the logfile, I've got:
> > >
> > > Pig Stack Trace
> > > ---------------
> > > ERROR 1066: Unable to open iterator for alias apa
> > >
> > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
> > > open iterator for alias apa
> > >        at org.apache.pig.PigServer.openIterator(PigServer.java:482)
> > >        at
> > > org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)

Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3