Pig >> mail # user >> Any reason a bunch of nearly-identical jobs would suddenly stop working?


Thread:
  Kris Coward 2011-03-08, 22:53
  Dmitriy Ryaboy 2011-03-08, 23:24
  Kris Coward 2011-03-09, 02:24
  Kris Coward 2011-03-09, 22:29

Re: Any reason a bunch of nearly-identical jobs would suddenly stop working?
Question: do normal MapReduce jobs run on this cluster? The example jar jobs, for instance?
Guy
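A quick way to answer that, assuming shell access to the master node; the examples jar name and location vary by Hadoop version, so both are hypothetical here:

```shell
# Run a trivial stock MapReduce job to check the cluster itself is healthy.
# The examples jar name/path are version-dependent (hypothetical here);
# 'pi 2 10' runs 2 map tasks estimating pi with 10 samples each.
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar pi 2 10
```

If this fails too, the problem is in the cluster rather than in the Pig script or the LZO loader.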

On Mar 9, 2011, at 2:29 PM, Kris Coward <[EMAIL PROTECTED]> wrote:

>
> Also, reading some uncompressed data off the same cluster using
> PigStorage shows a failure to even read the data in the first place :|
>
> -K
>
> On Tue, Mar 08, 2011 at 09:24:18PM -0500, Kris Coward wrote:
>>
>> None of the nodes have more than 20% utilization on any of their disks;
>> so it must be the cluster figuring that it can get away with this sort
>> of thing when the sysadmin's not around to set it straight... clearly a
>> cluster of redundant/load-sharing sysadmins is also needed :)
>>
>> -K
>>
>> On Tue, Mar 08, 2011 at 03:24:50PM -0800, Dmitriy Ryaboy wrote:
>>> Check task logs. I am guessing you ran out of either HDFS or local disk on
>>> the nodes.
>>>
>>> Also, never let your sysadmin go on vacation, that's what makes things
>>> break! :)
>>>
>>> D
>>>
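Both of the suspects Dmitriy names can be checked quickly; the commands below are the 0.20-era ones, and the local scratch path is hypothetical (whatever mapred.local.dir points at on your nodes):

```shell
# HDFS side: per-datanode capacity, DFS used and remaining
# (needs HDFS superuser rights).
hadoop dfsadmin -report

# Local-disk side, on each worker node: the scratch space map tasks
# spill to. Path is hypothetical; check mapred.local.dir in your config.
df -h /hadoop/mapred/local
```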
>>> On Tue, Mar 8, 2011 at 2:53 PM, Kris Coward <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> So I queued up a batch of jobs last night to run overnight (and into the
>>>> day a bit, owing to a bottleneck on the scheduler the way that things
>>>> are currently implemented), made sure they were running correctly, went
>>>> to sleep, and when I woke up in the morning, they were failing all over
>>>> the place.
>>>>
>>>> Since each of these jobs was basically the same pig script being run with
>>>> a different set of parameters, I tried re-running it with the
>>>> parameters that it had run (successfully) with the night before, and it
>>>> also failed. So I started whittling away at steps to try and find the
>>>> origin of the failure, until I was even getting a failure loading the
>>>> initial data, and dumping it out. Basically, I've reduced things to a
>>>> matter of
>>>>
>>>> apa = LOAD
>>>> '/rawfiles/08556ecf5c6841d59eb702e9762e649a/{1296432000,1296435600,1296439200,1296442800,1296446400,1296450000,1296453600,1296457200,1296460800,1296464400,1296468000,1296471600,1296475200,1296478800,1296482400,1296486000,1296489600,1296493200,1296496800,1296500400,1296504000,1296507600,1296511200,1296514800}/*/apa'
>>>> USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS
>>>> (timestamp:long, type:chararray, appkey:chararray, uid:chararray,
>>>> uniq:chararray, shortUniq:chararray, profUid:chararray, addr:chararray,
>>>> ref:chararray);
>>>> dump apa;
>>>>
>>>> and after getting all the happy messages from the loader like:
>>>>
>>>> 2011-03-08 21:48:46,454 [Thread-12] INFO
>>>> com.twitter.elephantbird.pig.load.LzoBaseLoadFunc - Got 117 LZO slices in
>>>> total.
>>>> 2011-03-08 21:48:48,044 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 0% complete
>>>> 2011-03-08 21:50:17,612 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 100% complete
>>>>
>>>> It went straight to:
>>>>
>>>> 2011-03-08 21:50:17,612 [main] ERROR
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 1 map reduce job(s) failed!
>>>> 2011-03-08 21:50:17,662 [main] ERROR
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - Failed to produce result in:
>>>> "hdfs://master.hadoop:9000/tmp/temp-2121884028/tmp-268519128"
>>>> 2011-03-08 21:50:17,664 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - Failed!
>>>> 2011-03-08 21:50:17,668 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>>> ERROR 1066: Unable to open iterator for alias apa
>>>> Details at logfile: /home/kris/pig_1299620898192.log
>>>>
>>>> And looking at the stack trace in the logfile, I've got:
>>>>
>>>> Pig Stack Trace
>>>> ---------------
>>>> ERROR 1066: Unable to open iterator for alias apa
>>>>
>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
>>>> open iterator for alias apa
>>>>       at org.apache.pig.PigServer.openIterator(PigServer.java:482)
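As an aside, the brace list in that LOAD path is 24 consecutive hourly epoch timestamps starting at 1296432000 (2011-01-31 00:00 UTC), so a glob like that can be generated rather than hand-maintained; a minimal sketch:

```shell
# Rebuild the 24-bucket brace list: hourly epoch timestamps starting
# at 1296432000, joined with commas and wrapped in braces.
START=1296432000
GLOB="{$(seq $START 3600 $((START + 23 * 3600)) | paste -sd, -)}"
echo "$GLOB"
```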
  Mridul Muralidharan 2011-03-10, 01:29