Pig >> mail # user >> Commands not working properly when stored in pig file


Re: Commands not working properly when stored in pig file
Hi, Mix:
"second map reduce started executing before first one got completed"
Interesting. Since you only LOAD evnt_dtl, without a DUMP or STORE on it,
Pig shouldn't do anything with it, especially before the STORE command completes.

I have the script below and it works fine, so I think the root cause is
something else. Unless your data is very big?
a = load 'words_and_numbers' as (f1:chararray, f2:chararray);
b = filter a by f1 is not null;
store (foreach (group b all) generate flatten($1)) into 'multipleload/tmp';
c = load 'multipleload/tmp/part-r-00000' as (f3:chararray, f4:chararray);
dump c;
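
A sketch of one thing that might be worth trying, assuming the cause is that
Pig's batch-mode multi-query execution does not see the second LOAD as
depending on the STORE (nothing in this thread confirms that). It reuses the
test script above, adds an exec statement so the STORE runs before anything
after it, and loads the stored directory itself instead of a single part file:

```pig
-- Same test data and paths as the script above; nothing here is taken from x.pig.
a = load 'words_and_numbers' as (f1:chararray, f2:chararray);
b = filter a by f1 is not null;
store (foreach (group b all) generate flatten($1)) into 'multipleload/tmp';

-- In batch mode, exec with no arguments runs all statements above it
-- before Pig continues with the rest of the script.
exec;

-- Load the directory that was just stored, not an individual part file.
c = load 'multipleload/tmp' as (f3:chararray, f4:chararray);
dump c;
```

Running the script with multi-query execution turned off (the -no_multiquery
option of the pig command) may be another way to check the same assumption.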

Johnny
On Wed, Mar 27, 2013 at 4:07 PM, Mix Nin <[EMAIL PROTECTED]> wrote:

> I guess the second map reduce started executing before first one got
> completed. Below is the error log:
>
> 2013-03-27 15:48:08,902 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job4695026384513564120.jar
> 2013-03-27 15:48:13,983 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job4695026384513564120.jar created
> 2013-03-27 15:48:13,993 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2013-03-27 15:48:14,052 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 2 map-reduce job(s) waiting for submission.
>
> Failed Jobs:
> JobId   Alias   Feature Message Outputs
> N/A     1-18,1-19,FACT_PXP_EVNT_DTL,evnt_dtl    GROUP_BY        Message:
> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input
> path does not exist: hdfs:///user/lnindrakrishna/exp/part-r-00000
>
> When I run the statements individually in the grunt shell, one by one, I
> don't see this problem.
>
>
> On Wed, Mar 27, 2013 at 3:45 PM, Mix Nin <[EMAIL PROTECTED]> wrote:
>
> > yes the file exists in HDFS.
> >
> >
> > On Wed, Mar 27, 2013 at 3:16 PM, Johnny Zhang <[EMAIL PROTECTED]> wrote:
> >
> >> Mix,
> >> 'null' is the failed job ID. From what I can tell, there is only one STORE
> >> command and it actually failed, so MapReduceLauncher tries to stop
> >> all dependent jobs; that's why the message is thrown. Can you double-check
> >> if the file exists in HDFS?
> >>
> >> Johnny
> >>
> >>
> >> On Wed, Mar 27, 2013 at 2:58 PM, Mix Nin <[EMAIL PROTECTED]> wrote:
> >>
> >> > Sorry for posting same issue multiple times
> >> >
> >> > I  wrote a pig script as follows and stored it in x.pig file
> >> >
> >> > Data = LOAD '/....' as (,,,, )
> >> > NoNullData= FILTER Data by qe is not null;
> >> > STORE (foreach (group NoNullData all) generate flatten($1))  into
> >> > 'exp/$inputDatePig';
> >> >
> >> >
> >> > evnt_dtl =LOAD 'exp/$inputDatePig/part-r-00000' AS (cust,,,,,)
> >> >
> >> >
> >> >
> >> > I executed the command as follows
> >> >
> >> > pig  -f x.pig -param inputDatePig=03272013
> >> >
> >> >
> >> > And finally it says exp/03272013 does not exist, though the directory
> >> > exists, as it gets created by the STORE command.
> >> >
> >> > What is wrong with this?
> >> >
> >> >
> >> > This is the error I get
> >> >
> >> >
> >> >
> >> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 32% complete
> >> > 2013-03-27 14:38:35,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> >> > 2013-03-27 14:38:45,731 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
> >> > 2013-03-27 14:38:45,731 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> >> > 2013-03-27 14:38:45,734 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: