Pig >> mail # user >> More on issue with local vs mapreduce mode


RE: More on issue with local vs mapreduce mode
Show the code of your func. How did you define the input and output schema?
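As a rough illustration of the debugging step suggested earlier in this thread (log what the function accepts and what it produces), here is a minimal sketch in plain Java. It has no Pig dependency: `List<Object>` stands in for Pig's `Tuple`, and the class name `ExecLoggingSketch` and the substring-matching logic are hypothetical, not the actual `CustomFilter` code. Keep in mind that in mapreduce mode `exec` usually runs inside the map/reduce tasks on the cluster, so its log output normally lands in the task logs rather than on the client console.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for a Pig UDF: List<Object> plays the role of Tuple.
public class ExecLoggingSketch {
    private static final Logger log =
            Logger.getLogger(ExecLoggingSketch.class.getName());

    // Assumed behaviour, for illustration only: keep the key when it contains
    // the pattern, otherwise return an empty "tuple".
    public static List<Object> exec(List<Object> input) {
        // Log exactly what the function receives.
        log.info("exec input: " + input);

        List<Object> output = new ArrayList<>();
        if (input != null && input.size() == 2) {
            String key = String.valueOf(input.get(0));
            String pattern = String.valueOf(input.get(1));
            if (key.contains(pattern)) {
                output.add(key);
            }
        }

        // Log exactly what the function returns.
        log.info("exec output: " + output);
        return output;
    }

    public static void main(String[] args) {
        System.out.println(exec(Arrays.<Object>asList("xxAAAAAxx", "AAAAA")));
        System.out.println(exec(Arrays.<Object>asList("no match", "AAAAA")));
    }
}
```

Comparing the logged input and output for the same records in local and mapreduce mode should show whether the function receives different data or takes a different branch.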
On 07.11.2013 2:09, "Sameer Tilak" <[EMAIL PROTECTED]> wrote:

> Dear Serega,
> I am now using log4j for debugging my UDF. Here is what I found out. For
> some reason, in mapreduce mode my exec function does not get called. The
> log message in the constructor gets printed to the console. However, the
> log message in the exec function does not get printed to the console. In
> local mode all the log messages can be seen on the console.
>
> public Tuple exec(Tuple input) throws IOException {
>     log.info("Into the exec function");
>     // rest of the code
> }
>
> public CustomFilter() {
>     log.info("Hello World!");
> }
>
>
> > From: [EMAIL PROTECTED]
> > Date: Wed, 6 Nov 2013 21:19:10 +0400
> > Subject: Re: More on issue with local vs mapreduce mode
> > To: [EMAIL PROTECTED]
> >
> > You get 4 empty tuples.
> > Maybe your UDF parser.customFilter(key,'AAAAA') works differently? Maybe
> > you are using an old version?
> > You can add a print statement to the UDF and see what it accepts and what
> > it produces.
> >
> >
> > 2013/11/6 Sameer Tilak <[EMAIL PROTECTED]>
> >
> > > Dear Serega,
> > >
> > > When I run the script in local mode, I get the correct o/p stored in the
> > > AU/part-m-000 file. However, when I run it in mapreduce mode (with i/p
> > > and o/p from HDFS), the file /scratch/AU/part-m-000 is of size 4 and there
> > > is nothing in it.
> > >
> > > I am not sure whether the AU relation somehow does not get realized
> > > correctly or the problem happens during the storing stage.
> > >
> > > Here are some of the console messages:
> > >
> > > HadoopVersion    PigVersion    UserId     StartedAt              FinishedAt             Features
> > > 1.0.3            0.11.1        p529444    2013-11-06 07:14:15    2013-11-06 07:14:40    UNKNOWN
> > >
> > > Success!
> > >
> > > Job Stats (time in seconds):
> > > JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
> > > job_201311011343_0042    1    0    6    6    6    6    0    0    0    0    A,AU    MULTI_QUERY,MAP_ONLY    /scratch/A,/scratch/AU,
> > >
> > > Input(s):
> > > Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"
> > >
> > > Output(s):
> > > Successfully stored 4 records (231 bytes) in: "/scratch/A"
> > > Successfully stored 4 records (3 bytes) in: "/scratch/AU"
> > >
> > > Counters:
> > > Total records written : 8
> > > Total bytes written : 234
> > > Spillable Memory Manager spill count : 0
> > > Total bags proactively spilled: 0
> > > Total records proactively spilled: 0
> > >
> > > Job DAG:
> > > job_201311011343_0042
> > >
> > >
> > > 2013-11-06 07:14:40,352 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> > >
> > > Here is the o/p of the following command:
> > >
> > > hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000
> > >
> > > -rw-r--r--   1 username groupname          4 2013-11-06 07:14 /scratch/AU/part-m-00000
> > >
> > > I am not sure whether DUMP will give me the correct result, but when I
> > > replaced store with dump AU in mapreduce mode, I got
> > > AU as:
> > > ()
> > > ()
> > > ()
> > > ()
> > >
> > > > From: [EMAIL PROTECTED]
> > > > Date: Wed, 6 Nov 2013 11:19:03 +0400
> > > > Subject: Re: More on issue with local vs mapreduce mode
> > > > To: [EMAIL PROTECTED]
> > > >
> > > > "The same script does not work in the mapreduce mode. "
> > > > What does it mean?
> > > >
> > > >
> > > > 2013/11/6 Sameer Tilak <[EMAIL PROTECTED]>
> > > >
> > > > > Hello,
> > > > >
> > > > > My script in the local mode works perfectly. The same script does not work
> > > > > in the mapreduce mode. For the local mode, the o/p is saved in the current
> > > > > directory, whereas for the mapreduce mode I use the /scratch directory