Pig, mail # user - More on issue with local vs mapreduce mode


RE: More on issue with local vs mapreduce mode
Sameer Tilak 2013-11-06, 15:18
Dear Serega,

When I run the script in local mode, I get the correct o/p stored in the AU/part-m-00000 file. However, when I run it in mapreduce mode (with i/p and o/p on HDFS), the file /scratch/AU/part-m-00000 has size 4 and appears to contain nothing.

I am not sure whether the AU relation is somehow not computed correctly or whether the problem happens during the storing stage.

Here are some of the console messages:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
1.0.3    0.11.1    p529444    2013-11-06 07:14:15    2013-11-06 07:14:40    UNKNOWN

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
job_201311011343_0042    1    0    6    6    6    6    0    0    0    0    A,AU    MULTI_QUERY,MAP_ONLY    /scratch/A,/scratch/AU,

Input(s):
Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"

Output(s):
Successfully stored 4 records (231 bytes) in: "/scratch/A"
Successfully stored 4 records (3 bytes) in: "/scratch/AU"

Counters:
Total records written : 8
Total bytes written : 234
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201311011343_0042
2013-11-06 07:14:40,352 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

Here is the o/p of the following command:

hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000

-rw-r--r--   1 username groupname          4 2013-11-06 07:14 /scratch/AU/part-m-00000

I am not sure whether DUMP will give me the correct result, but when I replaced STORE with DUMP AU in mapreduce mode, I got AU as:
()
()
()
()
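
One observation, offered as a guess rather than a diagnosis: a 4-byte file that looks empty would be consistent with four records that each have no fields, assuming the output is written as newline-terminated, delimiter-separated text lines (which is how Pig's default PigStorage writes tuples). The sketch below is plain Python, not Pig code, and serialize_records is a hypothetical helper used only to illustrate the byte count.

```python
def serialize_records(records, delimiter="\t"):
    """Render each tuple as one delimited text line ending in a newline.

    Hypothetical stand-in for a text-based storer; it is not Pig's code.
    """
    return "".join(
        delimiter.join(str(field) for field in record) + "\n"
        for record in records
    )


# Four tuples with no fields, like the four "()" rows shown by DUMP.
empty_output = serialize_records([(), (), (), ()])

# Each empty tuple contributes only its line terminator.
print(len(empty_output.encode("utf-8")))
print(repr(empty_output))
```

If this is what is happening, the store step itself is probably working, and the thing to check is why parser.customFilter returns empty tuples in mapreduce mode.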

> From: [EMAIL PROTECTED]
> Date: Wed, 6 Nov 2013 11:19:03 +0400
> Subject: Re: More on issue with local vs mapreduce mode
> To: [EMAIL PROTECTED]
>
> "The same script does not work in the mapreduce mode. "
> What does it mean?
>
>
> 2013/11/6 Sameer Tilak <[EMAIL PROTECTED]>
>
> > Hello,
> >
> > My script works perfectly in local mode. The same script does not work
> > in mapreduce mode. For local mode, the o/p is saved in the current
> > directory, whereas for mapreduce mode I use the /scratch directory on HDFS.
> >
> > Local mode:
> >
> > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value:
> > chararray);
> > DESCRIBE A;
> > STORE A into 'A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > STORE AU into 'AU';
> >
> >
> > Mapreduce mode:
> >
> > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray,
> > value: chararray);
> > DESCRIBE A;
> > STORE A into '/scratch/A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > STORE AU into '/scratch/AU';
> >
> > Can someone please point me to tools that I can use to debug the script in
> > mapreduce mode? Also, any thoughts on why this might be happening would be
> > great!
> >
> >