Pig >> mail # user >> More on issue with local vs mapreduce mode


RE: More on issue with local vs mapreduce mode
Dear Serega,

When I run the script in local mode, I get the correct o/p stored in the AU/part-m-000 file. However, when I run it in mapreduce mode (with i/p and o/p on HDFS), the file /scratch/AU/part-m-000 is only 4 bytes and contains nothing.

I am not sure whether the AU relation somehow does not get realized correctly, or whether the problem happens during the store stage.

Here are some of the console messages:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
1.0.3    0.11.1    p529444    2013-11-06 07:14:15    2013-11-06 07:14:40    UNKNOWN

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
job_201311011343_0042    1    0    6    6    6    6    0    0    0    0    A,AU    MULTI_QUERY,MAP_ONLY    /scratch/A,/scratch/AU,

Input(s):
Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"

Output(s):
Successfully stored 4 records (231 bytes) in: "/scratch/A"
Successfully stored 4 records (3 bytes) in: "/scratch/AU"

Counters:
Total records written : 8
Total bytes written : 234
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201311011343_0042
2013-11-06 07:14:40,352 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

Here is the o/p of the following command:

hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000

-rw-r--r--   1 username groupname          4 2013-11-06 07:14 /scratch/AU/part-m-00000
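To see exactly what those 4 bytes are, one option (a sketch; `fs -cat` streams the file and `od -c` prints each byte, so you can tell empty tuples from whitespace) is:

```shell
hadoop --config $HADOOP_CONF_DIR fs -cat /scratch/AU/part-m-00000 | od -c
```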

I am not sure whether DUMP will give me the correct result, but when I replaced STORE with DUMP AU in mapreduce mode, I get
AU as:
()
()
()
()
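One way to narrow down whether the UDF is even seeing the keys on the cluster (a sketch reusing the script's own `parser.customFilter`; the `/scratch/AU_debug` path is just a hypothetical scratch location) is to store the raw key alongside the UDF result:

```
A  = LOAD '/scratch/file.seq' USING SequenceFileLoader
         AS (key: chararray, value: chararray);
-- Keep the key next to the UDF output: if the keys appear but the UDF
-- columns are empty, the UDF is returning empty tuples on the cluster,
-- not failing to receive input.
AU = FOREACH A GENERATE key, FLATTEN(parser.customFilter(key, 'AAAAA'));
STORE AU INTO '/scratch/AU_debug';
```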

> From: [EMAIL PROTECTED]
> Date: Wed, 6 Nov 2013 11:19:03 +0400
> Subject: Re: More on issue with local vs mapreduce mode
> To: [EMAIL PROTECTED]
>
> "The same script does not work in the mapreduce mode. "
> What does it mean?
>
>
> 2013/11/6 Sameer Tilak <[EMAIL PROTECTED]>
>
> > Hello,
> >
> > My script in the local mode works perfectly. The same script does not work
> > in the mapreduce mode. For the local mode, the o/p is saved in the current
> > directory, whereas for the mapreduce mode I use the /scratch directory on HDFS.
> >
> > Local mode:
> >
> > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value:
> > chararray);
> > DESCRIBE A;
> > STORE A into 'A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > STORE AU into 'AU';
> >
> >
> > Mapreduce mode:
> >
> > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray,
> > value: chararray);
> > DESCRIBE A;
> > STORE A into '/scratch/A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > STORE AU into '/scratch/AU';
> >
> > Can someone please point me to tools that I can use to debug the script in
> > mapreduce mode? Also, any thoughts on why this might be happening would be
> > great!
> >
> >
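Regarding tools for debugging in mapreduce mode: Pig's built-in diagnostic operators are the usual starting point (a sketch from the Grunt shell, assuming the aliases from the script above):

```
grunt> DESCRIBE AU;    -- print the schema Pig has inferred for AU
grunt> EXPLAIN AU;     -- show the logical, physical and MapReduce plans
grunt> ILLUSTRATE AU;  -- run sample records through each operator step
```

ILLUSTRATE in particular shows what each operator, including the UDF call, produces for a small sample, which can reveal where the empty tuples first appear.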