Pig >> mail # user >> Problem with Pig Store command


Re: Problem with Pig Store command
I'm not sure then. Maybe ask other people for suggestions.

The fact that the output path is not absolute seems suspicious. Also try using ','
instead of a space as the delimiter. Did you try

store W into '/tmp/wordtesting' using PigStorage(',');

and see if that does the trick?

Err, let's see... maybe you're looking at the wrong hadoop cluster? Within the
same grunt session where you run the store above, did you try

ls /tmp/wordtesting

and see if that lists anything? If it does, while hadoop fs -ls from your shell
shows nothing, then your hadoop client and pig are pointing to different hadoop
clusters.
imo.
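
A rough sketch of the check suggested above, not something from the thread: the listing quoted below is the entry for the output directory itself, and hadoop fs -ls typically reports directory entries with size 0, so the more telling check is to list inside the directory and look for non-empty part files. The helper names here and the sample part-m-00000 file name are assumptions for illustration; the parser assumes the usual eight-column hadoop fs -ls layout.

```python
def parse_ls_line(line):
    """Split one `hadoop fs -ls` entry into (is_dir, size_bytes, path).

    Assumes the usual column layout:
    permissions, replication, owner, group, size, date, time, path.
    Returns None for lines that do not match, such as "Found N items".
    """
    fields = line.split()
    if len(fields) < 8 or not fields[4].isdigit():
        return None
    is_dir = fields[0].startswith("d")
    return is_dir, int(fields[4]), " ".join(fields[7:])


def nonempty_part_files(listing):
    """Return the paths of part-* files with size > 0 in an ls listing."""
    found = []
    for line in listing.splitlines():
        entry = parse_ls_line(line)
        if entry is None:
            continue
        is_dir, size, path = entry
        name = path.rsplit("/", 1)[-1]
        # Directory entries are skipped: their size column says nothing
        # about how much data the store wrote.
        if not is_dir and name.startswith("part-") and size > 0:
            found.append(path)
    return found
```

Feeding it the output of hadoop fs -ls /tmp/wordtesting (or of the relative path, /user/awang/wordtesting) gives an empty list only when no part file holds data, which is the case worth debugging further.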

On Tue, Sep 21, 2010 at 2:53 PM, Alex Wang <[EMAIL PROTECTED]> wrote:

> Hi hc,
>
> Sorry that I didn't mention it. But load works ok. Here is a portion of the
> output of dump W
>
> (2162,4111,yellow,a)
> (4652,1317,yep,interjection)
> (157,60592,yes,interjection)
> (533,19459,yesterday,adv)
> (265,35058,yet,adv)
> (4040,1626,yield,n)
> (3339,2139,yield,v)
>
> Only the store command is not working...
>
> Alex
>
>
> On Tue, Sep 21, 2010 at 2:48 PM, hc busy <[EMAIL PROTECTED]> wrote:
>
> > Probably because the load failed.
> >
> > W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int,
> >     name:chararray, type:chararray);
> > T = group W all;
> > U = foreach T generate COUNT(W);
> > dump U;
> >
> > will probably say that the wordbag contained nothing. Debug the loading
> > portion to fix this problem.
> >
> >
> >
> >
> > On Tue, Sep 21, 2010 at 1:50 PM, Alex Wang <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > I am using pig 0.7.0 in hadoop mapreduce mode.
> > >
> > >
> > >
> > > The problem I have is that I simply can't use
> > >
> > >
> > >
> > > STORE alias INTO 'directory' USING PigStorage();
> > >
> > >
> > >
> > > I can load dataset in, write UDFs to manipulate the dataset, but I can't
> > > store it. The output is a directory in HDFS with 0 bytes.
> > >
> > >
> > >
> > > As an example, I've been testing with a simple script:
> > >
> > >
> > >
> > > W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int,
> > >     name:chararray, type:chararray);
> > >
> > > store W into 'wordtesting' using PigStorage(' ');
> > >
> > >
> > >
> > > I run the code in grunt, and the output of hadoop fs -ls is:
> > >
> > >
> > >
> > > drwxr-xr-x   - awang supergroup          0 2010-09-21 13:45 /user/awang/wordtesting
> > >
> > >
> > >
> > > The grunt messages are:
> > >
> > >
> > >
> > > grunt> store filteredW into 'wordtesting' using PigStorage(' ');
> > >
> > > 2010-09-21 13:45:35,210 [main] INFO
> > > org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> > > - No column pruned for W
> > >
> > > 2010-09-21 13:45:35,210 [main] INFO
> > > org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> > > - No map keys pruned for W
> > >
> > > 2010-09-21 13:45:35,440 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
> > > - (Name: Store(hdfs://pineal:9000/user/awang/wordtesting:PigStorage(' ')) -
> > > 1-46 Operator Key: 1-46)
> > >
> > > 2010-09-21 13:45:35,498 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> > > - MR plan size before optimization: 1
> > >
> > > 2010-09-21 13:45:35,498 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> > > - MR plan size after optimization: 1
> > >
> > > 2010-09-21 13:45:35,549 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> > > - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> > >
> > > 2010-09-21 13:45:38,100 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> > > - Setting up single store job
> > >
> > > 2010-09-21 13:45:38,166 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 1 map-reduce job(s) waiting for submission.
> > >
> > > 2010-09-21 13:45:38,173 [Thread-15] WARN
> > >  org.apache.hadoop.mapred.JobClient