Pig, mail # user - output


Re: output
Keren Ouaknine 2013-07-25, 06:02
Got it, thanks Prashant!

On Wed, Jul 24, 2013 at 10:41 PM, Prashant Kommireddi
<[EMAIL PROTECTED]> wrote:

> PigStorage uses tab as the default field delimiter. Is 1.txt tab delimited?
> If not, you need to pass a space as the delimiter to the constructor
> when loading: PigStorage(' ').
>
> OR simply edit 1.txt to be tab delimited and your script should work.
>
> The reason you see all empty fields, I believe, is that the first field is
> declared as double. Without the right delimiter, the entire record is read as
> the first field, which cannot be cast to a double (so it becomes null), and
> the remaining two fields are missing, so they are null as well.
>
> On Wednesday, July 24, 2013, Keren Ouaknine wrote:
>
> > Hi Prashant,
> >
> > Thanks! There's progress: I see the output directories are created, but
> > the result is not what I expect.
> > The script:
> > A = load '1.txt'  as (x:double, y:chararray, z:chararray);
> > dump A;
> > store A into '/home/kereno/Documents/pig-0.11.1/workspace/res-dump2' USING PigStorage (',');
> >
> > My file 1.txt looks like this:
> > ~/Documents/pig-0.11.1/workspace 0$ more 1.txt
> > 1 a aleph
> > 2 b bet
> > 3 g gimel
> >
> > In part-m-00000 I am getting the following:
> > ~/Documents/pig-0.11.1/workspace 0$ more res-dump2/part-m-00000
> > ,,
> > ,,
> > ,,
> >
> > [these are all commas]
> >
> > When I ran the query from grunt, the output files were stored on HDFS (I
> > guess that's the default behavior?)
> > Same output of course :)
> >
> > My question is: how come the dump doesn't show the content of 1.txt?
> > By the way, my ultimate purpose is to test a corner case of union, but I am
> > trying to get simple dumps to work first :)
> >
> > Thanks,
> > Keren
> >
> > On Wed, Jul 24, 2013 at 8:22 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >
> > > Can you paste your pig script here? Are you using STORE to write output to
> > > a certain directory? http://pig.apache.org/docs/r0.11.1/basic.html#store
> > >
> > >
> > > On Wed, Jul 24, 2013 at 7:47 PM, Keren Ouaknine <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I am running a Pig script and it completes successfully:
> > > >
> > > > Input(s):
> > > > Successfully read 3 records (388 bytes) from:
> > > > "hdfs://localhost:54310/user/kereno/1.txt"
> > > >
> > > > *Output(s):*
> > > > *Successfully stored 3 records (9 bytes) in:
> > > > "/home/kereno/Documents/pig-0.11.1/workspace/res-dump2"*
> > > >
> > > > *Problem:* When I look for the output file (res-dump2), there is no such
> > > > directory.
> > > > Same problem whether I run the script or use grunt. My query loads a
> > > > simple file and dumps it!!
> > > >
> > > > Any ideas what could be the issue?
> > > > Running on Apache Hadoop v1.2 and Pig 0.11.1
> > > >
> > > > Thanks,
> > > > Keren
> > > >
> > > > --
> > > > Keren Ouaknine
> > > > Web: www.kereno.com
> > > >
> > >
> >
> >
> >
> > --
> > Keren Ouaknine
> > Web: www.kereno.com
> >
>

--
Keren Ouaknine
Web: www.kereno.com
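
For reference, here is a minimal sketch of the first fix suggested in the thread: passing a space to PigStorage when loading. It assumes 1.txt is space delimited, as in the listing quoted above, and reuses the schema and output path from the original script. It has not been run against the poster's setup, so treat it as an illustration rather than a verified solution.

```pig
-- Assumption: fields in 1.txt are separated by a single space, not a tab.
-- PigStorage(' ') overrides the default tab delimiter for the load.
A = LOAD '1.txt' USING PigStorage(' ') AS (x:double, y:chararray, z:chararray);

-- With the matching delimiter, each record should split into three fields
-- instead of being read as one field that fails the cast to double.
DUMP A;

-- The store delimiter is independent of the load delimiter.
STORE A INTO '/home/kereno/Documents/pig-0.11.1/workspace/res-dump2' USING PigStorage(',');
```

The alternative from the thread, converting 1.txt to tab-delimited text, needs no script change, because tab is what PigStorage expects when no delimiter is given.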