Pig >> mail # user >> Tuples in UDF and null


Re: Tuples in UDF and null
Yes, my mistake. I meant to FLATTEN it and then reference it directly. I'll
look at FILTER. What I really need is something that lets me filter rows
that have a UUID followed only by \t (delimiters) or \n.
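
For readers following along, here is a minimal sketch of the FLATTEN-then-FILTER approach described above. It reuses the path and bag schema from the grunt session quoted below. The aliases flat and kept and the field names id and val are made up for illustration, and wrapping the LOAD schema in parentheses is an assumption about the intended syntax, not something taken from the thread.

```pig
-- Load one bag per row, using the path and schema from the grunt session in this thread.
-- Assumption: the schema is wrapped in parentheses here.
A = LOAD '/user/apuser/test/data1' AS (b:bag{t:tuple(a:chararray, b:chararray)});

-- FLATTEN un-nests the bag, so each inner tuple becomes an ordinary two-field row.
-- id and val are hypothetical names chosen to avoid reusing "b" for both bag and field.
flat = FOREACH A GENERATE FLATTEN(b) AS (id:chararray, val:chararray);

-- Keep only rows whose second field is present and non-empty.
kept = FILTER flat BY val IS NOT NULL AND val != '';

DUMP kept;
```

With the four sample rows shown in this thread, only the row carrying a second value would be expected to pass the filter. Checking both IS NOT NULL and != '' is a defensive choice, since the thread does not settle whether a missing field is loaded as null or as an empty string.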

On Thu, Mar 7, 2013 at 12:11 PM, Harsha <[EMAIL PROTECTED]> wrote:

> Mohit,
>    A = LOAD '/user/apuser/test/data1' AS b:bag{
> Here you are naming your data bag b.
> If you want to refer to values inside the data bag, try b.a or b.b.
> The sample data I gave you was just a random example. If you are trying to skip
> over nulls,
> you can do so by using FILTER.
> Take a look at http://pig.apache.org/docs/r0.11.0/
> -Harsha
>
>
> --
> Harsha
>
>
> On Thursday, March 7, 2013 at 11:58 AM, Mohit Anchlia wrote:
>
> > So I did this. I took your example and put it in a file and ran some pig
> > commands through grunt but I am getting same results from a bag and
> > generating from tuple. I might be doing something wrong here.
> >
> > grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray,
> > b:chararray)};
> > grunt> dump A;
> > 2013-03-07 14:55:25,125 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> > paths to process : 1
> > ({(1,)})
> > ({(3,)})
> > ({(5,10)})
> > ({(7,)})
> >
> > grunt> b = foreach A generate b;
> > grunt> dump b;
> > 2013-03-07 14:57:59,509 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> > paths to process : 1
> > ({(1,)})
> > ({(3,)})
> > ({(5,10)})
> > ({(7,)})
> > grunt>
> >
> > I get the same output again.
> >
> >
> > On Thu, Mar 7, 2013 at 11:40 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >
> > > Good suggestion. Let me try that.
> > >
> > >
> > > On Thu, Mar 7, 2013 at 11:27 AM, Harsha <[EMAIL PROTECTED]> wrote:
> > >
> > > > It will be easier if you take some sample data and run it through the
> > > > grunt shell.
> > > > Let's say you have a dataset like this:
> > > > ({(1,)})
> > > > ({(3,)})
> > > > ({(5,10)})
> > > > ({(7,)})
> > > >
> > > > Some rows have a null "b" and some rows have a value for "b".
> > > > If you do a "generate" on the above, it will run through each row
> > > > and try to fetch the value of b; if there is none, it will emit ().
> > > > Something like this:
> > > >
> > > > ({()})
> > > > ({()})
> > > > ({(10)})
> > > > ({()})
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Harsha
> > > >
> > > >
> > > > On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:
> > > >
> > > > > Sorry, yes, my question was about accessing b, not $1. What's the effect
> > > > > of writing an empty () to a file? Say I did store b into temp: should I
> > > > > expect a line, or does nothing get written to the file at all?
> > > > >
> > > > > On Thu, Mar 7, 2013 at 10:53 AM, Harsha <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > From your schema b:bag{t:tuple(a:chararray, b:chararray)},
> > > > > > your tuple is inside a bag, so if on the next line you try to access it
> > > > > > through $1, Pig will
> > > > > > throw an error saying the column does not exist.
> > > > > > But if your question is about accessing b, then it will print an empty ()
> > > > > > if there is no value present (as you are setting it to null).
> > > > > >
> > > > > > --
> > > > > > Harsha
> > > > > >
> > > > > >
> > > > > > On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:
> > > > > >
> > > > > > > Thanks! Does "generate" skip over that? If I did b = foreach B generate $1,
> > > > > > > what should be the expected outcome of alias "b"?
> > > > > > >
> > > > > > > On Thu, Mar 7, 2013 at 10:31 AM, Harsha <[EMAIL PROTECTED]> wrote: