Pig, mail # user - Tuples in UDF and null


Mohit Anchlia 2013-03-07, 00:58
Mohit Anchlia 2013-03-07, 18:10
Harsha 2013-03-07, 18:31
Mohit Anchlia 2013-03-07, 18:35
Harsha 2013-03-07, 18:53
Mohit Anchlia 2013-03-07, 19:15
Harsha 2013-03-07, 19:27
Mohit Anchlia 2013-03-07, 19:40
Mohit Anchlia 2013-03-07, 19:58
Harsha 2013-03-07, 20:11
Re: Tuples in UDF and null
Mohit Anchlia 2013-03-07, 20:34
Yes, my mistake. I meant to FLATTEN it and then reference the fields
directly. I'll look at FILTER. What I really need is a way to filter rows
that have a UUID followed by only \t (the delimiters) or \n.
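
A minimal sketch of that kind of filter, under stated assumptions: the input path is a placeholder, each line is read whole with TextLoader, and a UUID is taken to be 36 hexadecimal-and-hyphen characters. None of these details come from the thread. MATCHES uses Java regular expressions and must match the entire string, so the pattern covers a line that holds a UUID followed by nothing but tabs.

```pig
-- Sketch only: 'input_path' is a placeholder, not a path from this thread.
-- TextLoader reads each input line as a single chararray (newline removed).
raw = LOAD 'input_path' USING TextLoader() AS (line:chararray);

-- Assumed UUID shape: 36 characters drawn from hex digits and hyphens.
-- The pattern matches a line that is a UUID followed only by tabs (or by nothing).
kept = FILTER raw BY NOT (line MATCHES '[0-9a-fA-F-]{36}\\t*');

DUMP kept;
```

If the data uses a different delimiter or UUID format, the pattern would need to change accordingly.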

On Thu, Mar 7, 2013 at 12:11 PM, Harsha <[EMAIL PROTECTED]> wrote:

> Mohit,
>    A = LOAD '/user/apuser/test/data1' AS b:bag{
> Here you are naming your data bag b.
> To refer to values inside the bag, try b.a or b.b.
> The sample data I gave you was just a random example. If you are trying to
> skip over nulls, you can do so with FILTER.
> Take a look at http://pig.apache.org/docs/r0.11.0/
> -Harsha
>
>
> --
> Harsha
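
The two suggestions above (projecting with b.a or b.b, and skipping nulls with FILTER) could look roughly like this. It is a sketch, not a tested script: the path and schema are copied from the grunt session quoted further down, the schema is written in its parenthesised form, and the field names id and val are hypothetical names chosen to avoid reusing "b". Whether an empty second field arrives as null or as an empty string depends on how the data was loaded, so the filter checks both.

```pig
-- Sketch only: path and schema as quoted in this thread.
A = LOAD '/user/apuser/test/data1' AS (b:bag{t:tuple(a:chararray, b:chararray)});

-- Projecting inside the bag keeps the bag and narrows each inner tuple to field b.
P = FOREACH A GENERATE b.b;

-- FLATTEN removes the bag level, giving one row per inner tuple.
-- id and val are hypothetical field names.
F = FOREACH A GENERATE FLATTEN(b) AS (id, val);

-- Keep only rows whose second field is actually present.
G = FILTER F BY (val IS NOT NULL) AND (val != '');

DUMP G;
```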
>
>
> On Thursday, March 7, 2013 at 11:58 AM, Mohit Anchlia wrote:
>
> > So I did this. I took your example, put it in a file, and ran some Pig
> > commands through grunt, but I get the same results from dumping the bag
> > as from generating it. I might be doing something wrong here.
> >
> > grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray,
> > b:chararray)};
> > grunt> dump A;
> > 2013-03-07 14:55:25,125 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> > ({(1,)})
> > ({(3,)})
> > ({(5,10)})
> > ({(7,)})
> >
> > grunt> b = foreach A generate b;
> > grunt> dump b;
> > 2013-03-07 14:57:59,509 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> > ({(1,)})
> > ({(3,)})
> > ({(5,10)})
> > ({(7,)})
> > grunt>
> >
> > I get the same output again.
> >
> >
> > On Thu, Mar 7, 2013 at 11:40 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >
> > > Good suggestion. Let me try that.
> > >
> > >
> > > On Thu, Mar 7, 2013 at 11:27 AM, Harsha <[EMAIL PROTECTED]> wrote:
> > >
> > > > It will be easier if you have some sample data and run it through the
> > > > grunt shell.
> > > > Let's say you have a dataset like this:
> > > > ({(1,)})
> > > > ({(3,)})
> > > > ({(5,10)})
> > > > ({(7,)})
> > > >
> > > > Some rows have a null for "b" and some rows have a value for "b".
> > > > If you do a "generate" on the data above, it will run through each row
> > > > and try to fetch the value of b; where there is none, it will return (),
> > > > something like this:
> > > >
> > > > ({()})
> > > > ({()})
> > > > ({(10)})
> > > > ({()})
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Harsha
> > > >
> > > >
> > > > On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:
> > > >
> > > > > Sorry, yes, my question was about accessing b, not $1. What's the
> > > > > effect of writing an empty () to a file? Say I did store b into temp:
> > > > > should I expect a line, or does nothing get written to the file at all?
> > > > >
> > > > > On Thu, Mar 7, 2013 at 10:53 AM, Harsha <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > From your schema b:bag{t:tuple(a:chararray, b:chararray)},
> > > > > > your tuple is inside a bag, so if on the next line you try to access
> > > > > > it through $1, Pig will throw an error saying non-existent column.
> > > > > > But if your question is about accessing b, then it will print an
> > > > > > empty () if there is no value present (as you are setting it to null).
> > > > > >
> > > > > > --
> > > > > > Harsha
> > > > > >
> > > > > >
> > > > > > On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:
> > > > > >
> > > > > > > Thanks! Does "generate" skip over that? If I did
> > > > > > > b = foreach B generate $1, what should be the expected outcome of
> > > > > > > alias "b"?
> > > > > > >
> > > > > > > On Thu, Mar 7, 2013 at 10:31 AM, Harsha <[EMAIL PROTECTED]> wrote: