Pig >> mail # user >> Tuples in UDF and null


+
Mohit Anchlia 2013-03-07, 00:58
+
Mohit Anchlia 2013-03-07, 18:10
+
Harsha 2013-03-07, 18:31
+
Mohit Anchlia 2013-03-07, 18:35
+
Harsha 2013-03-07, 18:53
+
Mohit Anchlia 2013-03-07, 19:15
Re: Tuples in UDF and null
It will be easier if you have some sample data and run it through the grunt shell.
Let's say you have a dataset like this:
({(1,)})
({(3,)})
({(5,10)})
({(7,)})

Some rows have a null for "b" and some rows have a value for "b".
If you do a "generate" on the above, it will run through each row
and try to fetch the value of b; if there is none, it will output (),
something like this:

({()})
({()})
({(10)})
({()})
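
A minimal sketch for trying this in grunt. The file name 'bags.txt' is hypothetical, and it is assumed to hold one bag per line in the form shown above. The projection b.b (field b inside bag b) is my reading of which "generate" produces the output listed; adjust it to whatever your script actually projects.

```
-- 'bags.txt' is a hypothetical input file, one bag per line, e.g. {(5,10)}
-- the schema is the one from the original question
B = LOAD 'bags.txt' AS (b:bag{t:tuple(a:chararray, b:chararray)});

-- project only field b from each tuple inside bag b
page = FOREACH B GENERATE b.b;

-- print the result to the console to see how the null fields are rendered
DUMP page;
```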
--
Harsha
On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:

> Sorry, yes, my question was about accessing b, not $1. What's the effect of
> writing an empty () to a file? Say I did store b into temp: should I
> expect a line, or does nothing get written to the file at all?
>
> On Thu, Mar 7, 2013 at 10:53 AM, Harsha <[EMAIL PROTECTED]> wrote:
>
> > From your schema b:bag{t:tuple(a:chararray, b:chararray)},
> > your tuple is inside a bag, so if you try to access it through $1 on the
> > next line, Pig will throw an error about a non-existent column.
> > But if your question is about accessing b, then it will print an empty ()
> > if there is no value present (as you are setting it to null).
> >
> > --
> > Harsha
> >
> >
> > On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:
> >
> > > Thanks! Does "generate" skip over that? If I did b = foreach B generate $1,
> > > what should be the expected outcome of alias "b"?
> > >
> > > On Thu, Mar 7, 2013 at 10:31 AM, Harsha <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Mohit,
> > > > It won't convert it into the string literal 'NULL', since it's a tuple.
> > > > You'll see results like
> > > > ('Hello',)
> > > >
> > > > --
> > > > Harsha
> > > >
> > > >
> > > > On Thursday, March 7, 2013 at 10:10 AM, Mohit Anchlia wrote:
> > > >
> > > > > Any help would be appreciated. I'll also write something shortly and
> > > > > see what happens.
> > > > >
> > > > > On Wed, Mar 6, 2013 at 4:58 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > If I define and set tuple like this:
> > > > > >
> > > > > > Tuple t1 = mTupleFactory.newTuple(2);
> > > > > > t1.set(0, "Hello");
> > > > > > t1.set(1, null);
> > > > > >
> > > > > > and have schema like:
> > > > > >
> > > > > > b:bag{t:tuple(a:chararray, b:chararray)}
> > > > > >
> > > > > > and then in the pig script if I do:
> > > > > >
> > > > > > page = foreach B generate b;
> > > > > >
> > > > > >
> > > > > >
> > > > > > What should be the expected outcome? Would "generate" convert NULL into
> > > > > > the literal 'NULL' as a string, or does it skip over that NULL?
> > > > > >
+
Mohit Anchlia 2013-03-07, 19:40
+
Mohit Anchlia 2013-03-07, 19:58
+
Harsha 2013-03-07, 20:11
+
Mohit Anchlia 2013-03-07, 20:34