Pig >> mail # user >> Tuples in UDF and null


Re: Tuples in UDF and null
So I did this. I took your example, put it in a file, and ran some Pig
commands through grunt, but dumping the bag and generating b from it give
me the same results. I might be doing something wrong here.

grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray, b:chararray)};
grunt> dump A;
2013-03-07 14:55:25,125 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths to process : 1
({(1,)})
({(3,)})
({(5,10)})
({(7,)})

grunt> b = foreach A generate b;
grunt> dump b;
2013-03-07 14:57:59,509 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths to process : 1
({(1,)})
({(3,)})
({(5,10)})
({(7,)})
grunt>

I get the same output again.
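
One possible reading of the identical output: in "foreach A generate b", the name b resolves to the bag itself (the only top-level field), not to the inner field b of the tuple, so the whole bag comes back unchanged. In Pig Latin the inner field would typically be reached with a bag projection such as b.b, though that was not run in this thread. The sketch below models the difference in plain Java. It does not use the Pig API; the class and method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class BagProjectionSketch {

    // Hypothetical stand-in for projecting one inner field out of a bag:
    // keeps only position `index` of every tuple.
    public static List<List<String>> projectField(List<List<String>> bag, int index) {
        List<List<String>> projected = new ArrayList<>();
        for (List<String> tuple : bag) {
            // singletonList accepts null, so a missing value stays a null field
            projected.add(Collections.singletonList(tuple.get(index)));
        }
        return projected;
    }

    // Renders a bag in the style of the dump output in this thread.
    // Assumption: a null field prints as nothing between the delimiters.
    public static String render(List<List<String>> bag) {
        return bag.stream()
                .map(t -> t.stream()
                        .map(f -> f == null ? "" : f)
                        .collect(Collectors.joining(",", "(", ")")))
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        List<List<String>> withNull = Collections.singletonList(Arrays.asList("1", null));
        List<List<String>> withValue = Collections.singletonList(Arrays.asList("5", "10"));

        // Whole bag, as returned when the bag itself is projected
        System.out.println(render(withNull));                   // {(1,)}
        System.out.println(render(withValue));                  // {(5,10)}

        // Inner field at position 1 only
        System.out.println(render(projectField(withNull, 1)));  // {()}
        System.out.println(render(projectField(withValue, 1))); // {(10)}
    }
}
```

The last two lines have the shape of the ({()}) and ({(10)}) rows quoted below, which suggests that example describes a projection of the inner field rather than of the bag.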
On Thu, Mar 7, 2013 at 11:40 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> Good suggestion. Let me try that.
>
>
> On Thu, Mar 7, 2013 at 11:27 AM, Harsha <[EMAIL PROTECTED]> wrote:
>
>> It will be easier if you have some sample data and run it through the
>> grunt shell.
>> Let's say you have a dataset like this:
>> ({(1,)})
>> ({(3,)})
>> ({(5,10)})
>> ({(7,)})
>>
>> Some rows have a null "b" and some rows have a value for "b".
>> If you do a "generate" on the data above, it will run through each row
>> and try to fetch the value of b; where there is none it will print (),
>> something like this:
>>
>> ({()})
>> ({()})
>> ({(10)})
>> ({()})
>>
>>
>>
>>
>> --
>> Harsha
>>
>>
>> On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:
>>
>> > Sorry, yes, my question was about accessing b, not $1. What's the effect
>> > of writing an empty () to a file? Say I did store b into temp: should I
>> > expect a line, or does nothing get written to the file at all?
>> >
>> > On Thu, Mar 7, 2013 at 10:53 AM, Harsha <[EMAIL PROTECTED]> wrote:
>> >
>> > > From your schema b:bag{t:tuple(a:chararray, b:chararray)},
>> > > your tuple is inside a bag, so if on the next line you try to access
>> > > it through $1, Pig will throw an error saying non-existent column.
>> > > But if your question is about accessing b, then it will print an
>> > > empty () if there is no value present (as you are setting it to null).
>> > >
>> > > --
>> > > Harsha
>> > >
>> > >
>> > > On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:
>> > >
>> > > > Thanks! Does "generate" skip over that? If I did b = foreach B
>> > > > generate $1, what should be the expected outcome of alias "b"?
>> > > >
>> > > > On Thu, Mar 7, 2013 at 10:31 AM, Harsha <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > > > Hi Mohit,
>> > > > > It won't convert it into the string literal 'NULL', since it's a
>> > > > > tuple. You'll see results like
>> > > > > ('Hello',)
>> > > > >
>> > > > > --
>> > > > > Harsha
>> > > > >
>> > > > >
>> > > > > On Thursday, March 7, 2013 at 10:10 AM, Mohit Anchlia wrote:
>> > > > >
>> > > > > > Any help would be appreciated. I'll also write something
>> > > > > > shortly and see what happens.
>> > > > > >
>> > > > > > On Wed, Mar 6, 2013 at 4:58 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>> > > > > >
>> > > > > > > If I define and set tuple like this:
>> > > > > > >
>> > > > > > > Tuple t1 = mTupleFactory.newTuple(2);
>> > > > > > > t1.set(0, "Hello");
>> > > > > > > t1.set(1, null);
>> > > > > > >
>> > > > > > > and have schema like:
>> > > > > > >
>> > > > > > > b:bag{t:tuple(a:chararray, b:chararray)}
>> > > > > > >
>> > > > > > > and then in the pig script if I do:
>> > > > > > >
>> > > > > > > page = foreach B generate b;
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > What should be the expected outcome? Would "generate" convert
>> > > > > > > that null into the literal 'NULL' as a string? Or does it skip
>> > > > > > > over that null?
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
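
For the original question, the dump output earlier in the thread (rows such as (1,)) suggests that a null field is neither skipped nor turned into the string 'NULL': the field stays in place and prints as nothing. The following is a minimal plain-Java sketch of that rendering rule, not Pig's own code. The class name NullFieldSketch and its render method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NullFieldSketch {

    // Hypothetical helper: joins the fields with commas inside parentheses.
    // Assumption: a null field is printed as an empty string, which matches
    // rows like (1,) in the dump output quoted in this thread.
    public static String render(List<?> fields) {
        return fields.stream()
                .map(f -> f == null ? "" : f.toString())
                .collect(Collectors.joining(",", "(", ")"));
    }

    public static void main(String[] args) {
        // Same shape as the tuple in the question: field 0 set, field 1 null
        System.out.println(render(Arrays.asList("Hello", null))); // (Hello,)
    }
}
```

The trailing comma shows that the tuple still has two fields; only the value of the second one is absent.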