Pig >> mail # user >> wild card for all fields in a tuple


Re: wild card for all fields in a tuple
Yeah, that works great. Thanks, Jonathan and Alan. I can see that the
"all fields in between" feature will be really useful in some cases.
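
For reference, the star form discussed in this thread can be sketched as
follows. This is a minimal sketch, not copied from a tested session: it
assumes the relation X from the original question quoted below, and the
comments are explanatory additions.

    -- assumes X: {id: chararray, v1: int, v2: int}, as in the question below
    -- '*' projects every field of the input tuple after the derived column
    Y = FOREACH X GENERATE (v2 - v1) AS diff, *;
    -- DESCRIBE takes an alias, so it should show the schema after the FOREACH
    DESCRIBE Y;

Because * stands for all fields of the tuple, none of the v's has to be
listed by name, however many there are.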

On Wed, Jan 12, 2011 at 3:33 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> Jonathan is right, you can do all fields in a tuple with *.  I was thinking
> of doing all fields in between two fields, which you can't do yet.
>
> Alan.
>
>
> On Jan 12, 2011, at 3:18 PM, Alan Gates wrote:
>
>> There isn't a way to do that yet.  See
>> https://issues.apache.org/jira/browse/PIG-1693
>> for our plans on adding it in the next release.
>>
>> Alan.
>>
>> On Jan 12, 2011, at 2:51 PM, Dexin Wang wrote:
>>
>>> Hi,
>>>
>>> Hope there is a simple answer to this. I have a bunch of rows, and for
>>> each row I want to add a column derived from some existing columns. I
>>> have a large number of columns in my input tuple, so I don't want to
>>> repeat each name using "AS" when I generate. Is there an easy way to
>>> append a column to the tuples without having to touch the rest of the
>>> tuple on the output?
>>>
>>> Here's my example:
>>>
>>> grunt> DESCRIBE X;
>>> X: {id: chararray,v1: int,v2: int}
>>>
>>> grunt> DUMP X;
>>> (a,3,42)
>>> (b,2,4)
>>> (c,7,32)
>>>
>>> I can do this:
>>> grunt> Y = FOREACH X GENERATE (v2 - v1) as diff, id, v1, v2;
>>> grunt> DUMP Y;
>>> (39,a,3,42)
>>> (2,b,2,4)
>>> (25,c,7,32)
>>>
>>> But I would prefer not to have to list all the v's. I may have v1, v2,
>>> v3, ..., v100.
>>>
>>> Of course, this doesn't work:
>>>
>>> grunt> Y = FOREACH X GENERATE (v2 - v1) as diff, FLATTEN(X);
>>>
>>> What can be done to simplify this? A related question: what is the
>>> schema after the FOREACH? I wish I could do a DESCRIBE after the
>>> FOREACH.
>>>
>>> Thanks !!
>>>
>>
>>
>