Pig >> mail # user >> How to use tuples ?


Re: How to use tuples ?
I guess you mean to load a bag. Your input file should be:
{(1,2,3),(2,4,5)}
{(2,3,4),(2,3,5)}

And the load statement should be:
z = load 'tmp.txt' as (b:{(a0:int,a1:int,a2:int)});

Daniel
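
As a rough sketch of what can be done once the bag is loaded with the schema above: the aliases `firsts` and `flat` are illustrative names that do not appear in the thread, and the input is assumed to be in the one-bag-per-line format shown in the reply.

```pig
-- Assumes tmp.txt holds one bag per line, e.g. {(1,2,3),(2,4,5)}
z = load 'tmp.txt' as (b:{(a0:int,a1:int,a2:int)});

-- Project one field from every tuple in the bag; each output row is itself a bag
firsts = foreach z generate b.a0;

-- Or un-nest the bag so that each inner tuple becomes its own row
flat = foreach z generate flatten(b);

dump firsts;
dump flat;
```

Projecting with `b.a0` keeps one row per input line, while `flatten(b)` produces one row per inner tuple, so which form is appropriate depends on what the next step needs.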

On Thu, Feb 2, 2012 at 2:43 AM, praveenesh kumar <[EMAIL PROTECTED]> wrote:
> Okay, so it's weird.
>
> I was able to run a pig query using $0.$0
>
> the pig script I wrote for the data (tmp.txt) :
>
> (1,2,3) (2,4,5)
> (2,3,4) (2,3,5)
>
> z = load 'tmp.txt';
> x = foreach z generate $0.$0;
> dump x;
>
> It ran fine the first time, but now it's giving me an error:
>
> ERROR 1066: Unable to open iterator for alias x
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias x
>        at org.apache.pig.PigServer.openIterator(PigServer.java:858)
>        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
>        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>        at org.apache.pig.Main.run(Main.java:523)
>        at org.apache.pig.Main.main(Main.java:148)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>        at org.apache.pig.PigServer.openIterator(PigServer.java:850)
>        ... 12 more
>
> On Thu, Feb 2, 2012 at 3:39 PM, praveenesh kumar <[EMAIL PROTECTED]> wrote:
>
>> Okay, got it. Thanks for the guidance.
>> Without a schema, we can refer to fields as $0.$0 or $1.$0 and so on,
>> based on their positions.
>>
>> Thanks,
>> Praveenesh
>>
>> On Thu, Feb 2, 2012 at 3:28 PM, praveenesh kumar <[EMAIL PROTECTED]> wrote:
>>
>>> One more thing: suppose I have data in tmp.txt like
>>> (1,2,3) (2,4,5)
>>> (2,3,4) (2,3,5)
>>>
>>> So if I use Z1 = Load 'tmp.txt'
>>> the data will get stored in a bag (right?)
>>>
>>> ( (1,2,3), (2,4,5) )
>>> ( (2,3,4), (2,3,5) )
>>>
>>> Now, how can I refer to the fields in this case (without a schema)?
>>>
>>> B = Foreach Z1 generate Z1.$0;
>>>
>>> This generates an error. How can I do it correctly?
>>>
>>> Thanks,
>>> Praveenesh
>>>
>>> And if so, how can I refer to the variables inside?
>>>
>>> Thanks,
>>> Praveenesh
>>>
>>>
>>> On Thu, Feb 2, 2012 at 3:10 PM, praveenesh kumar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Thanks Daniel,
>>>> So it means that for all other complex data types, the file contents
>>>> need to be in that format too,
>>>> like tuples in ( ), bags in { }, maps in [ ]?
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Feb 2, 2012 at 2:49 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi, Praveenesh,
>>>>> Your tmp.txt should be:
>>>>> (1,2,3,4)
>>>>> (2,3,4,5)
>>>>> (4,5,5,6)
>>>>>
>>>>> And you cannot use "," as the delimiter for PigStorage; otherwise,
>>>>> PigStorage will split the line on commas first and then parse the tuple.
>>>>>
>>>>> Daniel
>>>>>
>>>>> On Thu, Feb 2, 2012 at 1:05 AM, praveenesh kumar <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>> > Hi,
>>>>> >
>>>>> > I am trying to learn how I can store records in tuples.
>>>>> >
>>>>> > Suppose I have a txt file
>>>>> >
>>>>> > $ cat tmp.txt
>>>>> >
>>>>> > 1,2,3,4
>>>>> > 2,3,4,5
>>>>> > 4,5,5,6
>>>>> >
>>>>> > I am doing this
>>>>> > $ pig > A = Load 'tmp.txt' using PigStorage(',') AS
>>>>> > (t:tuple(int:a,int:b,int:c,int:d));
>>>>> > $ pig > Dump A;
>>>>> > I am getting nothing in the output
>>>>> > ( )
>>>>> > ( )
>>>>> > ( )
>>>>> >
>>>>> > Can anyone help me understand why this is happening?
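
A minimal sketch of the fix suggested in the first reply to this question, assuming tmp.txt has been rewritten with one parenthesized tuple per line and that the default tab delimiter is used instead of PigStorage(','). The alias `B` is illustrative and not from the thread. Note that Pig schemas are written as name:type, so the fields are declared as a:int rather than int:a.

```pig
-- Assumes tmp.txt now contains one tuple per line, e.g. (1,2,3,4)
-- No PigStorage(',') here: with the default tab delimiter each line stays a single field
A = load 'tmp.txt' as (t:tuple(a:int,b:int,c:int,d:int));

-- Fields inside the tuple can be reached by name (t.a) or by position (t.$3)
B = foreach A generate t.a, t.$3;
dump B;
```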