Pig, mail # user - A question on joins
RE: A question on joins
Sameer Tilak 2013-06-19, 15:03
Thanks! Yes, that did the trick. Now I have the following statement:
myresults = join userdeliverydetails by $3, deliverystatuscodes by $0;

I am running Pig in local mode, and I have another question.

The output of dump myresults ends with the statements below, but the resulting relation (myresults) is never printed to my screen. Also, file:/tmp/temp-1750591230/tmp718811728 is a directory containing a bunch of binary files. Am I missing something?

Success!

Job Stats (time in seconds):
JobId    Alias    Feature    Outputs
job_local_0003    deliverydetails,userdeliverydetails,userdetails    HASH_JOIN

job_local_0004    deliverystatuscodes,myresults    HASH_JOIN    file:/tmp/temp-1750591230/tmp718811728,

Input(s):
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/DeliveryDetails.txt"
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/UserDetails.txt"
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/DeliveryStatusCodes.txt"

Output(s):
Successfully stored records in: "file:/tmp/temp-1750591230/tmp718811728"

Job DAG:
job_local_0003    ->    job_local_0004,
job_local_0004
2013-06-19 07:57:04,599 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-06-19 07:57:04,600 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-06-19 07:57:04,602 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-06-19 07:57:04,602 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
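
One possible explanation, which the log above neither confirms nor rules out: dump first stores the relation in a temporary location (hence the binary files under /tmp) and then reads it back to print it, so an empty join result prints nothing. The sample files in the quoted message are comma-separated with a space after each comma, while load without a using clause splits on tabs, in which case the join keys would not match. A minimal sketch, assuming the files look exactly like the quoted samples; the aliases deliverydetails_clean and name_status are illustrative, not from the thread:

```pig
-- Assumption: fields are separated by commas, as in the quoted samples.
-- PigStorage splits on tab by default, so the delimiter is passed explicitly.
userdetails = load 'UserDetails.txt' using PigStorage(',')
    as (mobile:chararray, username:chararray);
deliverydetails = load 'DeliveryDetails.txt' using PigStorage(',')
    as (mobile:chararray, deliverycode:chararray);
deliverystatuscodes = load 'DeliveryStatusCodes.txt' using PigStorage(',')
    as (deliverycode:chararray, message:chararray);

-- The samples have a space after each comma; TRIM strips it so that
-- ' 001' from DeliveryDetails.txt matches '001' in DeliveryStatusCodes.txt.
deliverydetails_clean = foreach deliverydetails
    generate mobile, TRIM(deliverycode) as deliverycode;

userdeliverydetails = join userdetails by mobile, deliverydetails_clean by mobile;
myresults = join userdeliverydetails by deliverycode, deliverystatuscodes by deliverycode;

-- Keep only the two columns from the expected output, without padding spaces.
name_status = foreach myresults generate TRIM(username), TRIM(message);
dump name_status;
```

If the files are actually tab-separated, the using PigStorage(',') clauses and the TRIM calls are unnecessary and the cause lies elsewhere.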
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: A question on joins
> Date: Wed, 19 Jun 2013 00:10:35 -0700
>
> Dear Pig users,
> I am trying to do simple joins by following an example on a blog. Any help would be appreciated.
>  
> UserDetails.txt
> 123456, Jim
> 456123, Tom
> 789123, Harry
> 789456, Richa
>
> DeliveryDetails.txt
> 123456, 001
> 456123, 002
> 789123, 003
> 789456, 004
>
> DeliveryStatusCodes.txt
> 001, Delivered
> 002, Pending
> 003, Failed
> 004, Resend
>
> Expected output
> Jim Delivered
> .....
>
>
>
> grunt> userdetails = load 'UserDetails.txt' as (mobile, username);
>
> grunt> deliverydetails = load 'DeliveryDetails.txt' as (mobile , deliverycode);
>
> grunt> userdeliverydetails = join userdetails BY mobile, deliverydetails BY mobile;
>
> grunt> describe userdeliverydetails
> userdeliverydetails: {userdetails::mobile: bytearray,userdetails::username: bytearray,deliverydetails::mobile: bytearray,deliverydetails::deliverycode: bytearray}
>
> grunt> deliverystatuscodes = load 'DeliveryStatusCodes.txt' as (deliverycode, message);
>
> grunt>  output = join userdeliverydetails by $3, deliverystatuscodes by $0;
>
> 2013-06-19 00:02:31,600 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 8, column 0>  mismatched input 'output' expecting EOF
> Details at logfile: pig_1371623265656.log. I have copied the details here:
> Pig Stack Trace
> ---------------
> ERROR 2997: Encountered IOException. Exception
>
> java.io.IOException: Exception
>         at org.apache.pig.PigServer.getExamples(PigServer.java:1186)
>         at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>         at org.apache.pig.Main.run(Main.java:538)
>         at org.apache.pig.Main.main(Main.java:157)
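
A note on the ERROR 1200 in the quoted session: output is one of Pig Latin's reserved keywords, so the parser rejects it as an alias. That is consistent with the follow-up, where the same join works under the alias myresults. A minimal sketch that continues the quoted grunt session (it assumes userdeliverydetails and deliverystatuscodes are already defined as shown there); the alias joined is an arbitrary illustrative name:

```pig
-- 'output' is a reserved keyword in Pig Latin, so this line fails to parse:
--   output = join userdeliverydetails by $3, deliverystatuscodes by $0;

-- Any non-reserved alias works; 'joined' is only an illustrative choice.
joined = join userdeliverydetails by $3, deliverystatuscodes by $0;

-- Inspect the schema of the joined relation before dumping it.
describe joined;
```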