Pig >> mail # user >> A question on joins


RE: A question on joins
Thanks! Yes, that did the trick. Now I have the following statement:
myresults = join userdeliverydetails by $3, deliverystatuscodes by $0;

I am running Pig in local mode. I have another question.

The output of dump myresults ends with the statements below, but the resulting relation (myresults) is never printed to my screen. Also, file:/tmp/temp-1750591230/tmp718811728 is a directory containing a bunch of binary files. Am I missing something?

Success!

Job Stats (time in seconds):
JobId    Alias    Feature    Outputs
job_local_0003    deliverydetails,userdeliverydetails,userdetails    HASH_JOIN

job_local_0004    deliverystatuscodes,myresults    HASH_JOIN    file:/tmp/temp-1750591230/tmp718811728,

Input(s):
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/DeliveryDetails.txt"
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/UserDetails.txt"
Successfully read records from: "file:///Users/sameer/software/pig-0.11.1/bin/DeliveryStatusCodes.txt"

Output(s):
Successfully stored records in: "file:/tmp/temp-1750591230/tmp718811728"

Job DAG:
job_local_0003    ->    job_local_0004,
job_local_0004
2013-06-19 07:57:04,599 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-06-19 07:57:04,600 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-06-19 07:57:04,602 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-06-19 07:57:04,602 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
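
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the load statements in the quoted message name no loader, so Pig uses PigStorage with its default tab delimiter. If the files are literally comma-separated as listed, each whole line lands in the first field and the second field is null, so the join keys never match and dump has nothing to print. The script below assumes comma-separated input with a space after each comma. The chararray types, the TRIM calls, the *_raw and namestatus aliases, and the output path 'myresults_out' are illustrative choices that do not appear in the thread.

```pig
-- Assumption: fields are separated by a comma followed by a space, as in the listings.
userdetails_raw = LOAD 'UserDetails.txt' USING PigStorage(',')
    AS (mobile:chararray, username:chararray);
deliverydetails_raw = LOAD 'DeliveryDetails.txt' USING PigStorage(',')
    AS (mobile:chararray, deliverycode:chararray);
deliverystatuscodes_raw = LOAD 'DeliveryStatusCodes.txt' USING PigStorage(',')
    AS (deliverycode:chararray, message:chararray);

-- TRIM strips the space left after each comma so that the join keys compare equal.
userdetails = FOREACH userdetails_raw
    GENERATE TRIM(mobile) AS mobile, TRIM(username) AS username;
deliverydetails = FOREACH deliverydetails_raw
    GENERATE TRIM(mobile) AS mobile, TRIM(deliverycode) AS deliverycode;
deliverystatuscodes = FOREACH deliverystatuscodes_raw
    GENERATE TRIM(deliverycode) AS deliverycode, TRIM(message) AS message;

-- Same two joins as in the thread, keyed by field name instead of position.
userdeliverydetails = JOIN userdetails BY mobile, deliverydetails BY mobile;
myresults = JOIN userdeliverydetails BY deliverydetails::deliverycode,
                 deliverystatuscodes BY deliverycode;

-- Keep only the two columns named in the expected output.
namestatus = FOREACH myresults GENERATE username, message;
DUMP namestatus;

-- Hypothetical output path; STORE fails if the directory already exists.
STORE namestatus INTO 'myresults_out' USING PigStorage(',');
```

As far as I know, the directory under /tmp is where dump stages its result in Pig's internal binary format before printing it, so binary files there are normal. STORE with PigStorage is the usual way to get readable text files.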
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: A question on joins
> Date: Wed, 19 Jun 2013 00:10:35 -0700
>
> Dear Pig users,
> I am trying to do simple joins by following an example on a blog. Any help would be appreciated.
>  
> UserDetails.txt
> 123456, Jim
> 456123, Tom
> 789123, Harry
> 789456, Richa
>
> DeliveryDetails.txt
> 123456, 001
> 456123, 002
> 789123, 003
> 789456, 004
>
> DeliveryStatusCodes.txt
> 001, Delivered
> 002, Pending
> 003, Failed
> 004, Resend
>
> Expected output
> Jim Delivered
> .....
>
>
>
> grunt> userdetails = load 'UserDetails.txt' as (mobile, username);
>
> grunt> deliverydetails = load 'DeliveryDetails.txt' as (mobile , deliverycode);
>
> grunt> userdeliverydetails = join userdetails BY mobile, deliverydetails BY mobile;
>
> grunt> describe userdeliverydetails
> userdeliverydetails: {userdetails::mobile: bytearray,userdetails::username: bytearray,deliverydetails::mobile: bytearray,deliverydetails::deliverycode: bytearray}
>
> grunt> deliverystatuscodes = load 'DeliveryStatusCodes.txt' as (deliverycode, message);
>
> grunt>  output = join userdeliverydetails by $3, deliverystatuscodes by $0;
>
> 2013-06-19 00:02:31,600 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 8, column 0>  mismatched input 'output' expecting EOF
> Details at logfile: pig_1371623265656.log. I have copied the details here:
> Pig Stack Trace
> ---------------
> ERROR 2997: Encountered IOException. Exception
>
> java.io.IOException: Exception
>         at org.apache.pig.PigServer.getExamples(PigServer.java:1186)
>         at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>         at org.apache.pig.Main.run(Main.java:538)
>         at org.apache.pig.Main.main(Main.java:157)
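
For readers hitting the same ERROR 1200: output is a reserved keyword in Pig Latin, which is consistent with the parser rejecting it as an alias in the quoted message. A minimal sketch follows. The alias messages is an illustrative name, and the commented-out line is expected to fail to parse rather than verified here.

```pig
-- Same load statement as in the quoted message.
deliverystatuscodes = load 'DeliveryStatusCodes.txt' as (deliverycode, message);

-- 'output' is reserved, so a line like this one is expected to be rejected by the parser:
-- output = foreach deliverystatuscodes generate message;

-- Any non-reserved alias is accepted; 'messages' is an arbitrary choice.
messages = foreach deliverystatuscodes generate message;
describe messages;
```

Renaming the alias, as in the myresults statement in the reply, avoids the clash without changing the join itself.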

     