Pig >> mail # user >> Join with greater/less then condition


sonia gehlot 2012-07-05, 19:21
Alan Gates 2012-07-05, 19:43
sonia gehlot 2012-07-05, 21:28
Re: Join with greater/less then condition
Replace the filter with a foreach / generate (replace the right side
with nulls if the > condition is not satisfied).
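
A minimal sketch of that suggestion, assuming both tables carry
user_id, title_id and timestamp columns. The LOAD paths, the long
types and the a_timestamp / b_timestamp aliases are placeholders for
illustration and do not come from the thread:

```pig
-- Assumed inputs and schemas (placeholders, not from the thread).
-- Pig needs a known schema on both sides to run an outer join.
table_a = LOAD 'table_a' AS (user_id:long, title_id:long, timestamp:long);
table_b = LOAD 'table_b' AS (user_id:long, title_id:long, timestamp:long);

-- Equi-join on the key columns, keeping unmatched table_a rows.
C = JOIN table_a BY (user_id, title_id) LEFT OUTER,
         table_b BY (user_id, title_id);

-- Keep the left side as is; null out the right side unless a > b.
-- Unmatched rows already have a null table_b::timestamp, so the
-- comparison is null and the result stays null.
D = FOREACH C GENERATE
        table_a::user_id   AS user_id,
        table_a::title_id  AS title_id,
        table_a::timestamp AS a_timestamp,
        (table_a::timestamp > table_b::timestamp
            ? table_b::timestamp
            : (long)null)  AS b_timestamp;
```

One caveat: if a table_a row matches several table_b rows on the keys,
each match that fails the > test still yields its own row with a null
right side, so this can return more rows than the SQL left outer join
would. A later DISTINCT or GROUP may be needed, depending on the data.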

On Thu, Jul 5, 2012 at 2:28 PM, sonia gehlot <[EMAIL PROTECTED]> wrote:
> Actually I wanted to do a left outer join, so I'm not sure a filter will
> work in this case.
>
>
> On Thu, Jul 5, 2012 at 12:43 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> Pig can only do equi-joins.  Theta joins are hard in MapReduce.  So the
>> way to do this is do the equi-join and then filter afterwards.  This will
>> not create significant additional cost since the join results will be
>> filtered before being materialized to disk.
>>
>> C = JOIN table_a BY (user_id, title_id), table_b BY (user_id, title_id);
>> D = filter C by table_a::timestamp > table_b::timestamp;
>>
>> Alan.
>>
>> On Jul 5, 2012, at 12:21 PM, sonia gehlot wrote:
>>
>> > Hi Guys,
>> >
>> > I want to join 2 tables in Hive on a couple of columns, and one of the
>> > conditions is that the timestamp in one table is greater than in the
>> > other. In SQL I could have written it this way:
>> >
>> > table_a a Join table_b b
>> > on a.user_id = b.user_id
>> > and a.title_id = b.title_id
>> > and a.timestamp > b.timestamp
>> >
>> > How do I write the last condition in Pig? *a.timestamp > b.timestamp*
>> >
>> > Thanks,
>> > Sonia
>>
>>