MapReduce, mail # user - Re: How can I record some position of context in Reduce()?


Re: How can I record some position of context in Reduce()?
Vikas Jadhav 2013-04-10, 13:11
How are you going to support a non-equi join using MapReduce?
As I understand it, the only way to do this is to bring all the data
to one reducer, so that the reducer can compare lesser/greater
values correctly.
Correct me if I am wrong.
Thank you.
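
For what it's worth, here is a minimal sketch of what that single reducer would be doing once it holds every row from both tables. It is plain Java with no Hadoop types, and `ThetaJoinSketch`, `join`, and the "less than" condition are made-up names for illustration, not anything from Hadoop or Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical helper, not a Hadoop API: the work a single reducer would do
// for a non-equi (theta) join after receiving every row of both tables.
class ThetaJoinSketch {

    // Compares every left row with every right row and keeps the pairs that
    // satisfy the join condition. Cost is left.size() * right.size() tests.
    static <L, R> List<String> join(List<L> left, List<R> right,
                                    BiPredicate<L, R> condition) {
        List<String> out = new ArrayList<>();
        for (L l : left) {
            for (R r : right) {
                if (condition.test(l, r)) {
                    out.add("(" + l + "," + r + ")");
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 5, 9);
        List<Integer> right = Arrays.asList(4, 8);
        // Join condition: left value strictly less than right value.
        System.out.println(join(left, right, (l, r) -> l < r));
    }
}
```

With left = [1, 5, 9] and right = [4, 8] this prints [(1,4), (1,8), (5,8)]. In this naive form every left row has to be compared with every right row, which is the reason for sending everything to one reducer.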

Regards,
Vikas

On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel <[EMAIL PROTECTED]> wrote:

> Can you show an example of your join?
> All joins are an equality in that the key has to match.
> Whether it's one-to-one, one-to-many, or many-to-many remains to be seen.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 9, 2013, at 10:35 AM, Effyroth Gu <[EMAIL PROTECTED]> wrote:
>
> Only equality joins, outer joins, and left semi joins are supported in
> Hive. Hive does not support join conditions that are not equality
> conditions as it is very difficult to express such conditions as a
> map/reduce job. Also, more than two tables can be joined in Hive.
>
>
> 2013/4/9 Michael Segel <[EMAIL PROTECTED]>
>
>> Hi,
>>
>> Your cross join is supported in both Pig and Hive. (Cross and theta
>> joins)
>>
>> So there must be code to do this.
>>
>> Essentially in the reducer you would have your key and then the set of
>> rows that match the key. You would then perform the cross product on the
>> key's result set and output them to the collector as separate rows.
>>
>> I'm not sure why you would need the reduce context.
>>
>> But then again, I'm still on my first cup of coffee. ;-)
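
The per-key cross product described above could be sketched roughly as follows. This is plain Java standing in for a reducer body, and it rests on assumptions that are not in the thread: each value is tagged by the mapper with its source table as an "A:" or "B:" prefix, and both sides of one key fit in memory. `CrossProductSketch` and `reduce` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a reducer body; plain Java, no Hadoop types.
class CrossProductSketch {

    // Values for one key arrive tagged with their source table ("A:" or "B:").
    // Split them by tag, then emit every A row paired with every B row.
    static List<String> reduce(String key, Iterable<String> taggedValues) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String v : taggedValues) {
            String row = v.substring(2);          // drop the two-character tag
            if (v.startsWith("A:")) {
                fromA.add(row);
            } else {
                fromB.add(row);
            }
        }
        List<String> out = new ArrayList<>();     // stands in for the collector
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(key + "\t" + a + "\t" + b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("A:a1", "B:b1", "A:a2", "B:b2");
        for (String line : reduce("k1", values)) {
            System.out.println(line);
        }
    }
}
```

Two rows from each table under the same key give four output rows, so a many-to-many match needs nothing from the reduce context beyond the key and its values.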
>>
>>
>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav <[EMAIL PROTECTED]>
>> wrote:
>>
>> Hi,
>> I am also working on joins using MapReduce.
>> I think that instead of finding the position of the table in
>> RawKeyValueIterator, we can modify the context.write method to always
>> write the table name or id as the key.
>> Then we don't need to find the position; we can get the key and value from
>> "reducerContext".
>>
>> Before the call to reducer.run(reducerContext) in ReduceTask.java, we can
>> add a join method to the Reducer class in Reducer.java and call
>> reducer.join(reducerContext).
>>
>>
>> I just wonder how you are going to support non-equi joins.
>>
>> I also have the same problem: how to do the join if the datasets can't fit
>> into memory.
>>
>>
>> For now I am cloning using the following code:
>>
>>
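>> // Runs inside Reducer.run(Context context); KEYIN and VALUEIN are the
>> // reducer's type parameters. Needs java.util.ArrayList,
>> // org.apache.hadoop.conf.Configuration and
>> // org.apache.hadoop.util.ReflectionUtils.
>> // Hadoop reuses the key and value objects, so copy them before keeping them.
>> Configuration conf = context.getConfiguration();
>>
>> KEYIN key = context.getCurrentKey();
>> @SuppressWarnings("unchecked")
>> KEYIN outKey = (KEYIN) ReflectionUtils.newInstance(key.getClass(), conf);
>> ReflectionUtils.copy(conf, key, outKey);
>>
>> Iterable<VALUEIN> values = context.getValues();
>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>> for (VALUEIN value : values) {
>>   @SuppressWarnings("unchecked")
>>   VALUEIN outValue = (VALUEIN) ReflectionUtils.newInstance(value.getClass(), conf);
>>   ReflectionUtils.copy(conf, value, outValue);
>>   myValues.add(outValue);  // keep the copy; the reused object gets overwritten
>> }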
>>
>>
>> If you have found any other solution, please feel free to share.
>>
>> Thank you.
>>
>>
>>
>>
>>
>>
>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <[EMAIL PROTECTED]> wrote:
>>
>>> In reduce() we have:
>>>
>>> key1 values1
>>> key2 values2
>>> ...
>>> keyn valuesn
>>>
>>> So what I want to do is join all the values, like this SQL:
>>>
>>> select * from values1,values2...valuesn;
>>>
>>> If there is not enough memory to cache the values, how can I complete the
>>> join operation?
>>> My idea is to clone the ReduceContext, but that may not be easy.
>>>
>>> Any help will be appreciated.
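
One textbook way around the memory limit is a block nested-loop join: keep only a fixed-size block of one side in memory and rescan the other side once per block. This is not something proposed in the thread, only a sketch under assumptions. In particular it assumes the second side can be reopened and read again (for example from a file written earlier), since, as far as I know, the reducer's value iterator can only be walked once. `BlockNestedLoopSketch`, `crossJoin`, and `rightSource` are hypothetical names, and the code is plain Java without Hadoop types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a block nested-loop cross join; not a Hadoop API.
class BlockNestedLoopSketch {

    // Holds at most blockSize left rows in memory at a time. rightSource must
    // return a fresh iterator on every call (for example by reopening a file).
    static List<String> crossJoin(Iterator<String> left,
                                  Supplier<Iterator<String>> rightSource,
                                  int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be at least 1");
        }
        List<String> out = new ArrayList<>();   // a real job would write rows out instead
        List<String> block = new ArrayList<>(blockSize);
        while (left.hasNext()) {
            block.clear();
            while (left.hasNext() && block.size() < blockSize) {
                block.add(left.next());
            }
            Iterator<String> right = rightSource.get();   // one full rescan per block
            while (right.hasNext()) {
                String r = right.next();
                for (String l : block) {
                    out.add("(" + l + "," + r + ")");
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> right = Arrays.asList("1", "2");
        Iterator<String> left = Arrays.asList("a", "b", "c").iterator();
        System.out.println(crossJoin(left, right::iterator, 2));
    }
}
```

With three left rows, two right rows and a block size of 2, the right side is read twice and all six pairs come out. Memory use is bounded by the block size, at the price of rereading the right side once per block.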
>>>
>>>
>>> 2013/3/13 Roth Effy <[EMAIL PROTECTED]>
>>>
>>>> I want an n:n join as a Cartesian product, but DataJoinReducerBase looks
>>>> like it only supports equi-joins.
>>>> I want a non-equi join, but I have no idea how to do it yet.
>>>>
>>>>
>>>> 2013/3/13 Azuryy Yu <[EMAIL PROTECTED]>
>>>>
>>>>> Do you want an n:n join or a 1:n join?
>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> I want to join data from two tables in the reducer, so I need to find
>>>>>> the start of each table.
>>>>>> Someone said DataJoinReducerBase can help me. Is that right?

Thanks and regards,
Vikas Jadhav