Hadoop, mail # user - How to mapreduce in the scenario


Re: How to mapreduce in the scenario
Nitin Pawar 2012-05-29, 10:36
Hive is one approach (it is similar to a conventional database, but not exactly the same).

If you would rather write a MapReduce program, use MultipleInputs to give each input file its own mapper:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
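
To make the MapReduce suggestion concrete, here is a minimal sketch of the reduce-side join it implies. It is a plain-Java simulation, not a Hadoop job: it assumes the first comma-separated field is the join id, and the class name, method names, and the "A"/"B" tags are hypothetical. In an actual job, the two tagging steps would be separate Mapper classes, each registered for its own path with MultipleInputs.addInputPath, and the pairing step would be the Reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a reduce-side join; no Hadoop classes are used.
// All names here are hypothetical and chosen for illustration only.
class ReduceSideJoinSketch {

    // "Map" step: key each line by its id (assumed to be the first field)
    // and tag the remainder with the file it came from.
    static void map(String tag, List<String> lines, Map<String, List<String[]>> shuffle) {
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines without an id separator
            }
            String id = line.substring(0, comma);
            String rest = line.substring(comma + 1);
            shuffle.computeIfAbsent(id, k -> new ArrayList<>()).add(new String[] {tag, rest});
        }
    }

    // "Reduce" step: for each id, pair every a.txt record with every b.txt
    // record. An id found in only one file yields nothing (inner join).
    static List<String> reduce(Map<String, List<String[]>> shuffle) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String[]>> entry : shuffle.entrySet()) {
            List<String> left = new ArrayList<>();
            List<String> right = new ArrayList<>();
            for (String[] tagged : entry.getValue()) {
                if (tagged[0].equals("A")) {
                    left.add(tagged[1]);
                } else {
                    right.add(tagged[1]);
                }
            }
            for (String l : left) {
                for (String r : right) {
                    out.add(entry.getKey() + "," + l + "," + r);
                }
            }
        }
        return out;
    }

    static List<String> join(List<String> aLines, List<String> bLines) {
        // TreeMap groups values by key in sorted order, standing in for the shuffle.
        Map<String, List<String[]>> shuffle = new TreeMap<>();
        map("A", aLines, shuffle);
        map("B", bLines, shuffle);
        return reduce(shuffle);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("id1,name1,age1", "id2,name2,age2", "id4,name4,age4");
        List<String> b = Arrays.asList("id1,address1", "id2,address2");
        for (String row : join(a, b)) {
            System.out.println(row);
        }
    }
}
```

Tagging each value with its source is what lets the reduce step tell a.txt records from b.txt records once they arrive under the same id. Note that this sketch buffers all records for one id in memory, which is fine for small groups but would need care for heavily skewed keys.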

On Tue, May 29, 2012 at 4:02 PM, Michel Segel <[EMAIL PROTECTED]> wrote:

> Hive?
> Sure.... Assuming you mean that the id is a FK common amongst the tables...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On May 29, 2012, at 5:29 AM, "liuzhg" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I wonder whether Hadoop can effectively solve the following problem:
> >
> > =========================================
> > input file: a.txt, b.txt
> > result: c.txt
> >
> > a.txt:
> > id1,name1,age1,...
> > id2,name2,age2,...
> > id3,name3,age3,...
> > id4,name4,age4,...
> >
> > b.txt:
> > id1,address1,...
> > id2,address2,...
> > id3,address3,...
> >
> > c.txt:
> > id1,name1,age1,address1,...
> > id2,name2,age2,address2,...
> > =======================================
> >
> > I know that this can be done well with a database.
> > But I want to handle it with hadoop if possible.
> > Can hadoop meet the requirement?
> >
> > Any suggestion can help me. Thank you very much!
> >
> > Best Regards,
> >
> > Gump
> >
> >
> >
>

--
Nitin Pawar