

lzg 2012-05-29, 09:08
Re: How to mapreduce in the scenario
Yes, you can do it. In Pig you would write something like:

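A = LOAD 'a.txt' USING PigStorage(',') AS (id, name, age, ...);
B = LOAD 'b.txt' USING PigStorage(',') AS (id, address, ...);
C = JOIN A BY id, B BY id;  -- inner join: ids missing from either file are dropped
STORE C INTO 'c.txt' USING PigStorage(',');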

Hive can do it similarly. Or you could write your own join directly in MapReduce, or use the data_join jar.
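
For the hand-written MapReduce route, the usual approach is a reduce-side join: each mapper tags its records with the file they came from and emits them keyed by id, and the reducer pairs up the records that share an id. The following is only a minimal local sketch of that logic in plain Python, not Hadoop code. The function names (map_record, reduce_join, run_join) and the "A"/"B" tags are hypothetical, and the in-memory dictionary merely stands in for Hadoop's shuffle.

```python
from collections import defaultdict


def map_record(source, line):
    """Map step: emit (join key, (source tag, remaining fields))."""
    key, *rest = line.strip().split(",")
    return key, (source, rest)


def reduce_join(key, tagged_values):
    """Reduce step: pair every a.txt record with every b.txt record for one key."""
    left = [fields for tag, fields in tagged_values if tag == "A"]
    right = [fields for tag, fields in tagged_values if tag == "B"]
    # Inner join: a key present in only one input yields no output.
    return [",".join([key] + l + r) for l in left for r in right]


def run_join(a_lines, b_lines):
    """Local stand-in for the shuffle: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for source, lines in (("A", a_lines), ("B", b_lines)):
        for line in lines:
            if line.strip():  # skip blank lines
                key, value = map_record(source, line)
                groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reduce_join(key, groups[key]))
    return output


# Sample rows modeled on the question, with the elided "..." columns left out.
a_txt = ["id1,name1,age1", "id2,name2,age2", "id3,name3,age3", "id4,name4,age4"]
b_txt = ["id1,address1", "id2,address2", "id3,address3"]

for row in run_join(a_txt, b_txt):
    print(row)
```

With these sample rows the sketch prints joined lines for id1, id2, and id3. id4 has no match in b.txt, so an inner join drops it.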

--Bobby Evans

On 5/29/12 4:08 AM, "lzg" <[EMAIL PROTECTED]> wrote:

Hi,

I wonder whether Hadoop can effectively solve the following problem:

=========================================
input files: a.txt, b.txt
result: c.txt

a.txt:
id1,name1,age1,...
id2,name2,age2,...
id3,name3,age3,...
id4,name4,age4,...

b.txt:
id1,address1,...
id2,address2,...
id3,address3,...

c.txt:
id1,name1,age1,address1,...
id2,name2,age2,address2,...
id3,name3,age3,address3,...
=======================================
I know this can be done easily with a database.
But I want to handle it with Hadoop if possible.
Can Hadoop meet this requirement?

Any suggestions would help. Thank you very much!

Best Regards,

Gump
Michel Segel 2012-05-29, 10:32
Nitin Pawar 2012-05-29, 10:36
liuzhg 2012-05-29, 10:15
samir das mohapatra 2012-05-29, 11:33
Devaraj k 2012-05-29, 10:40
Soumya Banerjee 2012-05-29, 10:53
liuzhg 2012-05-30, 01:23
Nitin Pawar 2012-05-30, 03:49
samir das mohapatra 2012-05-30, 13:32
Wilson Wayne - wwilso 2012-05-30, 13:56