MapReduce, mail # user - modifying existing wordcount example


jamal sasha 2013-01-17, 02:07
Chris Embree 2013-01-17, 02:13
Re: modifying existing wordcount example
jamal sasha 2013-01-17, 02:54
Hi,
Thanks for sharing your thoughts.
I was reading about some of the libraries in Hadoop, and I think the
distributed cache might help me.
But I picked up Hadoop very recently (and Java along with it), and I
can't work out how to actually code this :(
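
[Editor's sketch] The distributed cache idea mentioned above could look roughly like this. It is a minimal plain-Java illustration with no Hadoop dependency: the class and method names are hypothetical, and it assumes the saved output uses one "word<TAB>count" pair per line. In a real job the saved file would be shipped to each task through the distributed cache and parsed once per task; here the lines are simply passed in as strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop dependency). All names are hypothetical.
class CachedCountsMerge {

    // Parse the saved output, assumed to be "word<TAB>count" per line.
    // This stands in for reading the cached file once per task.
    static Map<String, Integer> loadPreviousCounts(List<String> outputLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : outputLines) {
            String[] parts = line.split("\t");
            if (parts.length != 2) continue; // skip malformed lines
            counts.merge(parts[0], Integer.parseInt(parts[1].trim()), Integer::sum);
        }
        return counts;
    }

    // Ordinary word count over the new input text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Add the fresh counts on top of the previous ones.
    static Map<String, Integer> merge(Map<String, Integer> previous,
                                      Map<String, Integer> fresh) {
        Map<String, Integer> merged = new TreeMap<>(previous);
        fresh.forEach((word, count) -> merged.merge(word, count, Integer::sum));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> previous =
                loadPreviousCounts(Arrays.asList("foo\t2", "bar\t4"));
        Map<String, Integer> fresh = countWords("bar hello world");
        System.out.println(merge(previous, fresh)); // {bar=5, foo=2, hello=1, world=1}
    }
}
```

One caveat with this approach: a word that appears only in the old output (foo in the example) never reaches a reducer in the second job, so a real job would still need some step that carries those old entries forward.
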
On Wed, Jan 16, 2013 at 6:13 PM, Chris Embree <[EMAIL PROTECTED]> wrote:

> Can you instead copy input1 and input2 together?
>
> Or process both files on the second pass?
>
> Otherwise, you'll have to read in the output file, load the values, and
> start your map/red job.
>
> Probably someone else will have a better answer. :)
>
>
> On Wed, Jan 16, 2013 at 9:07 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>   In the wordcount example:
>> http://hadoop.apache.org/docs/r0.17.0/mapred_tutorial.html
>>  Let's say I run the above example and save the output.
>> But let's say that I now have a new input file. What I want to do is
>> run the wordcount again, but update the previous counts instead of
>> starting from zero.
>> For example:
>> sample_input1.txt  //foo bar foo bar bar bar
>> After the first run:
>> 1) foo 2
>> 2) bar 4
>>
>> Save it in output1.txt
>>
>> Now sample_input2.txt //bar hello world
>> Now the result I am looking for is:
>> 1) foo 2
>> 2) bar 5
>> 3) hello 1
>> 4) world 1
>>
>> How do I achieve this in MapReduce?
>>
>>
>
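
[Editor's sketch] The "process both files on the second pass" suggestion quoted above can be illustrated as follows. This is a plain-Java simulation, not Hadoop code: the class and method names are hypothetical, and it assumes the saved output has one "word<TAB>count" pair per line. The idea is that lines from the new text emit (word, 1), lines from the old output emit (word, previousCount), and the usual summing reducer needs no change. In an actual job, Hadoop's MultipleInputs class is one way to attach a different mapper to each input path; it is not used here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the second pass (no Hadoop dependency).
// All class and method names here are hypothetical.
class SecondPassWordCount {

    // Map step for new raw text: emit (word, 1) for every token.
    static List<Map.Entry<String, Integer>> mapText(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) out.add(new SimpleEntry<>(token, 1));
        }
        return out;
    }

    // Map step for the saved output: emit (word, previousCount).
    // Assumes each line is "word<TAB>count"; other lines are ignored.
    static List<Map.Entry<String, Integer>> mapCounts(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] parts = line.split("\t");
        if (parts.length == 2) {
            out.add(new SimpleEntry<>(parts[0], Integer.parseInt(parts[1].trim())));
        }
        return out;
    }

    // Shuffle and reduce collapsed into one step: sum the values per word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Old output (output1.txt in the example) goes through mapCounts.
        for (String line : Arrays.asList("foo\t2", "bar\t4")) {
            pairs.addAll(mapCounts(line));
        }
        // New input (sample_input2.txt in the example) goes through mapText.
        pairs.addAll(mapText("bar hello world"));
        System.out.println(reduce(pairs)); // {bar=5, foo=2, hello=1, world=1}
    }
}
```

Because the old counts enter as ordinary key/value pairs, words that appear only in the old output (foo here) are carried forward automatically.
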
bejoy.hadoop@... 2013-01-17, 03:30
jamal sasha 2013-01-17, 06:40