MapReduce >> mail # user >> modifying existing wordcount example


jamal sasha 2013-01-17, 02:07
Chris Embree 2013-01-17, 02:13
Re: modifying existing wordcount example
Hi,
 Thanks for sharing your thoughts.
I was reading about some of the libraries in Hadoop, and I feel like the
distributed cache might help me.
But I picked up Hadoop very recently (and Java along with it), and I am
not able to work out how to actually code this :(
On Wed, Jan 16, 2013 at 6:13 PM, Chris Embree <[EMAIL PROTECTED]> wrote:

> Can you instead copy input1 and input2 together?
>
> Or process both files on the second pass?
>
> Otherwise, you'll have to read in the output file, load the values, and
> start your map/reduce job.
>
> Probably someone else will have a better answer. :)
>
>
> On Wed, Jan 16, 2013 at 9:07 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>   In the wordcount example:
>> http://hadoop.apache.org/docs/r0.17.0/mapred_tutorial.html
>>  Let's say I run the above example and save the output.
>> But let's say that I now have a new input file. What I want to do is
>> run the wordcount again, but update the previous counts instead of
>> starting from zero.
>> For example:
>> sample_input1.txt  //foo bar foo bar bar bar
>> After first run:
>> 1) foo 2
>> 2) bar 4
>>
>> Save it in output1.txt
>>
>> Now sample_input2.txt //bar hello world
>> Now the result I am looking for is:
>> 1) foo 2
>> 2) bar 5
>> 3) hello 1
>> 4) world 1
>>
>> How do I achieve this in MapReduce?
>>
>>
>
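
For what it's worth, here is a minimal sketch of the "process both files on the second pass" idea from the quoted reply, written in plain Java with no Hadoop dependency so it can be run on its own. The class and method names (IncrementalWordCount, mapPreviousOutput, mapNewText, reduce) are hypothetical, and it assumes the saved output has one "word<whitespace>count" pair per line. In a real job the two map functions would presumably become two Mapper classes, one per input path, feeding the usual summing reducer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a MapReduce job; illustrates the logic only.
public class IncrementalWordCount {

    // "Map" for a line of the previous output, assumed to look like "foo 2".
    // Emits the word with its saved count instead of 1.
    static List<Map.Entry<String, Integer>> mapPreviousOutput(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] parts = line.trim().split("\\s+");
        if (parts.length == 2) {
            out.add(new SimpleEntry<>(parts[0], Integer.parseInt(parts[1])));
        }
        return out;
    }

    // "Map" for a line of new raw text: emits (word, 1) for every word,
    // exactly as the original wordcount mapper does.
    static List<Map.Entry<String, Integer>> mapNewText(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // "Reduce": sums all values seen for each word, regardless of which
    // mapper produced them. TreeMap keeps the keys sorted for printing.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Contents of output1.txt from the first run.
        pairs.addAll(mapPreviousOutput("foo\t2"));
        pairs.addAll(mapPreviousOutput("bar\t4"));
        // Contents of sample_input2.txt.
        pairs.addAll(mapNewText("bar hello world"));
        System.out.println(reduce(pairs)); // {bar=5, foo=2, hello=1, world=1}
    }
}
```

The point of the design is that the reducer does not change at all: the old counts are just treated as pre-summed input. Hadoop's MultipleInputs class is one way to attach a different mapper to each input path, though I have not tested that against the r0.17.0 tutorial linked above.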
bejoy.hadoop@... 2013-01-17, 03:30
jamal sasha 2013-01-17, 06:40