

Re: Map phase hanging for wordcount example
The wordcount example, by default, will run a single reducer. This is
controllable by passing -Dmapred.reduce.tasks=2 to your launcher. The
following will work:

hadoop jar hadoop-examples.jar wordcount -Dmapred.reduce.tasks=2 input output

Note that just because a cluster has N nodes, it does not follow that
N reducers will run. The reducer count does not depend on cluster size;
it is simply a user-configurable number with a default value of 1.
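
If you would rather not pass the flag on every run, the same property
can be given a default in the job client's configuration. A minimal
sketch, assuming a setup of this era where job defaults are read from
mapred-site.xml; the value 2 is only an example:

```xml
<!-- mapred-site.xml, inside the <configuration> element -->
<property>
  <name>mapred.reduce.tasks</name>
  <!-- example value; pick whatever suits the job -->
  <value>2</value>
</property>
```

A -D value on the command line should still override this per job,
unless the property has been marked final.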

On Tue, Sep 6, 2011 at 3:23 PM, john smith <[EMAIL PROTECTED]> wrote:
> Yep, it works. I just synced the /etc/hosts files without changing any other
> configs, and now it's working fine. Thanks for the help, Harsh. Sorry for
> spamming without checking my TT logs properly.
>
> Also, one more question: any idea why it's scheduling only a single reducer? I have
> 2 datanodes and I am expecting it to run 2 reducers (data size of 500 MB).
>
> Any hints?
>
>
> On Tue, Sep 6, 2011 at 3:17 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
>> John,
>>
>> Yes, looks like your slave nodes aren't able to properly resolve some
>> hostnames. Hadoop requires a sane network setup to work properly.
>> Also, yes, you need to use a hostname for your fs.default.name and
>> other configs to the extent possible.
>>
>> The easiest way is to keep a properly synchronized /etc/hosts file.
>>
>> For example, it may look like so, on all machines:
>>
>> 127.0.0.1 localhost.localdomain localhost
>> 192.168.0.1 master.hadoop master
>> 192.168.0.2 slave3.hadoop slave3
>> (and so on…)
>>
>> (This way the master can resolve the slaves, and the slaves can resolve
>> the master. If you have the time, set up DNS; it's the best thing to do.)
>>
>> Then, in core-site.xml you'll need:
>>
>> fs.default.name = hdfs://master
>>
>> And in mapred-site.xml:
>>
>> mapred.job.tracker = master:8021
>>
>> That should do it, so long as the slave hosts can freely access the
>> master hosts (no blockage of ports via firewall and such).
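>>
>> Written out in the XML form the site files use, the two settings above
>> would look roughly like this. This is only a sketch: each property goes
>> inside the existing <configuration> element of its file, and "master"
>> is the example hostname from the /etc/hosts listing.
>>
>> ```xml
>> <!-- core-site.xml -->
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://master</value>
>> </property>
>>
>> <!-- mapred-site.xml -->
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>master:8021</value>
>> </property>
>> ```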
>>
>> On Tue, Sep 6, 2011 at 3:05 PM, john smith <[EMAIL PROTECTED]> wrote:
>> > Hey, my TT logs show this:
>> >
>> > 2011-09-06 13:22:41,860 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.UnknownHostException: unknown host: rip-pc.local
>> > at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
>> > at org.apache.hadoop.ipc.Client.getConnection(Client.java:853)
>> > at org.apache.hadoop.ipc.Client.call(Client.java:723)
>> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>> > at $Proxy5.getProtocolVersion(Unknown Source)
>> > at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>> > at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>> > at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>> > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>> >
>> >
>> > Maybe there is some error in the configs? I am using IPs in the conf files.
>> > Should I put entries in the /etc/hosts files?
>> >
>> > On Tue, Sep 6, 2011 at 3:00 PM, john smith <[EMAIL PROTECTED]> wrote:
>> >
>> >> Hi Harsh,
>> >>
>> >> My jt log : http://pastebin.com/rXAEeDkC
>> >>
>> >> I have some startup exceptions (which don't matter much, I guess), but
>> >> the tail indicates that it's locating the splits correctly and then it hangs!
>> >>
>> >> Any idea?
>> >>
>> >> Thanks
>> >>
>> >>
>> >> On Tue, Sep 6, 2011 at 1:30 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>> >>
>> >>> I'd check the tail of JobTracker logs after a submit is done to see if
>> >>> an error/warn there is causing this. And then dig further on
>> >>> why/what/how.
>> >>>
>> >>> Hard to tell what your problem specifically is without logs :)

Harsh J