

Re: pauses during startup (maybe network related?)
You are spot on about the DNS lookup slowing things down. I've faced
the same issue (before I had a local network DNS set up for the WiFi
network I use).

> but I'm still more just miffed at how it's knowing I'm a 192 address when I told it to use localhost.

There are a few additional configs you need to change to make a
perfect localhost setup. Otherwise, there are defaults in Apache
Hadoop that bind to 0.0.0.0 and report the current system hostname
(which changes if you get onto a network), causing what you're seeing.
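
To see what the machine currently reports as its hostname, and how long the forward and reverse lookups take, something like the following can help. It is only a rough sketch: it goes through Python's resolver rather than the JVM's, so it approximates what the Hadoop daemons experience rather than reproducing it, and the helper names `timed` and `report` are made up for this example.

```python
import socket
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds).

    Resolver failures (OSError and its subclasses, such as
    socket.gaierror and socket.herror) are returned as the result
    instead of being raised, so a failed lookup is still timed.
    """
    start = time.perf_counter()
    try:
        value = fn(*args)
    except OSError as exc:
        value = exc
    return value, time.perf_counter() - start


def report():
    """Time the hostname, forward and reverse lookups on this machine."""
    results = []

    hostname, elapsed = timed(socket.gethostname)
    results.append(("hostname", hostname, elapsed))

    # Forward lookup: which address does the current hostname map to?
    addr, elapsed = timed(socket.gethostbyname, hostname)
    results.append(("forward lookup", addr, elapsed))

    # Reverse lookup: only attempted if the forward lookup succeeded.
    if isinstance(addr, str):
        rev, elapsed = timed(socket.gethostbyaddr, addr)
        results.append(("reverse lookup", rev, elapsed))

    return results
```

Running `for label, value, secs in report(): print(label, value, round(secs, 3))` with the network plugged in and then unplugged should show whether a lookup is where the time goes. A reverse lookup that takes many seconds would be consistent with the pauses described in this thread.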

On Fri, May 24, 2013 at 7:42 AM, Ted <[EMAIL PROTECTED]> wrote:
> thanks, I'm almost 100% sure it's network related now.
>
> What I tested was unplugging my network :), the entire system starts in
> just a few seconds.
>
> I decided to search on "reverse dns" in google and I see other people
> have complained about very slow reverse dns lookups (some related to
> hadoop / hbase too).
>
> I'm not sure why this is happening yet though. I thought 127.0.0.1 or
> localhost would have just resolved instantly - but it appears it's
> somehow finding my real IP instead, i.e. 192.168.1.5 seems to show up
> in the log entries even though all my configurations say
> localhost/127.0.0.1 and my /etc/hosts file has an entry for
> localhost/127.0.0.1
>
> I think if I make a /etc/hosts entry for 192.168.1.5 everything will
> be quick, that's what I'm going to test later. The only problem is I'm
> on a dynamic IP... I've considered just making entries for all
> reasonable permutations like 192.168.1.1 through 192.168.1.20... but
> I'm still more just miffed at how it's knowing I'm a 192 address when
> I told it to use localhost.
>
> On 5/24/13, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>> Hi Ted,
>>
>> 2013-05-23 19:28:19,937 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>> occuring more than 10 times
>> ...
>> 2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 28 on 9000: starting
>>
>> There are a couple of relevant activities that happen during namenode
>> startup in between these 2 log statements.  It loads the current fsimage
>> (persistent copy of file system metadata), merges in the edits log
>> (transaction log containing all file system metadata changes since the last
>> checkpoint), and then saves back a new fsimage file after that merge.
>> Current versions of the Hadoop codebase will print some information to
>> logs about the volume of activity during this checkpointing process, so I
>> recommend looking for that in your logs to see if this explains it.
>> Depending on whether or not you have a large number of transactions
>> queued since your last checkpoint, this whole process can cause namenode
>> startup to take several minutes.
>>
>> If this becomes a regular problem, then you can run SecondaryNameNode or
>> BackupNode to perform periodic checkpoints in addition to the checkpoint
>> that occurs on namenode restart.  This is probably overkill for a dev
>> environment on your laptop though.
>>
>> Hope this helps,
>>
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>>
>>
>>
>> On Thu, May 23, 2013 at 2:49 AM, Ted <[EMAIL PROTECTED]> wrote:
>>
>>> Hi I'm running hadoop on my local laptop for development and
>>> everything "works" but there are some annoying pauses during the startup
>>> which cause the entire hadoop startup process to take up to 4 minutes
>>> and I'm wondering what it is and if I can do anything about it.
>>>
>>> I'm running everything on 1 machine, on Fedora Linux, hadoop-1.1.2,
>>> Oracle jdk1.7.0_17, the machine is a dual core i5, and I have 8gb of
>>> ram and an SSD so it shouldn't be slow.
>>>
>>> When the system pauses, there is no cpu usage, no disk usage and no
>>> network usage (although I suspect it's waiting for the network to
>>> resolve or return something).
>>>
>>> Here's some snippets from the namenode logs during startup where you
>>> can see it just pauses for around 30 seconds or more without errors
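
Pauses like these can be located mechanically by comparing the timestamps of consecutive log lines. The sketch below assumes each log entry starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp, as in the namenode lines quoted in this thread; the function names and the 5 second default threshold are arbitrary choices for the example.

```python
from datetime import datetime

# Timestamp layout at the start of each log entry, e.g. "2013-05-23 19:28:19,937"
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def find_gaps(lines, threshold_seconds=5.0):
    """Return (gap_seconds, line_before, line_after) for each long pause.

    Lines without a leading timestamp (wrapped text, stack traces)
    are skipped.
    """
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= threshold_seconds:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps


sample = [
    "2013-05-23 19:28:19,937 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times",
    "...",
    "2013-05-23 19:28:26,801 INFO org.apache.hadoop.ipc.Server: IPC Server handler 28 on 9000: starting",
]
for seconds, before, after in find_gaps(sample):
    print("%.3f s pause after: %s" % (seconds, before))
```

For a real log, pass the open file instead of `sample`, for example `find_gaps(open(path))`, where `path` points at the namenode log.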

Harsh J