Accumulo, mail # user - Using Accumulo as input to a MapReduce job frequently hangs due to lost Zookeeper connection


Arjumand Bonhomme 2012-08-16, 07:59
Jim Klucar 2012-08-16, 11:22
Re: Using Accumulo as input to a MapReduce job frequently hangs due to lost Zookeeper connection
Adam Fuchs 2012-08-16, 13:32
That was going to be my suggestion as well, except the ZooKeeper property
is maxClientCnxns.
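
A quick way to confirm what a given zoo.cfg actually sets is to load it as Java properties, which is the format ZooKeeper itself reads. The sketch below is only illustrative: the class name ZooCfgCheck, the helper method, and the sample file contents are made up for this example, and the value 100 is simply the figure suggested in the quoted message. It returns -1 when the property is absent rather than guessing at the server default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ZooCfgCheck {

    /**
     * Returns the maxClientCnxns value found in the given zoo.cfg text,
     * or -1 if the property is not set (the server default then applies).
     */
    public static int maxClientCnxns(String zooCfgText) {
        Properties props = new Properties();
        try {
            // zoo.cfg uses key=value lines, so Properties can parse it directly.
            props.load(new StringReader(zooCfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("maxClientCnxns");
        return value == null ? -1 : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        // Hypothetical zoo.cfg contents, inlined so the example is self-contained.
        String sample = "tickTime=2000\n"
                + "dataDir=/var/zookeeper\n"
                + "clientPort=2181\n"
                + "maxClientCnxns=100\n";
        System.out.println("maxClientCnxns = " + maxClientCnxns(sample));
    }
}
```

To check a real file, read its contents into a string (for example with java.nio.file.Files) and pass that to maxClientCnxns. ZooKeeper has to be restarted to pick up a changed zoo.cfg.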

Cheers,
Adam
On Aug 16, 2012 7:22 AM, "Jim Klucar" <[EMAIL PROTECTED]> wrote:

> Just shooting from the hip here.
>
> Zookeeper maxclientcxns in zoo.cfg should be increased from the default to
> something like 100. Check the zookeeper log file to see if it is shutting
> down connections.
>
> Check what your max open files setting is for your OS with 'ulimit -n'
> and increase it if necessary.
>
> On Aug 16, 2012, at 4:00 AM, Arjumand Bonhomme <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I'm fairly new to both Accumulo and Hadoop, so I think my problem may be
> due to poor configuration on my part, but I'm running out of ideas.
>
> I'm running this on a mac laptop, with hadoop (hadoop-0.20.2 from cdh3u4)
> in pseudo-distributed mode.
> zookeeper version zookeeper-3.3.5 from cdh3u4
> I'm using the 1.4.1 release of accumulo with a configuration copied from
> "conf/examples/512MB/standalone"
>
> I've got a Map task that is using an accumulo table as the input.
> I'm fetching all rows, but just a single column family, that has hundreds
> or even thousands of different column qualifiers.
> The table has a SummingCombiner installed for the given column family.
>
> The task runs fine at first, but after ~9-15K records (I print the record
> count to the console every 1K records), it hangs and the following messages
> are printed to the console where I'm running the job:
> 12/08/16 02:57:08 INFO zookeeper.ClientCnxn: Unable to read additional
> data from server sessionid 0x1392cc35b460d1c, likely server has closed
> socket, closing socket connection and attempting reconnect
> 12/08/16 02:57:08 INFO zookeeper.ClientCnxn: Opening socket connection to
> server localhost/fe80:0:0:0:0:0:0:1%1:2181
> 12/08/16 02:57:08 INFO zookeeper.ClientCnxn: Socket connection established
> to localhost/fe80:0:0:0:0:0:0:1%1:2181, initiating session
> 12/08/16 02:57:08 INFO zookeeper.ClientCnxn: Unable to reconnect to
> ZooKeeper service, session 0x1392cc35b460d1c has expired, closing socket
> connection
> 12/08/16 02:57:08 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/08/16 02:57:10 INFO zookeeper.ZooKeeper: Initiating client connection,
> connectString=localhost sessionTimeout=30000
> watcher=org.apache.accumulo.core.zookeeper.ZooSession$AccumuloWatcher@32f5c51c
> 12/08/16 02:57:10 INFO zookeeper.ClientCnxn: Opening socket connection to
> server localhost/0:0:0:0:0:0:0:1:2181
> 12/08/16 02:57:10 INFO zookeeper.ClientCnxn: Socket connection established
> to localhost/0:0:0:0:0:0:0:1:2181, initiating session
> 12/08/16 02:57:10 INFO zookeeper.ClientCnxn: Session establishment
> complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid
> 0x1392cc35b460d25, negotiated timeout = 30000
> 12/08/16 02:57:11 INFO mapred.LocalJobRunner:
> 12/08/16 02:57:14 INFO mapred.LocalJobRunner:
> 12/08/16 02:57:17 INFO mapred.LocalJobRunner:
>
> Sometimes the messages contain a stacktrace like this below:
> 12/08/16 01:57:40 WARN zookeeper.ClientCnxn: Session 0x1392cc35b460b40 for
> server localhost/fe80:0:0:0:0:0:0:1%1:2181, unexpected error, closing
> socket connection and attempting reconnect
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:166)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:245)
>     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:856)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1154)
> 12/08/16 01:57:40 INFO zookeeper.ClientCnxn: Opening socket connection to
> server localhost/127.0.0.1:2181
> 12/08/16 01:57:40 INFO zookeeper.ClientCnxn: Socket connection established
> to localhost/127.0.0.1:2181, initiating session
> 12/08/16 01:57:40 INFO zookeeper.ClientCnxn: Unable to reconnect to

William Slacum 2012-08-16, 11:24
Arjumand Bonhomme 2012-08-16, 18:36
John Vines 2012-08-16, 19:24
Arjumand Bonhomme 2012-08-16, 19:48
Arjumand Bonhomme 2012-08-17, 02:10
Arjumand Bonhomme 2012-08-20, 17:00
Keith Turner 2012-08-20, 17:34
David Medinets 2012-08-21, 00:26
Keith Turner 2012-08-21, 12:23
ameet kini 2012-10-10, 14:22
Billie Rinaldi 2012-10-11, 18:57
ameet kini 2012-10-17, 14:10
ameet kini 2012-10-17, 14:13
David Medinets 2012-08-17, 02:33
Arjumand Bonhomme 2012-08-17, 03:14