Zookeeper >> mail # user >> Network issues?


Hi,

I have noticed the following pattern in our cluster today:

The leader reports:
 > Unexpected exception causing shutdown while sock still open
 > java.io.EOFException
 > ******* GOODBYE ... ********

All the other servers report:
 > Exception when following the leader
 > java.net.SocketTimeoutException: Read timed out

This is a five-node cluster running ZK 3.3.3 (yes, it's very old, sorry).

It all happened within a second across the whole cluster. Does that
sound like a network issue?

As a side effect, the database got corrupted as well. Does anybody know
whether this is a known issue in 3.3.3? I checked the release notes and JIRA
tickets but didn't find anything that looks like the pattern we saw.

-Gunnar
--
Gunnar Wagenknecht
[EMAIL PROTECTED]
http://wagenknecht.org/