Zookeeper, mail # dev - Update on my 1270 testing


Re: Update on my 1270 testing
Camille Fournier 2011-11-08, 19:01
Anyone know why Patrick's log file might be showing a lot of this
before the error?

2011-11-06 01:02:39,905 [myid:2] - INFO
[Thread-76:NIOServerCnxn$StatCommand@655] - Stat command output

This test never makes a stat call; it uses a ZK client to connect.
This seems strange. Perhaps it is a test setup issue?
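
One quick way to see how often this shows up is to count the matching lines in the attached log. This is only a minimal sketch, assuming the log has been gunzipped to a local file; the class and method names are hypothetical, and only the marker string is taken from the log line quoted above.

```java
import java.util.List;

public class StatLineCounter {
    // Marker copied from the quoted log line; everything else here is illustrative.
    static final String MARKER = "NIOServerCnxn$StatCommand";

    // Returns the number of log lines that mention the stat command handler.
    static long countStatLines(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains(MARKER))
                .count();
    }
}
```

Feeding it the result of `java.nio.file.Files.readAllLines(...)` on the uncompressed log would give the count. A large number would suggest something outside the test itself (a monitoring script, say) is issuing the four-letter stat command against the test ports.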

C

On Mon, Nov 7, 2011 at 6:23 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote:
> That's fine (direction re 1-4). However my CI branch 3.4 build failed
> over the w/e (once out of four runs). This is AFTER "Preparing for
> release 3.4.0 - take 2" was applied (so testing includes 1270, 1264,
> etc...)
>
> Notice testEarlyLeaderAbandonment is failing. I have attached the log
> file to ZOOKEEPER-1270 JIRA:
> https://issues.apache.org/jira/secure/attachment/12502838/testEarlyLeaderAbandonment5.txt.gz
>
> java.lang.RuntimeException: Waiting too long
>        at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.waitForAll(QuorumPeerMainTest.java:324)
>        at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testEarlyLeaderAbandonment(QuorumPeerMainTest.java:195)
>        at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
>
> Should I reopen 1270, or a new jira, or... ? LMK.
>
> Note - I'm feeling quite ill so I have limited time to provide f/b &
> test for the next day or so.
>
> Patrick
>
> On Sat, Nov 5, 2011 at 12:22 PM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
>> I'm fine with your proposal. -Flavio
>>
>> On Nov 5, 2011, at 8:15 PM, Camille Fournier wrote:
>>
>>> 2 has been flaky for so long, not sure whether it's worth being a blocker.
>>> The AsyncHammerTests never pass for me locally. Not sure if it's a
>>> problem or not... I am tempted to go with Mahadev on this and get this
>>> 3.4 release out the door. I would be happy to help manage a 3.4.1
>>> release soon thereafter if we find serious issues.
>>>
>>> C
>>>
>>> On Sat, Nov 5, 2011 at 3:01 PM, Flavio Junqueira <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>> If 2) is flaky, we need to fix it, no?
>>>>
>>>> -Flavio
>>>>
>>>> On Nov 5, 2011, at 6:14 PM, Patrick Hunt wrote:
>>>>
>>>>> I ran the 1270-1194 patch continually overnight (trunk) in my CI env.
>>>>> After ~25 test runs I saw 4 failures:
>>>>>
>>>>> 1) #402 - QuorumTest.testFollowersStartAfterLeader
>>>>> 2) #407 - org.apache.zookeeper.test.FLETest.testLE
>>>>> 3) #410 - org.apache.zookeeper.test.AsyncHammerTest.testHammer
>>>>> 4) #415 - org.apache.zookeeper.test.AsyncHammerTest.testHammer
>>>>>
>>>>> 1) client could not connect to reestablished quorum: giving up after
>>>>> 30+ seconds.
>>>>> 2) known flaky test
>>>>> 3) QP failed to shutdown in 30 seconds:
>>>>> QuorumPeer[myid=3]0.0.0.0/0.0.0.0:11224
>>>>> 4) QP failed to shutdown in 30 seconds:
>>>>> QuorumPeer[myid=1]0.0.0.0/0.0.0.0:11222
>>>>>
>>>>> On the plus side, no testEarlyLeaderAbandonment failures.
>>>>>
>>>>> On the minus side, 3 and 4 are a bit worrisome. Searching back through all
>>>>> my previous failures I don't see this happening. Perhaps these changes
>>>>> have shifted some timing? My main concern is that this might be caused
>>>>> directly by the patch itself....
>>>>>
>>>>> Patrick
>>>>
>>>> flavio
>>>> junqueira
>>>>
>>>> research scientist
>>>>
>>>> [EMAIL PROTECTED]
>>>> direct +34 93-183-8828
>>>>
>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>>>> phone (408) 349 3300    fax (408) 349 3301
>>>>
>>>>
>>
>