HBase, mail # dev - The Jenkins VMs are increasingly slow / overloaded


Re: The Jenkins VMs are increasingly slow / overloaded
Andrew Purtell 2013-04-05, 23:48
Also, be careful to differentiate between slaves that are "offline" because
they are in the process of being launched, and those that are offline
because of that bug I mentioned. (It doesn't happen often but does happen.)
If you kill an "offline" slave being launched, this will just cause churn.
And if this seems like something you don't want to bother with, then just
don't worry about it.

On Fri, Apr 5, 2013 at 4:44 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:

> This is a bug in the EC2 module for Jenkins. There are other bugs which
> this one fixes so it's not a big deal relative to those. You have an
> account on this system. You can easily go on and delete the slaves which
> end up in offline state.
>
>
> On Fri, Apr 5, 2013 at 4:39 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> Looks like 4 EC2 Jenkins slaves are offline at the moment ...
>>
>>
>> On Wed, Mar 27, 2013 at 1:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>>
>> > Looks like Apache Jenkins went off several times this week.
>> >
>> > Is it difficult to hook up patching test with the new Jenkins ?
>> >
>> > Thanks
>> >
>> >
>> > On Wed, Mar 27, 2013 at 7:49 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> >
>> >> True, but unlike 0.94 the state of 0.95 and trunk is impacted by Stack's
>> >> wrangling with Maven to find a sane site and assembly, a number of build
>> >> failures are due to that. Also you'll note that prior to yesterday the
>> >> Linux OOM killer was nuking the bloated Maven processes on the build
>> >> slaves. Let's give these builds a bit of time for this stuff to get sorted
>> >> out. The failures in 0.94 seem immediately actionable.
>> >>
>> >>
>> >> On Wed, Mar 27, 2013 at 3:38 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> >>
>> >> > Trunk and 0.95 builds are not in good shape.
>> >> > 0.95 builds have failed 32 times.
>> >> >
>> >> > On Apache Jenkins, looks like TestAssignmentManagerOnCluster has failed
>> >> > quite often for 0.95 and trunk builds.
>> >> >
>> >> > On Wed, Mar 27, 2013 at 7:18 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> > > In general moving from using the m1.large (2 vcores, 7.5 GB RAM) to the
>> >> > > m1.xlarge (4 vcores, 15 GB RAM) instance type for the slaves helped with a
>> >> > > build/test timeout, so now I'd about claim the test environment is sane. We
>> >> > > are now seeing that replication tests are flapping, occasionally timing out
>> >> > > internally:
>> >> > >
>> >> > > See
>> >> > >
>> >> > > http://54.241.6.143/job/HBase-0.94/org.apache.hbase$hbase/24/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationQueueFailoverCompressed/queueFailover/
>> >> > >
>> >> > > and
>> >> > >
>> >> > > http://54.241.6.143/job/HBase-0.94-Security/org.apache.hbase$hbase/7/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationQueueFailover/queueFailover/
>> >> > >
>> >> > >
>> >> > > The 0.94 and 0.94-security builds are alternating between green and red as
>> >> > > a result.
>> >> > >
>> >> > > Perhaps we should reopen/revisit either adjusting the internal timeouts for
>> >> > > these tests or the other JIRA about moving minicluster replication tests to
>> >> > > hbase-it.
>> >> > >
>> >> > >
>> >> > > On Wed, Mar 27, 2013 at 1:49 AM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:
>> >> > >
>> >> > > > On Tue, Mar 26, 2013 at 1:28 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> >> > > >
>> >> > > > > The HBase 0.94 build is now testing green!
>> >> > > > > http://54.241.6.143/job/HBase-0.94/
>> >> > > > >
>> >> > > >
>> >> > > > ^5!
>> >> > > >
>> >> > > > On Tue, Mar 26, 2013 at 1:47 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> >> > > > >
>> >> > > > > > I found that Maven was being killed on the slaves by the Linux OOM killer
>> >> > > > >
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)