Edgewall Software

Ticket #411 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

Build slaves should keepalive to indicate activity

Reported by: wbell
Owned by: wbell
Priority: major
Milestone: 0.6
Component: General
Version: 0.6b2
Keywords:
Cc: felix.schwarz@…
Operating System:

Description

The slave_timeout configuration variable limits a build to the specified amount of time. Rather than relying on a fixed timeout, which may not suit every recipe, build slaves should periodically send a heartbeat to the master to indicate their status.

Attachments

0008-Adding-keepalives-to-the-bitten-client-server-protoc.patch Download (9.8 KB) - added by wbell 3 years ago.
Add keepalives.

Change History

  Changed 4 years ago by wbell

  • owner changed from cmlenz to wbell

I have a patch for this. Will attempt to get into the mainline.

follow-up: ↓ 3   Changed 4 years ago by osimons

Also partly related to #380?

in reply to: ↑ 2   Changed 4 years ago by wbell

Replying to osimons:

Also partly related to #380?

Yes. One thing my patch does: when a build that is in progress gets invalidated, the slave running it stops the next time it sends a keepalive. This seems like the best behavior.
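
For illustration only, here is a minimal sketch of the slave-side behavior described above. It is not the code from the attached patch. The class name `KeepAliveThread`, the `send_keepalive` callable and the `interval` parameter are all hypothetical, and the sketch assumes the master's reply to a keepalive tells the slave whether the build is still valid.

```python
import threading


class KeepAliveThread(threading.Thread):
    """Hypothetical sketch: ping the master periodically while a build runs."""

    def __init__(self, send_keepalive, interval=1.0):
        super().__init__(daemon=True)
        # Assumed contract: send_keepalive() returns True while the master
        # still considers the build valid, False once it was invalidated.
        self.send_keepalive = send_keepalive
        self.interval = interval
        self.build_invalidated = threading.Event()
        self._stop_requested = threading.Event()

    def run(self):
        # wait() returns False on timeout, i.e. when it is time for the
        # next keepalive, and True as soon as stop() has been called.
        while not self._stop_requested.wait(self.interval):
            if not self.send_keepalive():
                # The build step loop can poll this event and abort.
                self.build_invalidated.set()
                return

    def stop(self):
        """Called by the slave when the build finishes normally."""
        self._stop_requested.set()
```

In this sketch the build loop would check `build_invalidated.is_set()` between steps and stop early, which matches the "stops when it next keepalives" behavior only approximately: a step that is already running would still have to finish or be interrupted separately.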

  Changed 4 years ago by osimons

  • milestone changed from 0.6 to 0.6.1

  Changed 3 years ago by Felix Schwarz <felix.schwarz@…>

  • cc felix.schwarz@… added

  Changed 3 years ago by hodgestar

Patch which may resolve this attached to #222.

  Changed 3 years ago by hodgestar

  • version changed from 0.5.3 to 0.6b2
  • os BSD deleted
  • milestone changed from 0.6.1 to 0.6

Changed 3 years ago by wbell

Add keepalives.

follow-up: ↓ 9   Changed 3 years ago by wbell

The patch on #222 helps somewhat: it measures the invalidation timeout from the last step.

Here's a patch that puts active keepalives in play for the master and slave. With it, the master configuration variable means that a build aborts when its slave hasn't sent a keepalive within the given window. Builds that hang now stay shown as running in the results indefinitely, whereas before they would exceed the timeout and be dropped from the UI while still running on the slave.

The old behavior would chew through all our build slaves with a single bad build, and there was no way to tell from the UI that they were all stuck.
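
As a rough sketch of the master-side check described in this comment (again, not the patch itself), the master could record the time of each keepalive and treat a build as abortable once nothing has arrived within the configured window. `KeepAliveTracker`, its methods and the injectable `clock` are hypothetical names; only the idea of a `slave_timeout` window comes from the ticket.

```python
import time


class KeepAliveTracker:
    """Hypothetical sketch: track the last keepalive seen for each build."""

    def __init__(self, slave_timeout, clock=time.monotonic):
        self.slave_timeout = slave_timeout  # window in seconds
        self.clock = clock                  # injectable for testing
        self.last_seen = {}                 # build id -> time of last keepalive

    def record_keepalive(self, build_id):
        self.last_seen[build_id] = self.clock()

    def stale_builds(self):
        """Return ids of builds whose slave has been silent for too long."""
        now = self.clock()
        return sorted(
            build_id
            for build_id, seen in self.last_seen.items()
            if now - seen > self.slave_timeout
        )
```

A build whose slave keeps sending keepalives never appears in `stale_builds()`, however long it runs, so a hung but still-connected build stays visible as running instead of silently disappearing.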

in reply to: ↑ 8 ; follow-up: ↓ 10   Changed 3 years ago by osimons

Replying to wbell:

Here's a patch that puts active keepalives in play for the master and slave.

I haven't tried the patch yet, but it reads well I think. Nice work. In theory, we will not make db or protocol changes to branches after the main release (i.e. no high-impact changes). So, if you want this in for 0.6, then please make sure it is committed in time for an upcoming 0.6 beta/rc release, as I'd like to see it tested across a variety of platforms and Python versions. If not, bump it to the 0.7 milestone.

Had a quick word with hodgestar on IRC about the patch, and we're fine with either target - you decide, Walter.

in reply to: ↑ 9   Changed 3 years ago by wbell

Replying to osimons:

Replying to wbell:

Here's a patch that puts active keepalives in play for the master and slave.

I haven't tried the patch yet, but it reads well I think. Nice work. In theory, we will not make db or protocol changes to branches after the main release (i.e. no high-impact changes). So, if you want this in for 0.6, then please make sure it is committed in time for an upcoming 0.6 beta/rc release, as I'd like to see it tested across a variety of platforms and Python versions. If not, bump it to the 0.7 milestone. Had a quick word with hodgestar on IRC about the patch, and we're fine with either target - you decide, Walter.

Sounds good. I'll commit today. I'll update some of the protocol checking so that you can continue to run an older client against a newer server, and worst case, if there's an issue with the keepalive thread (the most likely source for issues), we'll just roll that portion back and leave the server keepalive-capable.
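
One way to picture the compatibility check mentioned here, as a sketch under stated assumptions: the server only enforces keepalives for clients that announce a new enough protocol version. The constant `KEEPALIVE_MIN_PROTOCOL`, its value and the function `expects_keepalives` are made up for illustration; the ticket does not say how the real protocol check is implemented.

```python
# Hypothetical version number: the first protocol revision with keepalives.
KEEPALIVE_MIN_PROTOCOL = 3


def expects_keepalives(client_protocol):
    """Return True if the master should enforce keepalives for this client.

    `client_protocol` is the integer protocol version the client announced,
    or None if the client is too old to announce one (an assumption).
    """
    if client_protocol is None:
        return False
    return client_protocol >= KEEPALIVE_MIN_PROTOCOL
```

With a check like this, an older client would fall back to the previous fixed-timeout handling, while the server stays keepalive-capable for newer clients.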

  Changed 3 years ago by wbell

  • status changed from new to closed
  • resolution set to fixed

Fixed with [863]
