Opened 12 years ago

Closed 11 years ago

#411 closed defect (fixed)

Build slaves should keepalive to indicate activity

Reported by: wbell
Owned by: wbell
Priority: major
Milestone: 0.6
Component: General
Version: 0.6b2
Keywords:
Cc: felix.schwarz@…
Operating System:

Description

The slave_timeout configuration variable limits a build to the specified amount of time. Rather than relying on a fixed timeout, which may not suit every recipe, build slaves should periodically send a heartbeat to the master to indicate their status.

Attachments (1)

0008-Adding-keepalives-to-the-bitten-client-server-protoc.patch (9.8 KB) - added by wbell 11 years ago.
Add keepalives.

Change History (12)

comment:1 Changed 12 years ago by wbell

  • Owner changed from cmlenz to wbell

I have a patch for this. I will try to get it into the mainline.

comment:2 follow-up: Changed 12 years ago by osimons

Also partly related to #380?

comment:3 in reply to: ↑ 2 Changed 12 years ago by wbell

Replying to osimons:

Also partly related to #380?

Yes. One of the things my patch does: when a build that is currently running is invalidated, the slave running it stops at its next keepalive. This seems like the best behavior.
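
For illustration only, the stop-at-next-keepalive behavior described above could be sketched as follows. This is not the attached patch: `run_keepalives`, `send_keepalive`, `stop_build`, and `done` are hypothetical names, and `send_keepalive` is assumed to return True while the master still considers the build valid.

```python
import threading


def run_keepalives(send_keepalive, stop_build, interval, done):
    """Send a keepalive every `interval` seconds until `done` is set.

    All names here are hypothetical. `send_keepalive` is assumed to
    return True while the master still considers the build valid.
    """
    # Event.wait() returns False on timeout and True once `done` is set,
    # so the loop ends promptly when the build finishes on its own.
    while not done.wait(interval):
        if not send_keepalive():
            # The master no longer wants this build (e.g. it was
            # invalidated), so stop working on it.
            stop_build()
            return
```

The slave would run this in a background thread next to the build and set `done` when the build completes, so an invalidated build is noticed within one keepalive interval.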

comment:4 Changed 12 years ago by osimons

  • Milestone changed from 0.6 to 0.6.1

comment:5 Changed 11 years ago by Felix Schwarz <felix.schwarz@…>

  • Cc felix.schwarz@… added

comment:6 Changed 11 years ago by hodgestar

Patch which may resolve this attached to #222.

comment:7 Changed 11 years ago by hodgestar

  • Milestone changed from 0.6.1 to 0.6
  • Operating System BSD deleted
  • Version changed from 0.5.3 to 0.6b2

Changed 11 years ago by wbell

Add keepalives.

comment:8 follow-up: Changed 11 years ago by wbell

The patch on #222 helps somewhat: it measures the invalidation timeout from the last step.

Here's a patch that puts active keepalives in play for the master and slave. With it, the master configuration variable means that a build aborts when its slave hasn't sent a keepalive within the given window. Builds that hang now stay listed as running indefinitely in the results, whereas before they'd exceed the timeout and be dropped from the UI while still running on the slave.

The old behavior would let a bad build chew through all our build slaves, with no way to tell from the UI that they were all stuck.
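
As a rough sketch of the master-side rule described above (abort builds whose slave has been silent for longer than the configured window), something like the following bookkeeping would do. The class and method names are hypothetical and are not taken from the patch; the injectable `clock` exists only to make the sketch easy to test.

```python
import time


class KeepaliveTracker:
    """Hypothetical master-side bookkeeping, not Bitten's actual code."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds allowed between keepalives
        self._clock = clock
        self._last_seen = {}     # build id -> time of the last keepalive

    def start(self, build_id):
        """Begin tracking a build that a slave has just picked up."""
        self._last_seen[build_id] = self._clock()

    def keepalive(self, build_id):
        """Record a keepalive; return False if the build is not tracked."""
        if build_id not in self._last_seen:
            return False
        self._last_seen[build_id] = self._clock()
        return True

    def expire(self):
        """Stop tracking and return builds whose slave has gone silent."""
        now = self._clock()
        stale = [build_id for build_id, seen in self._last_seen.items()
                 if now - seen > self.timeout]
        for build_id in stale:
            del self._last_seen[build_id]
        return stale
```

A build that keeps sending keepalives is never returned by `expire()`, which matches the note above that a hung build whose slave is still alive stays listed as running.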

comment:9 in reply to: ↑ 8 ; follow-up: Changed 11 years ago by osimons

Replying to wbell:

Here's a patch that puts active keepalives in play for the master and slave.

I haven't tried the patch yet, but it reads well, I think. Nice work. In theory, we will not make db or protocol changes to branches after the main release (i.e. no high-impact changes). So, if you want this in for 0.6 then please make sure it is committed in time for an upcoming 0.6 beta/rc release, as I'd like to see it tested across a variety of platforms and Python versions. If not, bump it to the 0.7 milestone.

Had a quick word with hodgestar on IRC about the patch, and we're fine with either target - you decide, Walter.

comment:10 in reply to: ↑ 9 Changed 11 years ago by wbell

Replying to osimons:

Sounds good. I'll commit today. I'll update some of the protocol checking so that you can continue to run an older client against a newer server, and worst case, if there's an issue with the keepalive thread (the most likely source for issues), we'll just roll that portion back and leave the server keepalive-capable.
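
One way the backward-compatibility idea above might look, purely as an assumption: the server only enforces keepalives for clients that advertise a new enough protocol version. `KEEPALIVE_MIN_VERSION` and `expects_keepalives` are made-up names, and the version number is illustrative, not Bitten's.

```python
# Hypothetical protocol number; the real value is not stated in this ticket.
KEEPALIVE_MIN_VERSION = 3


def expects_keepalives(client_version):
    """Return True if a client advertising this version sends keepalives.

    A missing version (None) is treated as an older client, which keeps
    the fixed-timeout behavior instead of being aborted for silence.
    """
    return client_version is not None and client_version >= KEEPALIVE_MIN_VERSION
```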

comment:11 Changed 11 years ago by wbell

  • Resolution set to fixed
  • Status changed from new to closed

Fixed with [863]
