Opened 15 years ago
Closed 15 years ago
#411 closed defect (fixed)
Build slaves should keepalive to indicate activity
| Reported by: | wbell | Owned by: | wbell |
|---|---|---|---|
| Priority: | major | Milestone: | 0.6 |
| Component: | General | Version: | 0.6b2 |
| Keywords: | | Cc: | felix.schwarz@… |
| Operating System: | | | |
Description
The slave_timeout configuration variable limits a build to the specified amount of time. A fixed timeout may not work well for all recipes, so build slaves should instead periodically send a heartbeat to the master indicating their status.
Attachments (1)
Change History (12)
comment:1 Changed 15 years ago by wbell
- Owner changed from cmlenz to wbell
comment:2 follow-up: ↓ 3 Changed 15 years ago by osimons
Also partly related to #380?
comment:3 in reply to: ↑ 2 Changed 15 years ago by wbell
comment:4 Changed 15 years ago by osimons
- Milestone changed from 0.6 to 0.6.1
comment:5 Changed 15 years ago by Felix Schwarz <felix.schwarz@…>
- Cc felix.schwarz@… added
comment:6 Changed 15 years ago by hodgestar
Patch which may resolve this attached to #222.
comment:7 Changed 15 years ago by hodgestar
- Milestone changed from 0.6.1 to 0.6
- Operating System BSD deleted
- Version changed from 0.5.3 to 0.6b2
comment:8 follow-up: ↓ 9 Changed 15 years ago by wbell
The patch on #222 helps somewhat: it measures the invalidation timeout from the last step.
Here's a patch that puts active keepalives in play for the master and slave. With it, the master configuration variable means that a build is aborted if its slave hasn't sent a keepalive within the given window. Builds that hang now stay listed as running in the results indefinitely, whereas before they would exceed the timeout and be dropped from the UI while still running on the slave.
The old behavior would chew through all our build slaves with a bad build, and there'd be no way to tell from the UI that they were all stuck.
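As a rough illustration of the master-side behavior described in this comment, here is a minimal sketch that records the last keepalive per build and reports builds whose slave has been silent for longer than the configured window. This is not the code from the attached patch: the class `KeepaliveMonitor`, its method names, and the injectable `clock` parameter are all hypothetical.

```python
import time


class KeepaliveMonitor:
    """Tracks the last keepalive per build (hypothetical helper, not Bitten's API)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds allowed between keepalives
        self._clock = clock      # injectable so the logic can be tested
        self._last_seen = {}     # build id -> time of last keepalive

    def start(self, build_id):
        # A build that has just started counts as freshly seen.
        self._last_seen[build_id] = self._clock()

    def keepalive(self, build_id):
        # Ignore keepalives for builds that are unknown or already finished.
        if build_id not in self._last_seen:
            return False
        self._last_seen[build_id] = self._clock()
        return True

    def finish(self, build_id):
        self._last_seen.pop(build_id, None)

    def stale_builds(self):
        # Builds whose slave has been silent longer than the window;
        # the master would abort these.
        now = self._clock()
        return sorted(b for b, seen in self._last_seen.items()
                      if now - seen > self.timeout)
```

Under this scheme a hung build whose slave keeps sending keepalives is never reported as stale, which matches the note above that such builds remain visible as running.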
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 10 Changed 15 years ago by osimons
Replying to wbell:
Here's a patch that puts active keepalives in play for the master and slave.
I haven't tried the patch yet, but it reads well I think. Nice work. In theory, we will not make db or protocol changes to branches after the main release (i.e. no high impact changes). So, if you want this in for 0.6 then please make sure it is committed in time for an upcoming 0.6 beta/rc release as I'd like to see it tested across a variety of platforms and Python versions. If not, bump it to the 0.7 milestone.
Had a quick word with hodgestar on IRC about the patch, and we're fine with either target - you decide, Walter.
comment:10 in reply to: ↑ 9 Changed 15 years ago by wbell
Replying to osimons:
Replying to wbell:
Here's a patch that puts active keepalives in play for the master and slave.
I haven't tried the patch yet, but it reads well I think. Nice work. In theory, we will not make db or protocol changes to branches after the main release (i.e. no high impact changes). So, if you want this in for 0.6 then please make sure it is committed in time for an upcoming 0.6 beta/rc release as I'd like to see it tested across a variety of platforms and Python versions. If not, bump it to the 0.7 milestone.
Had a quick word with hodgestar on IRC about the patch, and we're fine with either target - you decide, Walter.
Sounds good. I'll commit today. I'll update some of the protocol checking so that you can continue to run an older client against a newer server. Worst case, if there's an issue with the keepalive thread (the most likely source of issues), we'll just roll that portion back and leave the server keepalive-capable.
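For readers unfamiliar with the pattern, a slave-side keepalive thread might look roughly like the sketch below. It is written in modern Python for illustration and is not the code committed for this ticket. The class name `KeepaliveThread` and the `send` callable (standing in for whatever request the slave makes to the master) are assumptions.

```python
import threading


class KeepaliveThread(threading.Thread):
    """Calls send() every `interval` seconds until stopped (hypothetical sketch)."""

    def __init__(self, send, interval):
        super().__init__(daemon=True)
        self._send = send                 # callable that pings the master
        self._interval = interval         # seconds between keepalives
        self._stop_event = threading.Event()
        self.failed = None                # first exception raised by send(), if any

    def run(self):
        # Event.wait() returns False on timeout and True once stop() is called,
        # so the loop wakes up promptly when the build finishes.
        while not self._stop_event.wait(self._interval):
            try:
                self._send()
            except Exception as exc:
                # Record the failure and stop pinging rather than crash the build.
                self.failed = exc
                break

    def stop(self):
        self._stop_event.set()
        self.join()
```

The slave would start the thread when a build begins and call `stop()` when it ends. Keeping the keepalive logic in its own thread is also what would make it possible to back out that portion alone, as mentioned above.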
comment:11 Changed 15 years ago by wbell
- Resolution set to fixed
- Status changed from new to closed
Fixed with [863]
I have a patch for this. Will attempt to get into the mainline.