Edgewall Software

Opened 15 years ago

Last modified 13 years ago

#513 new enhancement

Bitten slave exits too easily in case of an intermittent issue

Reported by: dobesv@…
Owned by:
Priority: minor
Milestone: 0.6.1
Component: Build slave
Version: 0.6b2
Keywords:
Cc:
Operating System: Windows

Description

The bitten slave exits if it has any connection issues with the server; for example:

[INFO    ] Build step checkout completed successfully
[DEBUG   ] Sending POST request to 'https://seed2.projectlocker.com/habitsoft/books/trac/builds/38/steps/'
[DEBUG   ] Server returned error 500: Internal Server Error (Internal Server Error

TracError: OSError: [Errno 12] Cannot allocate memory: '/usr/local/lib/python2.5')
[ERROR   ] HTTP Error 500: Internal Server Error
[INFO    ] Slave exited at 2009-12-15 13:57:00

However, this was a temporary issue on the server.

It also exits if I put the Windows machine to sleep and then wake it up, because its TCP connection may fail.

Ideally the slave would keep retrying periodically while the master is down, so that maintenance windows or server capacity problems don't kill the slaves.
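
A minimal sketch of the requested behaviour, not Bitten's actual code: the names `is_transient`, `run_with_retry` and `poll_once` are hypothetical, and the policy (retry on HTTP 5xx and on network-level errors, with capped exponential backoff) is an assumption about what "intermittent" should mean here.

```python
import time
import urllib.error


def is_transient(exc):
    """Assumed policy: server-side (5xx) and network-level errors are retryable."""
    if isinstance(exc, urllib.error.HTTPError):
        # 4xx responses indicate a client/config problem; retrying will not help.
        return exc.code >= 500
    # URLError, connection resets, DNS failures etc. are all OSError subclasses.
    return isinstance(exc, OSError)


def run_with_retry(poll_once, base_delay=5.0, max_delay=300.0, sleep=time.sleep):
    """Call poll_once() until it succeeds, sleeping between transient failures.

    poll_once is a hypothetical callable standing in for one
    "ask the master for a build and run it" cycle of the slave.
    """
    attempt = 0
    while True:
        try:
            return poll_once()
        except Exception as exc:
            if not is_transient(exc):
                raise  # permanent errors still stop the slave
            attempt += 1
            # Exponential backoff, capped so the slave keeps checking in.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

With something like this, the HTTP 500 in the log above would cause a pause and another attempt instead of "Slave exited".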

Attachments (0)

Change History (2)

comment:1 Changed 15 years ago by sam.hendley@…

+1. I have this issue when a test hangs and takes too long: when it finally completes, the session has expired and the slave exits. I think it should just go back and try another build.

On a related note, when the session times out the build should be considered failed as well. Currently I have to manually invalidate that build because it still appears to be running.

comment:2 Changed 15 years ago by anonymous

When booting my (VMware virtual) Debian machine, bitten-slave (which for testing runs on the same machine as Trac) sometimes starts before the IP address configuration has been set up, or while it is being changed by the system. Since bitten-slave quits on some errors, it cannot be run reliably as a daemon.

ERROR: <urlopen error (-5,'No address associated with hostname')> and then bitten-slave exits.

As a workaround I installed Monit to supervise just the bitten-slave process and restart it if needed. I don't feel secure when a networked application like bitten-slave simply quits on a temporary network error. I am therefore raising the priority to critical, as the program left no trace in the logs of what was happening when it couldn't handle the errors (and I spent about 5 hours trying to locate an error that happens only "sometimes", and not at all when run from the command line).
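
For readers without a process supervisor at hand, the restart-on-exit workaround can be approximated with a small wrapper. This is only a sketch of the general idea, not Monit and not part of Bitten; `supervise` and its parameters are hypothetical names, and the command line to run is left to the caller.

```python
import subprocess
import time


def supervise(cmd, restart_delay=10.0, max_restarts=None, sleep=time.sleep, log=print):
    """Run cmd, and start it again whenever it exits.

    Every exit is logged with its status, so an unexpected exit leaves a trace.
    max_restarts=None means "restart forever"; returns the last exit status
    once the restart budget is used up.
    """
    restarts = 0
    while True:
        status = subprocess.call(cmd)  # blocks until the child process exits
        log("process %r exited with status %s" % (cmd, status))
        if max_restarts is not None and restarts >= max_restarts:
            return status
        restarts += 1
        sleep(restart_delay)  # avoid a tight restart loop while the network is down
```

It would be called with the same argument list normally used to start bitten-slave. A wrapper like this only masks the problem; the slave handling temporary errors itself remains the real fix.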
