Edgewall Software

Opened 15 years ago

Last modified 12 years ago

#380 closed defect

When the build for *one* revision hangs, the slave hangs forever, and builds aren't triggered anymore — at Version 1

Reported by: edgewall.org@… Owned by: cmlenz
Priority: critical Milestone: 0.6
Component: General Version: dev
Keywords: Cc: felix.schwarz@…
Operating System: BSD

Description (last modified by dfraser)

Someone on our team scr*wed up and committed some tests that hang the build. We fixed them a few commits later, but now the slaves hang whenever they reach one of the broken revisions, and we have to restart them by hand every time there's a new revision to build...

Unchecking "Build all revisions" doesn't seem to change any of this behavior. Actually, this option isn't really doing what I would expect: it should be called "Trigger a build for every commit, even if it is not on the path for the configuration" or something like that. It would be good to have an "Only build latest revision" option.

The problem is worse for me because I build two different configurations, and Bitten tries to build *all the revisions* for one of the configurations before building the other. Since the slaves hang at some of the revisions for the first configuration, the second configuration is never built, for any revision. I cannot get this configuration built at all.

I'm currently trying to patch 'queue.py', to simply skip the builds that are causing trouble:

    diff -u queue.py queue.py-original
    --- queue.py 2009-04-02 09:02:12.000000000 -0700
    +++ queue.py-original 2009-04-02 09:01:00.000000000 -0700
    @@ -221,9 +221,6 @@
             platforms = []
             for platform, rev, build in collect_changes(repos, config, db):
    -            if rev > 1710 and rev < 1726:
    -                continue
    -
                 if not self.build_all and platform.id in platforms:
                     # We've seen this platform already, so these are older
                     # builds that should only be built if built_all=True

I think it will work, but there may be a cleaner way of doing it?

The *slave* code should have a way to stop a build if it doesn't finish before the timeout defined in the admin (currently this timeout is only used on the master). I've looked at the code in the slave, and it doesn't seem too difficult to implement a control thread that would stop the build. However, I'm not familiar enough with threading in Python to do it...
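For illustration, the control-thread idea could look roughly like the sketch below. It is not Bitten's slave code: `run_with_timeout` is a hypothetical helper, and it assumes the build step runs as a child process that the slave is allowed to kill. A `threading.Timer` plays the role of the control thread.

```python
import subprocess
import sys
import threading


def run_with_timeout(argv, timeout):
    """Run argv as a child process and kill it after `timeout` seconds.

    Hypothetical helper, not part of Bitten.
    Returns a (returncode, timed_out) tuple.
    """
    proc = subprocess.Popen(argv)
    timed_out = threading.Event()

    def kill_if_running():
        # Runs in the timer thread once the timeout has elapsed.
        if proc.poll() is None:
            timed_out.set()
            proc.kill()

    timer = threading.Timer(timeout, kill_if_running)
    timer.start()
    try:
        returncode = proc.wait()
    finally:
        # Stop the timer if the build finished on its own.
        timer.cancel()
    return returncode, timed_out.is_set()


if __name__ == "__main__":
    # Stand-in for a hanging test run: sleeps far longer than the limit.
    hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(hanging, timeout=1.0))
```

Note that `proc.kill()` only stops the direct child. If the build spawns further processes, those would probably need to be handled separately, for example through a process group.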

Change History (1)

comment:1 Changed 15 years ago by dfraser

  • Description modified (diff)

Agreed, I've had lots of trouble with this before. Useful things would be:

  • Timeout limit for the slave
      • Even nicer, remember the previous execution times and adjust the timeout limit to be N*previous-longest-successful-execution
  • The ability to mark certain revisions as not-for-testing (perhaps only on certain platforms)
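
The two suggestions above could be sketched as follows. All names here (`adaptive_timeout`, `should_skip`, the `skipped` mapping) are hypothetical and are not part of Bitten; the default timeout and the factor N are placeholder values.

```python
def adaptive_timeout(successful_durations, factor=3.0, default=3600.0):
    """Return factor * longest previous successful build time, in seconds.

    Falls back to `default` when no successful build has been recorded yet.
    """
    if not successful_durations:
        return default
    return factor * max(successful_durations)


def should_skip(rev, platform, skipped):
    """Tell whether `rev` is marked as not-for-testing on `platform`.

    `skipped` maps a revision to a set of platform names, or to None
    when the revision should be skipped on every platform.
    """
    if rev not in skipped:
        return False
    platforms = skipped[rev]
    return platforms is None or platform in platforms


# Example: skip r1711 everywhere and r1712 only on BSD.
skipped = {1711: None, 1712: {"BSD"}}
print(adaptive_timeout([10.0, 40.0, 25.0], factor=2.0))  # 80.0
print(should_skip(1712, "Linux", skipped))               # False
```

A per-revision mapping like this would replace the hard-coded revision range in the 'queue.py' patch from the description, at the cost of needing somewhere to store it.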