Opened 18 years ago

Closed 17 years ago

Last modified 14 years ago

#116 closed defect (wontfix)

on windows (cygwin), bitten-slave sometimes gets into tight loop

Reported by: joel@…
Owned by: cmlenz
Priority: critical
Milestone: 0.6
Component: Build slave
Version: 0.5.3
Keywords:
Cc:
Operating System: Windows

Description

I do not yet know what causes this, but sometimes bitten-slave wedges itself on Windows, running in a tight loop and printing this output over and over again:

[ERROR   ] (113, 'Software caused connection abort')
Traceback (most recent call last):
  File "/tmp/python.572/usr/lib/python2.4/asynchat.py", line 89, in handle_read
    data = self.recv (self.ac_in_buffer_size)
  File "/tmp/python.572/usr/lib/python2.4/asyncore.py", line 343, in recv
    data = self.socket.recv(buffer_size)
error: (113, 'Software caused connection abort')
[ERROR   ] (113, 'Software caused connection abort')
Traceback (most recent call last):
  File "/tmp/python.572/usr/lib/python2.4/asynchat.py", line 219, in initiate_send
    num_sent = self.send (self.ac_out_buffer[:obs])
  File "/tmp/python.572/usr/lib/python2.4/asyncore.py", line 332, in send
    result = self.socket.send(data)
error: (113, 'Software caused connection abort')

The fact that this exception occurs is not necessarily a problem, but bitten-slave getting itself into a tight loop definitely is. Maybe it should just exit when this kind of error happens.

Attachments (2)

beep.py.diff (406 bytes) - added by joel@… 18 years ago.
patch to fix the infinite loop on windows
beep.py.2.diff (667 bytes) - added by jabs@… 18 years ago.
patch for tight loop with windows and python 2.4

Change History (7)

comment:1 Changed 18 years ago by joel@…

Update: I found a way to reproduce this problem easily. Simply running bitten-slave while the master is not running will cause it.

Comparing this behavior to bitten-slave on Unix systems, it seems that Windows gives a different socket error than Unix does when the connection is broken. On Unix, when I run bitten-slave without the master, I get "connection refused" in the recv function and "broken pipe" in the send function, and the slave properly aborts.
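Since the two platforms report the broken connection differently, one option is to match on symbolic errno names rather than on numeric codes or message text. A minimal sketch, assuming a hypothetical helper `is_fatal_disconnect` (not part of Bitten) and an illustrative set of errno values that should end the session:

```python
import errno

# Assumption: each of these errno values means the peer is gone and the
# slave should stop instead of retrying. The exact set is illustrative.
FATAL_ERRNOS = frozenset([
    errno.ECONNABORTED,  # "Software caused connection abort"
    errno.ECONNREFUSED,  # "Connection refused"
    errno.ECONNRESET,    # "Connection reset by peer"
    errno.EPIPE,         # "Broken pipe"
])


def is_fatal_disconnect(exc):
    """Return True if a socket error indicates a dead connection."""
    code = getattr(exc, "errno", None)
    if code is None and getattr(exc, "args", None):
        # Older socket.error objects carry (code, message) in args.
        code = exc.args[0]
    return code in FATAL_ERRNOS
```

A caller could check this in its error handler and exit instead of returning to the poll loop. The numeric value of each constant differs between platforms, which is why the sketch never compares against a literal such as 113.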

Changed 18 years ago by joel@…

patch to fix the infinite loop on windows

comment:2 Changed 18 years ago by cmlenz

  • Milestone set to 0.6
  • Status changed from new to assigned

Thanks for the patch, much appreciated!

I think this is the same issue reported in #74, so I'll close that one as duplicate (since this one has the patch ;-) ).

comment:3 Changed 18 years ago by jabs@…

I found the same problem and investigated a bit, since the patch does not fix it for me. The difference between #116 and #74 is the message displayed, which stems from different execution paths. The problem seems to be in asyncore and Python 2.4.

asyncore in 2.4 uses nonblocking sockets, which on Windows (no Cygwin here!) results in the following situation: connect to a nonexistent server returns immediately, and a subsequent select (as done by poll) also returns immediately, with our socket in the except_fd list. This tries to call handle_expt on our object. Since Beep Session does not implement that, a warning is written: "warning: unhandled exception". If handle_expt does exist and just raises again, handle_error is called, which in its default implementation (by asyncore) prints the stack trace above. After that, poll is immediately called again, which leads to the tight loop.

The patch above amends handle_error to raise, which would be OK, but it does not get called at all in my tests. In fact, if I try to override handle_expt in Beep.Session, that method does not get called either. (This may be because I know little Python and messed up somewhere ;-)

Btw, Python 2.3 seems to work OK: it seems to use blocking sockets in asyncore (correct me if wrong) and silently ignores the except_fd set after select (it still fills it before calling???). Most importantly, no loop is entered; the program just seems to block (on connect???).

Summary: we should somehow implement handle_expt(self) in Eventloop (raise/exit). Someone should kick the asyncore devs for assuming nonblocking sockets are good for everyone. (I wonder why one can give a timeout param if it's overridden/unused by nonblocking sockets; at least that's the way BSD select works, IIRC.)
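The loop described in this comment can be modelled without asyncore. The following is only a sketch of the control flow, not the real Bitten or asyncore code: `Session`, `UnhandledSession`, `run_loop` and the `poll` callable are all hypothetical names, and `max_iterations` exists solely so the demonstration cannot spin forever.

```python
class Session(object):
    """Stand-in for a connection object; not the real Bitten class."""

    def __init__(self):
        self.closed = False

    def handle_expt(self):
        # Stop instead of letting the loop poll the broken socket again.
        self.closed = True


class UnhandledSession(object):
    """Models a session that never reacts to the exceptional condition."""
    closed = False


def run_loop(session, poll, max_iterations=1000):
    """Run until the session closes; return the number of poll calls.

    `poll` is a hypothetical callable that returns True when the socket
    is reported in the exceptional set.
    """
    iterations = 0
    while not session.closed and iterations < max_iterations:
        iterations += 1
        if poll():
            handler = getattr(session, "handle_expt", None)
            if handler is not None:
                handler()
    return iterations
```

With a `poll` that always reports the exceptional condition, `run_loop(Session(), poll)` stops after one pass, while `run_loop(UnhandledSession(), poll, max_iterations=50)` uses up all 50 passes, which is the tight loop in miniature.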

Changed 18 years ago by jabs@…

patch for tight loop with windows and python 2.4

comment:4 Changed 18 years ago by jabs@…

The above patch works for me. I have specifically tested for win32, to protect other uses of handle_expt. I did not query the exception type, since it is not set to any meaningful value.
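The attached diff is not reproduced in this ticket, so the following is only a guess at the shape of such a platform guard. `ConnectFailed` and `handle_expt_guarded` are hypothetical names, and the `platform` parameter is there only so the behaviour can be exercised on any machine:

```python
import sys


class ConnectFailed(Exception):
    """Hypothetical exception used to leave the event loop."""


def handle_expt_guarded(platform=None):
    """Raise on win32 only; do nothing on other platforms."""
    if platform is None:
        platform = sys.platform
    if platform == "win32":
        # The exception type is not inspected, matching the comment above.
        raise ConnectFailed("socket reported an exceptional condition")
```

Restricting the raise to win32 leaves other platforms, where an exceptional condition may have a different meaning, on their existing code path.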

comment:5 Changed 17 years ago by cmlenz

  • Resolution set to wontfix
  • Status changed from assigned to closed

BEEP is going away in the next release, being replaced by Master Slave Protocol Http.
