Opened 19 years ago
Closed 17 years ago
#100 closed defect (fixed)
When multiple slaves are processing the same builds, the master throws errors
| Reported by: | anonymous | Owned by: | cmlenz |
|---|---|---|---|
| Priority: | major | Milestone: | 0.6 |
| Component: | Build master | Version: | 0.5.1 |
| Keywords: | | Cc: | |
| Operating System: | | | |
Description
If you manage to get 2 slaves working on the same build (either through [95] or by invalidating the build while it's going) you'll get exceptions like the following:
    2006-01-22 22:31:00,759 [bitten.beep] ERROR: columns build, name are not unique
    Traceback (most recent call last):
      File "d:\Python23\lib\asyncore.py", line 69, in read
        obj.handle_read_event()
      File "d:\Python23\lib\asyncore.py", line 390, in handle_read_event
        self.handle_read()
      File "d:\Python23\lib\asynchat.py", line 136, in handle_read
        self.found_terminator()
      File "build\bdist.win32\egg\bitten\util\beep.py", line 278, in found_terminator
      File "build\bdist.win32\egg\bitten\util\beep.py", line 311, in _handle_frame
      File "build\bdist.win32\egg\bitten\util\beep.py", line 469, in handle_data_frame
      File "build\bdist.win32\egg\bitten\master.py", line 238, in handle_reply
      File "build\bdist.win32\egg\bitten\master.py", line 293, in _build_step_completed
      File "build\bdist.win32\egg\bitten\model.py", line 577, in insert
      File "d:\Python23\lib\site-packages\sqlite\main.py", line 255, in execute
        self.rs = self.con.db.execute(SQL % parms)
    IntegrityError: columns build, name are not unique
On every build step completion, the master should check whether the reporting slave is the one that is supposed to be doing the build, and throw out the result otherwise.
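For illustration, the failure can be reproduced in isolation. This is a minimal sketch, not Bitten's actual code: the table name `build_step`, the `slave` column, and the `record_step` helper are all hypothetical. Only the uniqueness of `(build, name)` is taken from the error message above. It uses the standard-library `sqlite3` module rather than the old `sqlite` package seen in the traceback.

```python
import sqlite3

# Hypothetical schema: only the uniqueness of (build, name) is taken
# from the error message; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE build_step ("
    " build INTEGER, name TEXT, slave TEXT,"
    " UNIQUE (build, name))"
)


def record_step(build_id, step_name, slave_name):
    """Store a completed step without checking which slave reported it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO build_step (build, name, slave) VALUES (?, ?, ?)",
            (build_id, step_name, slave_name),
        )


# The first slave reports the step; the insert succeeds.
record_step(1, "compile", "slave-a")

# A second slave working on the same build reports the same step.
try:
    record_step(1, "compile", "slave-b")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The second insert violates the uniqueness constraint, which is the same class of error the master logs when two slaves report the same step for one build.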
Attachments (1)
Change History (8)
comment:1 Changed 19 years ago by wwb2@…
comment:2 Changed 19 years ago by cmlenz
- Resolution set to duplicate
- Status changed from new to closed
This is the same issue reported in #95.
comment:3 Changed 19 years ago by cmlenz
Both reported by you :-P
Can you elaborate why you think this is a different issue?
comment:4 Changed 19 years ago by wwb2@…
Indeed. Even with a fix for #95, this still happens when you invalidate a build while it's running. The slave processing the invalidated build will continue to process it. If another slave (slave #2) comes by and picks up the build (starting it over), you have 2 slaves working on the same build.
When either one finishes a step, it reports that back to the master, which tries to write it to the database. The master doesn't check which one is the official slave and just writes it. Sometimes you just get weird output in the database (the summaries of the build contain both sets of steps interleaved in different orders), and sometimes you get the exception above.
I think the solution is to just check the name of the slave on each step and make sure it's the one the master thinks is responsible for the build.
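The proposed check could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Build` class and the `step_completed` function are hypothetical stand-ins and not the master's real model or handler.

```python
class Build:
    """Hypothetical stand-in for the master's record of one build."""

    def __init__(self, build_id, slave):
        self.id = build_id
        self.slave = slave  # name of the slave the master assigned
        self.steps = {}     # step name -> reported result


def step_completed(build, reporting_slave, step_name, result):
    """Store a step result only if it comes from the assigned slave.

    Returns True if the result was stored, False if it was discarded.
    """
    if reporting_slave != build.slave:
        # Stale report from a slave that no longer owns this build.
        return False
    build.steps[step_name] = result
    return True


# slave-a starts the build, then the build is invalidated and
# picked up again by slave-b.
build = Build(1, slave="slave-a")
build.slave = "slave-b"

accepted_stale = step_completed(build, "slave-a", "compile", "success")
accepted_current = step_completed(build, "slave-b", "compile", "success")
```

Here the report from `slave-a` is dropped and only the report from `slave-b` is stored, so the steps of the two runs can no longer interleave.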
comment:5 Changed 19 years ago by cmlenz
- Milestone set to 0.6
- Resolution duplicate deleted
- Status changed from closed to reopened
Okay, understood. Thanks for the explanation.
comment:6 Changed 17 years ago by anonymous
Is this situation handled with [520]?
comment:7 Changed 17 years ago by wbell
- Resolution set to fixed
- Status changed from reopened to closed
I'm not sure the transactionality of the database is quite right. I think you can still get 2 slaves building the same build, but this fixes the follow-on effects.
Adding myself so I get notified on a fix.
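On the transactionality concern in comment:7, one common way to keep two slaves from being handed the same build is to make the claim a single conditional update. The sketch below assumes a hypothetical `build` table with `status` and `slave` columns and a hypothetical `claim_build` helper. It is not what [520] does and does not reflect Bitten's actual schema.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE build (id INTEGER PRIMARY KEY, status TEXT, slave TEXT)"
)
conn.execute("INSERT INTO build (id, status, slave) VALUES (1, 'pending', NULL)")
conn.commit()


def claim_build(build_id, slave_name):
    """Try to assign a pending build to a slave in one statement.

    The WHERE clause only matches while the build is still pending,
    so at most one caller can see a row count of 1.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE build SET status = 'in-progress', slave = ? "
            "WHERE id = ? AND status = 'pending'",
            (slave_name, build_id),
        )
    return cur.rowcount == 1


first_claim = claim_build(1, "slave-a")
second_claim = claim_build(1, "slave-b")
```

Only the first claim succeeds. The second slave gets `False` back and would have to look for another pending build.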