Edgewall Software
Modify

Opened 11 years ago

Closed 9 years ago

#243 closed defect (fixed)

Failure if there are special characters in the build log.

Reported by: jeberger@… Owned by: osimons
Priority: blocker Milestone: 0.6
Component: Build slave Version: 0.5.3
Keywords: Cc: hodgestar
Operating System: Windows

Description

Problem description

If the build log contains special characters (like ANSI color escape sequences for example), the HTTP request used by the slave to notify the master of the completion of the step fails.

Proposed fix

The attached patch removes all non-printable characters from the log before sending it (another solution would be to replace these characters by question marks).

Attachments (4)

bitten-escape-chars.patch (1016 bytes) - added by jeberger@… 11 years ago.
Patch against trunk revision 519
t243-from_utf8_fallback-r908.diff (969 bytes) - added by osimons 9 years ago.
Adds 'replace' fallback to decode (but could also use 'ignore').
slave-tests-r911.diff (4.5 KB) - added by hodgestar 9 years ago.
Tests for non-utf8 characters in build logs
slave-tests-r913.diff (3.4 KB) - added by hodgestar 9 years ago.
Updated slave tests that don't touch Build Slave? itself.

Download all attachments as: .zip

Change History (25)

Changed 11 years ago by jeberger@…

Patch against trunk revision 519

comment:1 follow-up: Changed 11 years ago by jeberger@…

See also tickets #151 and #232

comment:2 in reply to: ↑ 1 Changed 11 years ago by anonymous

Replying to jeberger@free.fr:

See also tickets #151 and #232

And #115.

comment:3 Changed 11 years ago by jeberger@…

  • Summary changed from Failure if there are special characters in the build log. to [PATCH] Failure if there are special characters in the build log.

comment:4 Changed 11 years ago by wbell

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in [569]

comment:5 Changed 10 years ago by osimons

#355 closed as duplicate.

comment:6 Changed 9 years ago by J. Berger <jeberger@…>

  • Operating System changed from Linux to Windows
  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Summary changed from [PATCH] Failure if there are special characters in the build log. to Failure if there are special characters in the build log.

This was broken again in revision [701]. For example iso 8859-1 character 0xE9 ("é") causes the build to fail:

[ERROR   ] Internal error in build step 'build'
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\bitten\slave.py", line 335, in _execute_step
    output
  File "C:\Python25\Lib\site-packages\bitten\util\xmlio.py", line 85, in __getitem__
    self.append(node)
  File "C:\Python25\Lib\site-packages\bitten\util\xmlio.py", line 104, in append
    self.children.append(unicode(node))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 80: ordinal not in range(128)
[WARNING ] Stopping build due to failure

comment:7 follow-up: Changed 9 years ago by J. Berger <jeberger@…>

What the...?!

I tried to work around the problem by replacing "unicode" with "_escape_text" or "str" on line 104 of xmlio.py. This resulted in the slave not being able to run the compilation commands (through "sh:exec"): the error was "executable not found". Why is it able to find the executables with "unicode" and not with "str"? There are no non-ASCII characters in my command lines and the commands displayed in the console are correct. Moreover I only modified the slave, not the master and I thought xmlio.Fragment.append was only used when creating an XML fragment for transmission.

I would have understood a communication error due to the "é" character in the XML, but I expected the build itself to proceed in the same way for both cases...

comment:8 in reply to: ↑ 7 ; follow-up: Changed 9 years ago by J. Berger <jeberger@…>

Replying to J. Berger <jeberger@…>:

I would have understood a communication error due to the "é" character in the XML, but I expected the build itself to proceed in the same way for both cases...

My bad. The build wasn't proceeding in either case because of changeset [784]. I was using <sh:exec executable="foo" .../> and had to replace it with <sh:exec file="foo" .../>.

Moreover, I fixed the problem with non-ascii characters by adding the errors = "replace" argument to the call to unicode on line 104 in utils/xmlio.py:

self.children.append(unicode(node, errors='replace'))

comment:9 Changed 9 years ago by hodgestar

  • Priority changed from major to blocker

We're planning to look at this issue at the Bitten sprint on 24 April.

comment:10 Changed 9 years ago by anatoly techtonik <techtonik@…>

The replacement strategy is wrong on my opinion. All files should be transmitted unaltered in binary form, because they can be later analyzed by other tools. If Slave or Master need to analyze the log and fails - adjustments should be made to the corresponding analyzer component.

comment:11 Changed 9 years ago by anatoly techtonik <techtonik@…>

Patch is good, but we also need test to move this further.

comment:12 Changed 9 years ago by osimons

This needs to be part of 0.6. Need a breaking unittest + a fix. Anyone up for the challenge?

comment:13 Changed 9 years ago by krigstask

Works for me here (on Linux), 0.6b2.

comment:14 Changed 9 years ago by osimons

  • Milestone 0.6 deleted

Works for me too, and no one else has reported such issues for a long time. I have a feeling this is what I fixed in [709].

I'm going to take this out of development schedule, but if someone can provide us with some way to replicate the issue with current trunk (even like a failing unittest) it will be bumped back on top of todo list.

comment:15 in reply to: ↑ 8 Changed 9 years ago by J. Berger <jeberger@…>

I've been unable to create a unit test that reproduces exactly the same exception as in comment:6. However, the following test fails with the current trunk [903]:

    def test_append(self):
        fragment = xmlio.Fragment()
        fragment.append (xmlio.Element ('message', level='info')['\xe9'])

The exception thrown is:

Traceback (most recent call last):
  File "d:\temp\trac\bitten\bitten\util\tests\xmlio.py", line 73, in test_append
    fragment.append (xmlio.Element ('message', level='info')['\xe9'])
  File "d:\temp\trac\bitten\bitten\util\xmlio.py", line 85, in __getitem__
    self.append(node)
  File "d:\temp\trac\bitten\bitten\util\xmlio.py", line 102, in append
    self.children.append(_from_utf8(node))
  File "d:\temp\trac\bitten\bitten\util\xmlio.py", line 32, in _from_utf8
    return text.decode('utf-8')
  File "c:\JBerger\Prog\Python26\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 0: unexpected end of data

Unfortunately, the fix I suggested in comment:8 doesn't work in this case. I'll need to reproduce it is a real environment to see if I can track it down further, but I don't have the time right now.

comment:16 Changed 9 years ago by osimons

  • Milestone set to 0.6
  • Owner changed from cmlenz to osimons
  • Status changed from reopened to new

I'm thinking we should just use a fallback-stragegy for conversion too. I believe the cases should be much rarer now, but we cannot really know the details of all the various bytes that any tool may output. Better safe than sorry, really...

I've made a patch that uses 'replace' strategy, but I'd be fine with 'ignore' too. I want to add this to 0.6.

Updated patch and test case follows.

Changed 9 years ago by osimons

Adds 'replace' fallback to decode (but could also use 'ignore').

comment:17 Changed 9 years ago by hodgestar

@osimons: I don't think patching _from_utf8() is the correct thing to do here. It's used in a couple of places where I think it's harmful to silently replace or ignore characters (e.g. XML tag and attribute names).

I'm tempted to suggest that users of xmlio.Element and friends are the ones who should ensure the data is either UTF-8 bytes or unicode, not xmlio.Element itself. I'm just not sure how practical that stance is given the way Bitten uses xmlio.

I will shortly attach a new slave test that I think shows that this particular bug is now gone.

Changed 9 years ago by hodgestar

Tests for non-utf8 characters in build logs

comment:18 Changed 9 years ago by osimons

  • Cc hodgestar added

Oh. Thanks for the test - that looks really useful. And, as you say it shows why perhaps my stratey may not be healthy. +1 on adding the tests.

Instead of adding supporting code to slave, could we for unit test perhaps just use the dump_reports option and tweak it slightly so that instead of just doing print output it calls self._do_dump_reports(output) for printing so that we in tests can just override this method and collect the string as we need it? Slightly less verbose in the patch and without adding new features just for unit tests.

Changed 9 years ago by hodgestar

Updated slave tests that don't touch Build Slave? itself.

comment:19 Changed 9 years ago by hodgestar

New tests attached. These don't touch BuildSlave itself at the expense of some hackery to gather the results of the build.

@osimons: I'm not really happy about using the dump_report output because I'm worried that it's the actual construction of the XML that exposes most of the problems related to special characters.

comment:20 Changed 9 years ago by osimons

Tests look good to me. The trickery in tests is well within accepted boundaries... :-) => +1.

comment:21 Changed 9 years ago by hodgestar

  • Resolution set to fixed
  • Status changed from new to closed

Tests committed in r915 and backported to 0.6.x in r916.

Since we now have a test that I think demonstrates that the issue reported in comment 6 has been solved at some point before the test was committed, I'm closing this as fixed.

Add Comment

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain osimons.
The resolution will be deleted. Next status will be 'reopened'.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.