Edgewall Software

Opened 15 years ago

Closed 12 years ago

#151 closed defect (duplicate)

Problem with non-ASCII characters in sh:exec output/error

Reported by: info@…
Owned by: cmlenz
Priority: major
Milestone:
Component: General
Version: dev
Keywords:
Cc:
Operating System: Windows

Description

When the output or error stream of a sh:exec step contains German umlaut characters in Latin encoding, I see a parse error exception traceback on the master (sorry, I did not save the message; the version is trunk-r378).

As a workaround, here is a modified _escape_text(text) in bitten/util/xmlio.py:

*** 25,32 ****
      """Escape special characters in the provided text so that it can be safely
      included in XML text nodes.
      """
!     return str(text).replace('&', '&amp;').replace('<', '&lt;') \
!         .replace('>', '&gt;')

  def _escape_attr(attr):
      """Escape special characters in the provided text so that it can be safely
--- 25,34 ----
      """Escape special characters in the provided text so that it can be safely
      included in XML text nodes.
      """
!     return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c
!                     for c in str(text).replace('&', '&amp;')
!                                       .replace('<', '&lt;')
!                                       .replace('>', '&gt;')])
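For readers on a current Python, the same idea (escape the XML special characters, then replace every character above code point 127 with a numeric character reference) can be sketched as follows. This is only an illustration, not the code in bitten/util/xmlio.py: the function name escape_text and the latin-1 default for byte input are assumptions.

```python
def escape_text(text, encoding="latin-1"):
    """Escape XML specials and turn non-ASCII characters into numeric references.

    Hypothetical helper: `encoding` is an assumed default used only when the
    input arrives as raw bytes (for example, captured process output).
    """
    if isinstance(text, bytes):
        text = text.decode(encoding, errors="replace")
    # '&' must be escaped first so the entities added below stay intact.
    text = str(text).replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    # Characters above 127 become decimal character references such as &#252;.
    return "".join("&#%d;" % ord(c) if ord(c) > 127 else c for c in text)
```

The result is pure ASCII, so it survives a transfer regardless of which encoding the receiving side assumes.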

Attachments (0)

Change History (7)

comment:1 follow-up: Changed 14 years ago by jeberger@…

See also tickets #232 and #243

comment:2 in reply to: ↑ 1 Changed 14 years ago by jeberger@…

Replying to jeberger@free.fr:

See also tickets #232 and #243

And #115.

comment:3 follow-up: Changed 13 years ago by anonymous

The workaround didn't work for me! It is better to take into account the specific code page in use. So the suggested patch should be this:

    return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c for c in str(text).decode('cp437') \
                    .replace('&', '&amp;').replace('<', '&lt;') \
                    .replace('>', '&gt;')])

comment:4 in reply to: ↑ 3 ; follow-up: Changed 13 years ago by Jérôme M. Berger <jeberger@…>

Replying to anonymous:

The workaround didn't work for me! It is better to take into account the specific code page in use. So the suggested patch should be this:

    return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c for c in str(text).decode('cp437') \
                    .replace('&', '&amp;').replace('<', '&lt;') \
                    .replace('>', '&gt;')])

This assumes that the computer uses the cp437 code page and won't work otherwise. It also won't work if there are non-printable characters in the output (like ANSI escape codes). The proper fix is given in ticket #423 and consists of using cgi.escape along with a call to translate in order to catch any character that cgi.escape misses.
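A rough sketch of the escape-plus-translate approach, written for current Python, might look like this. It is not the patch attached to the other ticket: cgi.escape has since been removed from the standard library, so html.escape stands in for it here, and the names sanitize_output and _INVALID_XML are made up for illustration. Unlike a filter that strips everything outside 7-bit ASCII, this version only drops control characters that XML 1.0 forbids.

```python
from html import escape

# Control characters not allowed in XML 1.0 text nodes: everything below
# 0x20 except tab, newline and carriage return. Mapping to None deletes them.
_INVALID_XML = {cp: None for cp in range(0x20) if cp not in (0x09, 0x0A, 0x0D)}


def sanitize_output(text):
    """Escape &, < and >, then drop control characters such as ESC (0x1b)."""
    escaped = escape(text, quote=False)  # stand-in for the old cgi.escape
    return escaped.translate(_INVALID_XML)
```

With ANSI colour codes in the input, only the ESC byte is removed; the printable remainder of the sequence (for example `[31m`) stays in the text.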

comment:5 in reply to: ↑ 4 Changed 13 years ago by Jérôme M. Berger <jeberger@…>

Replying to Jérôme M. Berger <jeberger@…>:

... ticket #423 ...

Sorry, should be #243

comment:6 Changed 13 years ago by anonymous

Thanks Jerome! I wonder why your patch hasn't gone into the source tree. I think it's important, because the "Bad Request" errors make bitten unusable on systems with umlauts etc.

comment:7 Changed 12 years ago by osimons

  • Resolution set to duplicate
  • Status changed from new to closed

The workaround is in trunk as of [569] (#243) - it looks like it just strips anything not inside 7-bit ASCII.

That should make things work for now, but a final solution would be to support full Unicode/UTF-8 for transfers. I'll close this ticket as a duplicate of #115 (or really any of the other mentioned tickets).
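What full UTF-8 support for transfers could look like, as a minimal sketch using the standard library's ElementTree rather than bitten's own xmlio module: decode the captured bytes once on the sending side, serialise the XML as UTF-8, and let the parser on the receiving side decode it again. The function names, the message element and the cp1252 default are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_message(output, encoding="cp1252"):
    """Wrap raw command output (bytes) in an XML element serialised as UTF-8.

    `encoding` is the assumed code page of the machine that ran the command.
    """
    text = output.decode(encoding, errors="replace")
    elem = ET.Element("message")
    elem.text = text  # ElementTree escapes &, < and > on serialisation
    return ET.tostring(elem, encoding="utf-8")


def read_message(payload):
    """Parse the UTF-8 payload and return the original text."""
    return ET.fromstring(payload).text
```

A round trip such as `read_message(build_message("Übung <ok>".encode("cp1252")))` gives back the original string, umlaut included. Control characters would still need to be filtered before serialisation.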
