Opened 18 years ago
Closed 16 years ago
#151 closed defect (duplicate)
problem with non ascii characters in sh:exec output/error
| Reported by: | info@… | Owned by: | cmlenz |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | General | Version: | dev |
| Keywords: | | Cc: | |
| Operating System: | Windows | | |
Description
When the output or error of a sh:exec step contains German umlaut characters in latin encoding, I see a parse error exception traceback on the master (sorry, I did not save the message; the version is trunk-r378).
As a workaround, I modified _escape_text(text) in bitten/util/xmlio.py:
*** 25,32 ****
      """Escape special characters in the provided text so that it can be safely
      included in XML text nodes.
      """
!     return str(text).replace('&', '&amp;').replace('<', '&lt;') \
!                     .replace('>', '&gt;')

  def _escape_attr(attr):
      """Escape special characters in the provided text so that it can be safely
--- 25,34 ----
      """Escape special characters in the provided text so that it can be safely
      included in XML text nodes.
      """
!     return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c
!                     for c in str(text).replace('&', '&amp;')
!                                       .replace('<', '&lt;')
!                                       .replace('>', '&gt;')])
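For readers on a current interpreter, the idea behind this workaround (escape the XML special characters, then replace every non-ASCII character with a numeric character reference) can be sketched in Python 3 as below. This is only an illustration, not Bitten code: the name `escape_text` is hypothetical, and the input is assumed to be text that has already been decoded.

```python
from xml.sax.saxutils import escape


def escape_text(text):
    """Escape &, < and >, then turn non-ASCII characters into &#NNN; references.

    Hypothetical helper for illustration; not the function from bitten/util/xmlio.py.
    """
    escaped = escape(str(text))
    # 'xmlcharrefreplace' substitutes a numeric character reference for every
    # character that cannot be represented in ASCII.
    return escaped.encode('ascii', 'xmlcharrefreplace').decode('ascii')
```

The result is pure ASCII, so it survives a transport that cannot handle other encodings, and an XML parser restores the original characters.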
Attachments (0)
Change History (7)
comment:1 follow-up: ↓ 2 Changed 17 years ago by jeberger@…
comment:2 in reply to: ↑ 1 Changed 17 years ago by jeberger@…
comment:3 follow-up: ↓ 4 Changed 16 years ago by anonymous
The workaround didn't work for me! It is better to take into account the specific codepage in use, so the suggested patch should be this:
    return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c for c in str(text).decode('cp437') \
        .replace('&', '&amp;').replace('<', '&lt;') \
        .replace('>', '&gt;')])
comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 16 years ago by Jérôme M. Berger <jeberger@…>
Replying to anonymous:
The workaround didn't work for me! It is better to take into account the specific codepage in use, so the suggested patch should be this:
    return ''.join([(ord(c) > 127) and ('&#%d;' % ord(c)) or c for c in str(text).decode('cp437') \
        .replace('&', '&amp;').replace('<', '&lt;') \
        .replace('>', '&gt;')])
This assumes that the computer uses the cp437 codepage and won't work otherwise. It also won't work if the output contains non-printable characters (such as ANSI escape codes). The proper fix is given in ticket #423: it uses cgi.escape together with a call to translate in order to catch any character that cgi.escape misses.
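The patch from #423 is not reproduced in this ticket, so the following is only one possible reading of the "escape plus translate" approach, written for Python 3. It is a sketch under assumptions: `html.escape` stands in for `cgi.escape` (which no longer exists in current Python), the helper name `escape_output` and the translation table are hypothetical, and the codepage is passed in explicitly rather than hard-coded.

```python
import html

# XML 1.0 forbids control characters below 0x20 except tab, LF and CR.
# Mapping a code point to None makes str.translate() delete it.
_XML_ILLEGAL = {cp: None for cp in range(0x20) if cp not in (0x09, 0x0A, 0x0D)}


def escape_output(raw, encoding):
    """Decode raw command output, drop illegal control characters, escape XML.

    Hypothetical helper; `encoding` is whatever codepage the build slave uses.
    """
    text = raw.decode(encoding, errors='replace')
    text = text.translate(_XML_ILLEGAL)
    return html.escape(text, quote=False)
```

For example, `escape_output(b'\x1b[31mf\x81r <x>\n', 'cp437')` removes the ESC byte of the ANSI sequence and decodes `\x81` as "ü". The result still contains non-ASCII characters, so this sketch only helps if the transfer itself is encoding-safe.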
comment:5 in reply to: ↑ 4 Changed 16 years ago by Jérôme M. Berger <jeberger@…>
comment:6 Changed 16 years ago by anonymous
Thanks Jerome! I wonder why your patch hasn't gone into the source tree. I think it's important, because the "Bad Request" errors make bitten unusable on systems with umlauts etc.
comment:7 Changed 16 years ago by osimons
- Resolution set to duplicate
- Status changed from new to closed
The workaround is in trunk as of [569] (#243) - it looks like it just strips anything outside 7-bit ASCII.
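The changeset itself is not shown here, so as a rough illustration of what "stripping anything outside 7-bit ASCII" means, a Python 3 sketch could look like this. The function name `strip_non_ascii` is hypothetical and the code is not taken from [569].

```python
def strip_non_ascii(text):
    """Drop every character that is not 7-bit ASCII (lossy by design)."""
    # errors='ignore' silently discards characters that ASCII cannot represent.
    return text.encode('ascii', 'ignore').decode('ascii')
```

Note that this loses information: umlauts disappear from the log instead of being transferred.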
That should make things work for now, but a final solution would be to support full unicode/utf-8 for transfers. I'll close this ticket as a duplicate of #115 (or really any of the other mentioned tickets).
See also tickets #232 and #243
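To illustrate what full Unicode/UTF-8 support for transfers could look like, here is a minimal Python 3 sketch that serializes a log message as UTF-8 encoded XML and parses it back without loss. It is an assumption-laden example, not Bitten's protocol: the element name `message` and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def serialize_log(message):
    """Wrap a text message in a (hypothetical) <message> element, as UTF-8 bytes."""
    elem = ET.Element('message')
    elem.text = message
    # ElementTree escapes &, < and > itself and encodes the document as UTF-8.
    return ET.tostring(elem, encoding='utf-8')


def parse_log(payload):
    """Parse the bytes produced by serialize_log() and return the original text."""
    return ET.fromstring(payload).text
```

With this approach neither side has to guess a codepage for the transfer; the sender decodes command output once and everything after that is Unicode.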