Newsgroups: comp.lang.python
Date: Sun, 11 Nov 2012 05:42:35 -0800 (PST)
Subject: Re: Printing characters outside of the ASCII range
From: danielk

On Friday, November 9, 2012 5:11:12 PM UTC-5, Ian wrote:
> On Fri, Nov 9, 2012 at 2:46 PM, danielk wrote:
> > D:\home\python>pytest.py
> > Traceback (most recent call last):
> >   File "D:\home\python\pytest.py", line 1, in <module>
> >     print(chr(253).decode('latin1'))
> > AttributeError: 'str' object has no attribute 'decode'
> >
> > Do I need to import something?
>
> Ramit should have written "encode", not "decode". But the above still
> would not work, because chr(253) gives you the character at *Unicode*
> code point 253, not the character with CP437 ordinal 253 that your
> terminal can actually print.
> The Unicode equivalents of those
> characters are:
>
> >>> list(map(ord, bytes([252, 253, 254]).decode('cp437')))
> [8319, 178, 9632]
>
> So these are what you would need to encode to CP437 for printing.
>
> >>> print(chr(8319))
> ⁿ
> >>> print(chr(178))
> ²
> >>> print(chr(9632))
> ■
>
> That's probably not the way you want to go about printing them,
> though, unless you mean to be inserting them manually. Is the data
> you get from your database a string, or a bytes object? If the
> former, just do:
>
> print(data.encode('cp437'))
>
> If the latter, then it should be printable as is, unless it is in some
> other encoding than CP437.

Ian's solution gives me what I need (thanks Ian!). But I notice a difference between '__str__' and '__repr__'.

class Pytest(str):
    def __init__(self, data = None):
        if data == None: data = ""
        self.data = data
    def __repr__(self):
        return (self.data).encode('cp437')

>>> import pytest
>>> p = pytest.Pytest("abc" + chr(178) + "def")
>>> print(p)
abc²def
>>> print(p.data)
abc²def
>>> print(type(p.data))
<class 'str'>

If I change '__repr__' to '__str__' then I get:

>>> import pytest
>>> p = pytest.Pytest("abc" + chr(178) + "def")
>>> print(p)
Traceback (most recent call last):
  File "", line 1, in <module>
TypeError: __str__ returned non-string (type bytes)

Why is '__str__' behaving differently than '__repr__'? I'd like to be able to use '__str__' because the result is not executable code, it's just a string of the record contents.

The documentation for the 'encode' method says: "Return an encoded version of the string as a bytes object." Yet when I displayed the type, it said it was <class 'str'>, which I'm taking to be 'type string', or can a 'string' also be 'a string of bytes'?

I'm trying to get my head around all this codecs/unicode stuff.
I haven't had to deal with it until now but I'm determined to not let it get the best of me :-)

My goals are:

a) display a 'raw' database record with the delimiters intact, and
b) allow the client to create a string that represents a database record. So, if they know the record format then they should be able to create a database object like it does above, but with the chr(25x) characters. I will handle the conversion of the chr(25x) characters internally.
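
A minimal sketch of how the two goals might fit together under Python 3, not a definitive implementation. It assumes the record is kept as a str and encoded to CP437 only on output; the class name Record and the method to_cp437 are hypothetical and not part of the code above. In Python 3 both '__str__' and '__repr__' must return a str, and print() goes through '__str__'. A str subclass that overrides only '__repr__' still prints through the inherited str.__str__, which would explain why the first session showed no error.

```python
class Record:
    """Hypothetical stand-in for the Pytest class above (not a str subclass)."""

    def __init__(self, data=None):
        # Keep the record as text (str); nothing is encoded here.
        self.data = "" if data is None else data

    def __str__(self):
        # Must return str. Returning bytes raises
        # "TypeError: __str__ returned non-string (type bytes)".
        return self.data

    def __repr__(self):
        # Must also return str.
        return "Record({!r})".format(self.data)

    def to_cp437(self):
        # Hypothetical helper: encode only at the output boundary.
        # The result is a bytes object, not a str.
        return self.data.encode("cp437")


rec = Record("abc" + chr(178) + "def")
text = str(rec)          # still a str
raw = rec.to_cp437()     # a bytes object

print(type(text).__name__)  # str
print(type(raw).__name__)   # bytes
print(list(raw))            # [97, 98, 99, 253, 100, 101, 102]
```

Here str(rec) stays a str for display, while to_cp437() yields the bytes in which chr(178) has become CP437 ordinal 253, matching the mapping in Ian's reply.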