| Started by | Stephen Tucker <stephen_tucker@sil.org> |
|---|---|
| First post | 2013-10-11 09:16 +0100 |
| Last post | 2013-10-11 17:06 +0000 |
| Articles | 2 — 2 participants |
Unicode Objects in Tuples Stephen Tucker <stephen_tucker@sil.org> - 2013-10-11 09:16 +0100
Re: Unicode Objects in Tuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-10-11 17:06 +0000
| From | Stephen Tucker <stephen_tucker@sil.org> |
|---|---|
| Date | 2013-10-11 09:16 +0100 |
| Subject | Unicode Objects in Tuples |
| Message-ID | <mailman.989.1381479460.18130.python-list@python.org> |
I am using IDLE, Python 2.7.2 on Windows 7, 64-bit.
I have four questions:
1. Why is it that
print unicode_object
displays non-ASCII characters in the unicode object correctly, whereas
print (unicode_object, another_unicode_object)
displays non-ASCII characters in the unicode objects as escape sequences
(as repr() does)?
2. Given that this is actually *deliberately* the case (which I, at the
moment, am finding difficult to accept), what is the neatest (that is, the
most Pythonic) way to get non-ASCII characters in unicode objects in tuples
displayed correctly?
3. A similar thing happens when I write such objects and tuples to a file
opened by
codecs.open ( ..., "utf-8")
I have also found that, even though I use write to send the text to the
file, unicode objects not in tuples get their non-ASCII characters sent to
the file correctly, whereas, unicode objects in tuples get their characters
sent to the file as escape sequences. Why is this the case?
4. As with question 2 above, I ask here also: What is the neatest way to get
round this?
Stephen Tucker.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-10-11 17:06 +0000 |
| Message-ID | <52582ffd$0$29984$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #56657 |
On Fri, 11 Oct 2013 09:16:36 +0100, Stephen Tucker wrote:
> I am using IDLE, Python 2.7.2 on Windows 7, 64-bit.
>
> I have four questions:
>
> 1. Why is it that
> print unicode_object
> displays non-ASCII characters in the unicode object correctly, whereas
> print (unicode_object, another_unicode_object)
> displays non-ASCII characters in the unicode objects as escape sequences
> (as repr() does)?
Because that is the design of Python. Printing compound objects like
tuples, lists and dicts always uses the repr of the components.
Otherwise, you couldn't tell the difference between (say) (23, 42) and
("23", "42").
If you want something different, you have to do it yourself.
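A quick way to check this, as a minimal sketch that should run unchanged on Python 2.7 and Python 3: str() of a tuple is built from the repr() of each element, which is what keeps the two tuples distinguishable. The variable names are only illustrative.

```python
# str() on a container formats each element with repr(), not str().
numbers = (23, 42)
strings = ("23", "42")

as_numbers = str(numbers)
as_strings = str(strings)

print(as_numbers)   # (23, 42)
print(as_strings)   # ('23', '42')

# If the elements were formatted with str() instead, the two tuples
# would produce exactly the same text and could not be told apart.
ambiguous_numbers = ", ".join(str(item) for item in numbers)
ambiguous_strings = ", ".join(str(item) for item in strings)
print(ambiguous_numbers == ambiguous_strings)   # True
```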
However, having said that, it is true that the repr() of Unicode strings
in Python 2 is rather lame. Python 3 is much better:
[steve@ando ~]$ python2.7 -c "print repr(u'∫ßδЛ')"
u'\xe2\x88\xab\xc3\x9f\xce\xb4\xd0\x9b'
[steve@ando ~]$ python3.3 -c "print(repr('∫ßδЛ'))"
'∫ßδЛ'
So if you have the opportunity to upgrade to Python 3.3, I recommend it.
> 2. Given that this is actually *deliberately* the case (which I, at the
> moment, am finding difficult to accept), what is the neatest (that is,
> the most Pythonic) way to get non-ASCII characters in unicode objects in
> tuples displayed correctly?
I'd go with something like this helper function:
def print_unicode(obj):
    # Join the items of common containers as text; print anything else as is.
    if isinstance(obj, (tuple, list, set, frozenset)):
        print u', '.join(unicode(item) for item in obj)
    else:
        print unicode(obj)
Adjust to taste :-)
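One possible adjustment, sketched here under assumptions, is a variant that returns the text instead of printing it, so the same result can be printed or written to a file. The name format_unicode, the sep parameter and the text_type shim are hypothetical and not part of the original helper; the shim is only there so the sketch also runs on Python 3, where the unicode type does not exist.

```python
# -*- coding: utf-8 -*-
try:
    text_type = unicode   # Python 2: the unicode type
except NameError:
    text_type = str       # Python 3: str is already Unicode text


def format_unicode(obj, sep=u", "):
    # Hypothetical helper: join the text form of each element for common
    # containers, and fall back to the plain text form for anything else.
    if isinstance(obj, (tuple, list, set, frozenset)):
        return sep.join(text_type(item) for item in obj)
    return text_type(obj)


pair = (u"∫ßδЛ", u"naïve")
line = format_unicode(pair)         # real characters, no escape sequences
single = format_unicode(u"∫ßδЛ")    # non-containers pass straight through
```

Printing line should then show the characters themselves, provided the console or IDLE shell encoding can represent them.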
> 3. A similar thing happens when I write such objects and tuples to a
> file opened by
> codecs.open ( ..., "utf-8")
> I have also found that, even though I use write to send the text to
> the file, unicode objects not in tuples get their non-ASCII characters
> sent to the file correctly, whereas, unicode objects in tuples get their
> characters sent to the file as escape sequences. Why is this the case?
Same reason. The default string converter for tuples uses the repr, which
intentionally uses escape sequences. If you want something different, you
can program it yourself.
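For the file case, one way to program it yourself (a sketch, certainly not the only approach) is to build the text before calling write(), rather than letting the tuple be converted through its repr. The example assumes io.open, which is available in Python 2.6+ and Python 3, in place of codecs.open; the temporary file and the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

pair = (u"∫ßδЛ", u"naïve")

# Join the elements as text; this never goes through repr().
line = u", ".join(pair)

# Temporary path used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(line + u"\n")

    # Read the raw bytes back to confirm that UTF-8 was written, not escapes.
    with io.open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

round_trip = raw.decode("utf-8").strip()
```

After this runs, round_trip equals line and raw contains no backslashes, so the file holds the encoded characters rather than escape sequences.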
--
Steven