| Started by | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| First post | 2015-05-10 13:00 -0400 |
| Last post | 2015-05-11 14:00 +1000 |
| Articles | 2 — 2 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-05-10 13:00 -0400 |
| Subject | Re: Why does unicode-escape decode escape symbols that are already escaped? |
| Message-ID | <mailman.319.1431277269.12865.python-list@python.org> |
On 5/10/2015 11:53 AM, Somelauw . wrote:
> In Python 3, decoding "€" with unicode-escape returns 'â\x82¬' which in
> my opinion doesn't make sense.
Agreed. I think this is a bug in that it should raise an exception
instead. Decoding a string only makes sense for rot-13.
> The € already is decoded; if it were encoded it would look like this:
> '\u20ac'.
> So why is it doing this?
> $ python3 -S
> Python 3.3.3 (default, Nov 27 2013, 17:12:35)
> [GCC 4.8.2] on linux
> >>> import codecs
> >>> codecs.decode('€', 'unicode-escape')
> 'â\x82¬'
> >>> codecs.encode('€', 'unicode-escape')
> b'\\u20ac'
--
Terry Jan Reedy
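
The output in the quoted session can be reproduced by hand. The sketch below rests on an assumption the thread does not state: that CPython first converts the str argument to its UTF-8 bytes, and that unicode-escape then reads each byte outside an escape sequence as a Latin-1 character. The variable names are illustrative only.

```python
import codecs

euro = '€'  # U+20AC

# Assumed first step: the str is turned into its UTF-8 bytes.
utf8_bytes = euro.encode('utf-8')

# Assumed second step: each non-escape byte is read as a Latin-1 character.
mojibake = utf8_bytes.decode('latin-1')

# The call from the quoted session, for comparison.
via_codec = codecs.decode(euro, 'unicode-escape')

# The encoding direction yields bytes containing an ASCII escape sequence,
# and decoding those bytes restores the original character.
escaped = codecs.encode(euro, 'unicode-escape')
round_trip = escaped.decode('unicode-escape')

print(repr(mojibake), repr(via_codec), escaped, round_trip)
```

If that assumption holds, `mojibake` and `via_codec` are equal, which would account for the 'â\x82¬' result without any "double decoding" of escapes.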
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-05-11 14:00 +1000 |
| Message-ID | <55502951$0$12997$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #90299 |
On Mon, 11 May 2015 03:00 am, Terry Reedy wrote:
> Decoding a string only makes sense for rot-13
Or any other string-to-string encoding.
As has been discussed on python-ideas and python-dev many times, the idea
of a codec is much more general than just bytes -> string and string ->
bytes. It can deal with any transformation of data. The codec machinery
can, I believe, operate on any suitable type, and it can certainly operate
on bytes -> bytes and str -> str.
I have gradually come to agree that bytes and str objects should only
support decode() and encode() operations respectively, but str->str and
bytes->bytes codecs are useful too.
--
Steven
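
As a rough illustration of the distinction drawn above, the following sketch uses the module-level `codecs.encode()` and `codecs.decode()` functions with two standard-library codecs, `rot13` (str -> str) and `hex` (bytes -> bytes). It assumes Python 3.4 or later, where the `str.encode()` method rejects non-text codecs with a `LookupError`. The variable names are illustrative.

```python
import codecs

# str -> str: rot13 goes through the generic codec functions.
rot = codecs.encode('Hello', 'rot13')
unrot = codecs.decode(rot, 'rot13')

# bytes -> bytes: the hex codec works the same way.
hexed = codecs.encode(b'hi', 'hex')
unhexed = codecs.decode(hexed, 'hex')

# The str.encode() method is limited to text encodings (str -> bytes),
# so asking it for rot13 is expected to fail.
try:
    'Hello'.encode('rot13')
    method_rejected = False
except LookupError:
    method_rejected = True

print(rot, unrot, hexed, unhexed, method_rejected)
```

This keeps the methods on `str` and `bytes` restricted to text encodings while leaving the general transformations available through the `codecs` module.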