


Re: unicodedata with chr() not the same between python 3.4 and 3.5

Started by: Chris Angelico <rosuav@gmail.com>
First post: 2015-12-23 02:42 +1100
Last post: 2015-12-23 02:42 +1100
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.



#100733 — Re: unicodedata with chr() not the same between python 3.4 and 3.5

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-12-23 02:42 +1100
Subject: Re: unicodedata with chr() not the same between python 3.4 and 3.5
Message-ID: <mailman.61.1450798939.2237.python-list@python.org>

On Wed, Dec 23, 2015 at 2:27 AM, Vincent Davis <vincent@vincentdavis.net> wrote:
> I was expecting the code below to be the same between python3.4 and 3.5. I
> need a mapping between the integers and unicode that is consistent between
> 3.4 and 3.5.
>
> >>> import unicodedata
> >>> u = ''.join(chr(i) for i in range(65536) if (unicodedata.category(chr(i)) in ('Lu', 'Ll')))[945:965]

Not sure why you're slicing it like this, but it makes little
difference. The significant thing here is that the newer Pythons are
shipping newer Unicode data files, and some code points have changed
category.

rosuav@sikorsky:~$ python3.4
Python 3.4.2 (default, Oct  8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.unidata_version
'6.3.0'
>>>
rosuav@sikorsky:~$ python3.5
Python 3.5.0b1+ (default:7255af1a1c50+, May 26 2015, 00:39:06)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.unidata_version
'7.0.0'
>>>
rosuav@sikorsky:~$ python3.6
Python 3.6.0a0 (default:6e114c4023f5, Dec 20 2015, 19:15:28)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.unidata_version
'8.0.0'
>>>

Have a read here of what changed in those two major versions:

http://unicode.org/versions/Unicode7.0.0/
http://unicode.org/versions/Unicode8.0.0/
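
To see exactly which code points differ on your own machines, one rough approach is to dump the Lu/Ll table under each interpreter and diff the results. This is only a sketch: the function names `cased_letters` and `dump_lines` are made up for illustration, and the range limit of 65536 simply mirrors the quoted code.

```python
import unicodedata


def cased_letters(limit=0x10000):
    """Yield (code point, category) for every Lu/Ll code point below limit."""
    for i in range(limit):
        cat = unicodedata.category(chr(i))
        if cat in ("Lu", "Ll"):
            yield i, cat


def dump_lines():
    """Return one text line per cased letter, headed by the Unicode data version."""
    lines = ["# unidata_version " + unicodedata.unidata_version]
    lines.extend("U+%04X %s" % (cp, cat) for cp, cat in cased_letters())
    return lines


if __name__ == "__main__":
    # Redirect this output to a file under each interpreter, then diff the files.
    print("\n".join(dump_lines()))
```

Saving this under a name of your choosing and running it with each Python, redirecting the output to a separate file each time, gives files that an ordinary text diff can compare. Any line present in one file but not the other is a code point that shifts the positions in the joined string.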

I'm not sure what the best way is to create the mapping you want, but
I would advise freezing it to a specific set of codepoints in your
source code, rather than depending on something external.
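
As a minimal sketch of what freezing could look like: generate the list once, paste it into the source as a literal, and build the mapping from that. The names `FROZEN_CODEPOINTS`, `encode` and `decode` are hypothetical, and the ten Greek letters are only a stand-in, not the actual characters from the slice above.

```python
# Hypothetical frozen table, generated once and committed with the source.
# Illustrative values only: Greek capital alpha..epsilon, then small alpha..epsilon.
FROZEN_CODEPOINTS = (
    0x0391, 0x0392, 0x0393, 0x0394, 0x0395,
    0x03B1, 0x03B2, 0x03B3, 0x03B4, 0x03B5,
)

# The integer <-> character mapping depends only on the literal above,
# not on the Unicode data shipped with the running interpreter.
INT_TO_CHAR = {n: chr(cp) for n, cp in enumerate(FROZEN_CODEPOINTS)}
CHAR_TO_INT = {ch: n for n, ch in INT_TO_CHAR.items()}


def encode(numbers):
    """Turn a sequence of small integers into a string using the frozen table."""
    return "".join(INT_TO_CHAR[n] for n in numbers)


def decode(text):
    """Turn a string produced by encode() back into a list of integers."""
    return [CHAR_TO_INT[ch] for ch in text]
```

Because `chr()` itself does not consult the category data, the same table gives the same mapping on 3.4, 3.5 and later, whatever `unicodedata.unidata_version` reports.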

ChrisA
