| Started by | eryk sun <eryksun@gmail.com> |
|---|---|
| First post | 2015-12-05 07:21 -0600 |
| Last post | 2015-12-05 07:21 -0600 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Unicode failure eryk sun <eryksun@gmail.com> - 2015-12-05 07:21 -0600
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-05 07:21 -0600 |
| Subject | Re: Unicode failure |
| Message-ID | <mailman.223.1449321767.14615.python-list@python.org> |
On Sat, Dec 5, 2015 at 12:10 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Sat, Dec 5, 2015 at 5:06 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>> On 12/4/2015 10:22 PM, Random832 wrote:
>>>
>>> On 2015-12-04, Terry Reedy <tjreedy@udel.edu> wrote:
>>>>
>>>> Tk widgets, and hence IDLE windows, will print any character from \u0000
>>>> to \uffff without raising, even if the result is blank or �. Higher
>>>> codepoints fail, but allowing the entire BMP is better than any Windows
>>>> codepage.
>>>
>>>
>>> Well, any bar 1200, 1201, 12000, 12001, 65000, 65001, and 54936.
>>
>>
>> Test before you post.
>>
>> >>> for cp in 1200, 1201, 12000, 12001, 65000, 65001, 54936:
>>         print(chr(cp))
>>
>>
>> Ұ
>> ұ
>> ⻠
>> ⻡
>> �
>> �
>> 횘
>
> Those numbers aren't codepoints, they're code pages. Specifically,
> they're UTF-16, UTF-32, UTF-8, and I'm not sure what 54936 is.

Codepage 65000 is UTF-7.

Codepage 54936 [1] is GB18030, the official character set of China.
It's a UTF superset of GBK. For comparison, codepage 936 is a subset
of GBK (it's missing 95 characters) plus the Euro symbol.

[1]: https://msdn.microsoft.com/en-us/library/dd317756
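
To make the code page versus code point distinction concrete, here is a rough sketch in Python 3. The `CODE_PAGE_TO_CODEC` table and the `round_trips` helper are names made up for this illustration, and pairing each Windows code page number with a Python codec name is my own assumption based on the identifiers discussed above (1200/1201 as UTF-16 LE/BE, 12000/12001 as UTF-32 LE/BE). It only checks that a sample string survives an encode/decode round trip; it does not call any Windows API.

```python
# Hypothetical mapping from Windows code page numbers to Python codec names.
# The pairing is an assumption for illustration, not taken from the thread.
CODE_PAGE_TO_CODEC = {
    1200: "utf-16-le",
    1201: "utf-16-be",
    12000: "utf-32-le",
    12001: "utf-32-be",
    65000: "utf-7",
    65001: "utf-8",
    54936: "gb18030",
}


def round_trips(text, codec_name):
    """Return True if text survives encoding and decoding with codec_name."""
    try:
        return text.encode(codec_name).decode(codec_name) == text
    except UnicodeError:
        return False


# ASCII letter, Cyrillic letter, Euro sign, and one character outside the BMP.
sample = "A\u04b0\u20ac\U0001F600"

# Every code page listed above is a Unicode transformation format,
# so each one should be able to represent the whole sample.
results = {cp: round_trips(sample, name)
           for cp, name in CODE_PAGE_TO_CODEC.items()}

# A legacy single-byte code page cannot represent the Cyrillic letter.
legacy_ok = round_trips(sample, "cp1252")

# Treating 1200 as a code point instead just yields a single character.
as_code_point = chr(1200)

if __name__ == "__main__":
    for cp, ok in results.items():
        print(cp, CODE_PAGE_TO_CODEC[cp], ok)
    print("cp1252", legacy_ok)
    print(repr(as_code_point))
```

The point of the comparison is that `chr(1200)` answers a different question from "what can code page 1200 encode": the first is one Cyrillic character, the second is all of Unicode.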