Re: Different names for Unicode codepoint

From Eli Zaretskii <eliz@gnu.org>
Newsgroups comp.lang.python
Subject Re: Different names for Unicode codepoint
Date 2016-04-21 22:40 +0300
Message-ID <mailman.27.1461269623.23626.python-list@python.org>
References <87wpnqsrzz.fsf@metapensiero.it> <83h9eu699a.fsf@gnu.org>


> From: Lele Gaifax <lele@metapensiero.it>
> Date: Thu, 21 Apr 2016 21:04:32 +0200
> Cc: python-list@python.org
> 
> is there a particular reason for the slightly different names that Emacs
> (version 25.0.92) and Python (version 3.6.0a0) give to a single Unicode entity?

They don't.

> Just to mention one codepoint, ⋖ is called "LESS THAN WITH DOT" according to
> Emacs' C-x 8 RET TAB menu, while in Python:
> 
>     >>> import unicodedata
>     >>> unicodedata.name('⋖')
>     'LESS-THAN WITH DOT'
>     >>> print("\N{LESS THAN WITH DOT}")
>       File "<stdin>", line 1
>     SyntaxError: (unicode error) ...: unknown Unicode character name

Emacs shows both the "Name" and the "Old Name" properties of
characters as completion candidates, while Python evidently supports
only "Name".  If you type "C-x 8 RET LESS TAB", then you will see
among the completion candidates both "LESS THAN WITH DOT" and
"LESS-THAN WITH DOT".  The former is the "old name" of this character,
according to the Unicode Character Database (which is where Emacs
obtains the names and other properties of characters).
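
For anyone who wants to check this from the Python side, here is a
minimal sketch using only the standard unicodedata module.  The helper
try_lookup is a hypothetical name introduced for illustration; it wraps
unicodedata.lookup, which raises KeyError for names it does not know.
On an interpreter that behaves like the session quoted above, only the
current "Name" should resolve, and the old name should come back as None.

```python
import unicodedata

CHAR = "\u22d6"  # the character under discussion: ⋖


def try_lookup(name):
    """Return the character for `name`, or None if Python rejects the name.

    Hypothetical helper for this example; unicodedata.lookup raises
    KeyError when the name is unknown.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


# The current "Name" property, as reported by Python.
current_name = unicodedata.name(CHAR)

# Lookup by the current name versus the old (Unicode 1.0) name.
by_current = try_lookup("LESS-THAN WITH DOT")
by_old = try_lookup("LESS THAN WITH DOT")

print("name:             ", current_name)
print("current name found:", by_current == CHAR)
print("old name found:    ", by_old is not None)
```

In source code, the corresponding escape that Python accepts is
"\N{LESS-THAN WITH DOT}", with the hyphen.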
