| Started by | candide <candide@free.invalid> |
|---|---|
| First post | 2011-04-01 22:55 +0200 |
| Last post | 2011-04-02 15:18 +0200 |
| Articles | 3 — 2 participants |
Alphabetics respect to a given locale candide <candide@free.invalid> - 2011-04-01 22:55 +0200
Re: Alphabetics respect to a given locale Emile van Sebille <emile@fenx.com> - 2011-04-01 16:16 -0700
Re: Alphabetics respect to a given locale candide <candide@free.invalid> - 2011-04-02 15:18 +0200
| From | candide <candide@free.invalid> |
|---|---|
| Date | 2011-04-01 22:55 +0200 |
| Subject | Alphabetics respect to a given locale |
| Message-ID | <4d963c2b$0$1584$426a34cc@news.free.fr> |
How to retrieve the list of all characters defined as alphabetic for the current locale ?
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2011-04-01 16:16 -0700 |
| Message-ID | <mailman.111.1301699907.2990.python-list@python.org> |
| In reply to | #2402 |
On 4/1/2011 1:55 PM candide said...
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?

I think this is supposed to work, but for whatever reason it doesn't for me when I test after changing my locale (but I think that's a CentOS thing)...

    import locale
    locale.setlocale(locale.LC_ALL,'')
    import string
    print string.lowercase

I don't see where else this might be for Python. However, you can test if something is alpha:

    >>> val = u'caf' u'\xE9'
    >>> val.isalpha()
    True
    >>>

... and check its Unicode category:

    >>> import unicodedata
    >>> unicodedata.category(u'a')
    'Ll' # Letter - lower case
    >>> unicodedata.category(u'A')
    'Lu' # Letter - upper case
    >>> unicodedata.category(u'1')
    'Nd' # Number - decimal digit
    >>> unicodedata.category(u'\x01')
    'Cc' # Other - control

HTH,
Emile
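The category check above can be turned into a small enumeration. This is only a sketch, written for Python 3 (where `chr` replaces `unichr`); the helper name `letters_in_range` is made up for illustration. It also shows the limitation: `unicodedata.category` and `isalpha()` follow the Unicode database and do not depend on the current locale, so the result is every letter in the scanned range, not the alphabet of one language.

```python
import unicodedata


def letters_in_range(start, stop):
    """Return the characters in [start, stop) whose Unicode general
    category is a letter category (Lu, Ll, Lt, Lm or Lo)."""
    return [
        chr(cp)
        for cp in range(start, stop)
        if unicodedata.category(chr(cp)).startswith("L")
    ]


# Scan only the Latin-1 Supplement block (U+0080..U+00FF) to keep the
# output short; the range is an arbitrary choice for this example.
latin1_letters = letters_in_range(0x80, 0x100)
print("".join(latin1_letters))
```

The output contains the accented letters used in French, but also letters such as eth and thorn, because nothing here consults the locale.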
| From | candide <candide@free.invalid> |
|---|---|
| Date | 2011-04-02 15:18 +0200 |
| Message-ID | <4d972283$0$4785$426a74cc@news.free.fr> |
| In reply to | #2402 |
On 01/04/2011 22:55, candide wrote:
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?
Thanks for the responses. Alas, neither solution works.
Under Ubuntu:
# ----------------------
import string
import locale
print locale.getdefaultlocale()
print locale.getpreferredencoding()
locale.setlocale(locale.LC_ALL, "")
print string.letters
letter_class = u"[" + u"".join(unichr(c) for c in range(0x10000) if
unichr(c).isalpha()) + u"]"
#print letter_class
# ----------------------
prints the following:
('fr_FR', 'UTF8')
UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
I commented out the letter_class print because it outputs a flood of
characters that do not belong to the usual French character set.
The problem is more or less the same under Windows. For instance,
string.letters gives the "latin capital letter eth" as an alphabetic
character (it is not one for French: we never use this letter in true
French words).
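One possible explanation for both results, offered as a sketch rather than a confirmed diagnosis: as far as I know, Python 2 rebuilds `string.letters` after `setlocale()` by asking the C library about each of the 256 single-byte values. In a UTF-8 locale no byte above 127 is a letter on its own, so only ASCII remains; under a single-byte Windows code page such as cp1252, every letter in the code page qualifies, eth included. The Python 3 code below approximates this by decoding each byte and calling `str.isalpha()` (which uses Unicode data, not the C library). The names `letters_in_codepage` and `FRENCH_LETTERS` are hypothetical, and the French letter table is written by hand for illustration; it does not come from any locale database.

```python
def letters_in_codepage(encoding):
    """Return the letters that occupy a single byte in `encoding`.

    Approximates what a byte-wise isalpha() scan would find; the real
    C library result may differ in details.
    """
    letters = []
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # undefined byte, or not a complete character alone
        if ch.isalpha():
            letters.append(ch)
    return letters


cp1252_letters = letters_in_codepage("cp1252")
utf8_letters = letters_in_codepage("utf-8")

# Hand-written assumption: the letters used in French words.
FRENCH_LETTERS = set("abcdefghijklmnopqrstuvwxyz" "àâæçéèêëîïôœùûüÿ")
FRENCH_LETTERS |= {c.upper() for c in FRENCH_LETTERS}

print("".join(utf8_letters))
print("Ð" in cp1252_letters, "Ð" in FRENCH_LETTERS)
```

The standard library does not, to my knowledge, expose a per-language alphabet, so an explicit table like this (or data from an external locale database) seems to be needed for the original question.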