

Groups > comp.lang.python > #2402 > unrolled thread

Alphabetics respect to a given locale

Started by: candide <candide@free.invalid>
First post: 2011-04-01 22:55 +0200
Last post: 2011-04-02 15:18 +0200
Articles 3 — 2 participants



Contents

  Alphabetics respect to a given locale candide <candide@free.invalid> - 2011-04-01 22:55 +0200
    Re: Alphabetics respect to a given locale Emile van Sebille <emile@fenx.com> - 2011-04-01 16:16 -0700
    Re: Alphabetics respect to a given locale candide <candide@free.invalid> - 2011-04-02 15:18 +0200

#2402 — Alphabetics respect to a given locale

From: candide <candide@free.invalid>
Date: 2011-04-01 22:55 +0200
Subject: Alphabetics respect to a given locale
Message-ID: <4d963c2b$0$1584$426a34cc@news.free.fr>

How can I retrieve the list of all characters defined as alphabetic for the
current locale?



#2416

From: Emile van Sebille <emile@fenx.com>
Date: 2011-04-01 16:16 -0700
Message-ID: <mailman.111.1301699907.2990.python-list@python.org>
In reply to: #2402

On 4/1/2011 1:55 PM candide said...
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?

I think this is supposed to work, but for whatever reason it doesn't for me
when I test it after changing my locale (I think that's a CentOS
thing)...

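import locale
import string

# Adopt the locale configured in the environment (Python 2).
locale.setlocale(locale.LC_ALL, '')
# In Python 2, string.lowercase is documented as locale-dependent.
print string.lowercase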

I don't see where else this might live in Python.

However, you can test if something is alpha:

 >>> val = u'caf' u'\xE9'
 >>> val.isalpha()
True
 >>>

... and check its Unicode category:

 >>> import unicodedata
 >>> unicodedata.category(u'a')
'Ll' # Letter, lowercase
 >>> unicodedata.category(u'A')
'Lu' # Letter, uppercase
 >>> unicodedata.category(u'1')
'Nd' # Number, decimal digit
 >>> unicodedata.category(u'\x01')
'Cc' # Other, control
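
For anyone reading this later on Python 3, where every str is already Unicode, the category check above can be wrapped in a small helper. This is only a sketch: is_letter is a made-up name, not something from the standard library, and it treats any category beginning with "L" as a letter, which has nothing to do with the current locale.

```python
import unicodedata

def is_letter(ch):
    """Return True if ch is in one of the Unicode letter categories (L*).

    Hypothetical helper for illustration; it ignores the locale entirely.
    """
    return unicodedata.category(ch).startswith("L")

# Show the category and the helper's verdict for a few sample characters.
for ch in ("a", "A", "\xe9", "1", "\x01"):
    print(repr(ch), unicodedata.category(ch), is_letter(ch))
```

Like isalpha(), this answers "is it a letter anywhere in Unicode?", not "is it a letter in my language?".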


HTH,

Emile



#2453

From: candide <candide@free.invalid>
Date: 2011-04-02 15:18 +0200
Message-ID: <4d972283$0$4785$426a74cc@news.free.fr>
In reply to: #2402

On 01/04/2011 22:55, candide wrote:
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?


Thanks for the responses. Alas, neither solution works.

Under Ubuntu:

# ----------------------
import string
import locale

print locale.getdefaultlocale()
print locale.getpreferredencoding()

locale.setlocale(locale.LC_ALL, "")

print string.letters

letter_class = u"[" + u"".join(unichr(c) for c in range(0x10000) if
unichr(c).isalpha()) + u"]"

#print letter_class
# ----------------------

prints the following:


('fr_FR', 'UTF8')
UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz


I commented out the letter_class printing because it outputs a flood of
characters that do not belong to the usual French character set.
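
One way to thin that flood, sketched here for Python 3 rather than the Python 2 used above, is to keep only the alphabetic characters whose Unicode name starts with "LATIN". The helper name latin_letters is invented for this example, and the name test is a heuristic of mine, not a locale feature: it removes Greek, Cyrillic, CJK and so on, but it still keeps Latin letters that French does not use.

```python
import unicodedata

def latin_letters(limit=0x10000):
    """Collect alphabetic characters below `limit` named LATIN ... in Unicode.

    Hypothetical helper: a script filter, not a per-language alphabet.
    """
    found = []
    for code in range(limit):
        ch = chr(code)
        # name() returns the default "" for unnamed code points (e.g. surrogates).
        if ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN"):
            found.append(ch)
    return found

letters = latin_letters()
print(len(letters), "Latin-script letters in the BMP")
```

The result is still far larger than the French alphabet, so a per-language list would have to come from somewhere else.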


More or less the same problem under Windows: for instance,
string.letters includes the "latin capital letter eth" as an alphabetic
character (which is wrong for French, since we never use this letter in
true French words).
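
A possible explanation for both results, offered as a guess rather than a confirmed diagnosis: if string.letters is built by classifying single bytes in the locale's encoding, then a UTF-8 locale leaves only ASCII letters, while a Windows code page contributes every letter it can encode, whatever the language. The Python 3 sketch below imitates that idea with a hypothetical helper, letters_in_codec, and assumes cp1252 is the code page involved on a French Windows system.

```python
def letters_in_codec(codec_name):
    """Return the alphabetic characters a single-byte codec can represent.

    Hypothetical helper: it decodes each byte value on its own and keeps
    the characters that str.isalpha() accepts.
    """
    letters = []
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(codec_name)
        except UnicodeDecodeError:
            continue  # this byte is unassigned in the codec
        if ch.isalpha():
            letters.append(ch)
    return "".join(letters)

print(letters_in_codec("ascii"))
print(letters_in_codec("cp1252"))
```

With cp1252 the output contains eth alongside the accented letters French does use, which matches the Windows observation: the set reflects the code page, not the language.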



